How to Make Docker Images Smaller: docker-squash

By richardtylee

docker-squash is a Go app that merges layers to make images smaller. The savings depend on how many layers it can effectively squash, so your mileage may vary.

Let's squash richardtylee/railsapp and tag it as squashed:

REPOSITORY                      TAG                  VIRTUAL SIZE
richardtylee/railsapp           squashed             1.119 GB
richardtylee/railsapp           latest               1.141 GB

So squashing alone saves about 22 MB (1.141 GB down to 1.119 GB).
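
For reference, the squash step might look like the sketch below. It assumes the docker-squash binary is on the PATH and follows the stdin-to-stdout usage described in its README, where -t names the new tag. The squash_image wrapper is a hypothetical helper, not part of the tool.

```shell
#!/bin/sh
# Hypothetical helper: squash an image and load the result under a new tag.
# Assumes docker-squash reads a `docker save` tarball on stdin, writes the
# squashed tarball to stdout, and accepts -t <repo:tag> (per its README).
squash_image() {
    src="$1"   # e.g. richardtylee/railsapp:latest
    dst="$2"   # e.g. richardtylee/railsapp:squashed
    docker save "$src" | sudo docker-squash -t "$dst" | docker load
}

# Usage (not run here):
#   squash_image richardtylee/railsapp:latest richardtylee/railsapp:squashed
#   docker images richardtylee/railsapp
```

The function is only defined here, so nothing touches the Docker daemon until it is called.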

One thing we tried was uninstalling packages we don't use. However, every command in the Dockerfile creates a new layer in the Docker image, so even uninstalling libraries will actually make the image bigger. Let's remove bzr, mercurial and subversion, which are SCM systems that we don't use. They come as part of the buildpack-deps:scm image, which is part of the ruby:4.1.6 image.

RUN yes | apt-get purge bzr mercurial subversion

Then we build the image again and tag it as remove-scm:

REPOSITORY                      TAG                  VIRTUAL SIZE
richardtylee/railsapp           remove-scm           1.143 GB
richardtylee/railsapp           latest               1.141 GB

Notice that the image size actually increased by 2 MB even though we removed content. This is the overhead of the additional layer. However, if we uninstall some libraries and then run docker-squash, we will realize the savings of uninstalling those libraries.

REPOSITORY                      TAG                  VIRTUAL SIZE
richardtylee/railsapp           remove-scm-squashed  1.114 GB
richardtylee/railsapp           remove-scm           1.143 GB
richardtylee/railsapp           squashed             1.119 GB

Notice that not only is richardtylee/railsapp:remove-scm-squashed smaller than richardtylee/railsapp:remove-scm, but because we removed libraries and then squashed the image, it is also smaller than richardtylee/railsapp:squashed.
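
To make the comparison concrete, here is a small sketch that works out the differences between the sizes listed above. The savings_mb helper is hypothetical, and it assumes decimal units (1 GB = 1000 MB), which is consistent with the 22 MB and 2 MB figures quoted earlier.

```shell
#!/bin/sh
# Hypothetical helper: difference between two sizes given in GB, in whole MB.
# Assumes decimal units (1 GB = 1000 MB).
savings_mb() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.0f\n", (before - after) * 1000 }'
}

savings_mb 1.141 1.119   # latest     -> squashed
savings_mb 1.143 1.114   # remove-scm -> remove-scm-squashed
savings_mb 1.119 1.114   # squashed   -> remove-scm-squashed
```

This prints 22, 29 and 5, so removing the three SCM packages before squashing buys roughly 5 MB beyond squashing alone.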

We still need to do a comprehensive review of the libraries we do and do not need to get an idea of the savings we can get from this technique.

Some gotchas:

  • Portability. docker-squash is compiled and can only be run on the same type of machine it was compiled on.
  • Mac. On a Mac, docker-squash depends on gnu-tar being the default tar command.
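
A quick way to check the second gotcha is to look at the banner printed by tar --version. The sketch below assumes that GNU tar identifies itself with the string "GNU tar" in that banner. The is_gnu_tar helper is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a `tar --version` banner identifies GNU tar.
is_gnu_tar() {
    case "$1" in
        *"GNU tar"*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Inspect whatever `tar` resolves to first on the PATH.
if is_gnu_tar "$(tar --version 2>/dev/null | head -n 1)"; then
    echo "default tar is GNU tar"
else
    echo "default tar is not GNU tar; docker-squash may fail"
fi
```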

Since there is overhead to set up and compile docker-squash, it might not be worthwhile to have every developer install it and make it part of the build process. Our recommendation is to run docker-squash on the base image of each language we support. For example, we should run docker-squash on our base Rails image, so that a project starts with the smallest image possible.