Docker Build `--replace`


This article covers "docker build replace", a script that I use in projects that contain Dockerfiles, and which aims to help overcome some of the main drawbacks I encounter when using Dockerfiles in a project.

The docker build command is great for helping to achieve reproducible builds for projects, where in the past developers had to rely on manually setting up the correct environment in order to get a successful build. One big drawback of docker build, however, is that it can be very costly in terms of storage when run multiple times, as each run of the command generally leaves unnamed "dangling" images around. Cleanup is straightforward, but requires continual pruning.
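
For example, the unnamed "dangling" images left behind by superseded builds can be listed and removed with the standard Docker CLI:

# List the unnamed images left behind by repeated builds.
docker images --filter 'dangling=true'

# Remove all dangling images; `--force` skips the confirmation prompt.
docker image prune --force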

The need to remove unused images is particularly felt when developing and debugging Dockerfiles. Coming up with a minimal set of instructions that lets you run your processes the way you want can require several docker build runs, even after you've narrowed down the scope with an interactive docker run session. Such a session may well require a few Docker image purges as your disk continually fills with old and redundant images. This is compounded further if your Dockerfile uses an instruction such as COPY . /src, where every change under your project root will require a new image build.
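
For example, a throwaway interactive session is a common way to trial instructions before committing them to a Dockerfile (the base image here is just an example):

# Trial commands interactively before turning them into `RUN` instructions;
# `--rm` removes the container itself on exit.
docker run --rm -it alpine:3.13 sh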

This is where docker build --replace would come in: Docker would automatically remove the old image with the same tag when a new copy is built, and skip the build entirely if the image is up-to-date. The only problem is that this flag doesn't currently exist.

docker_rbuild.sh

I wrote docker_rbuild.sh ("Docker replace build") to approximate the idea of docker build --replace by making use of the build cache:

#!/bin/bash

# `$0 <img-name> <tag>` builds a docker image that replaces the docker image
# `<img-name>:<tag>`, or creates it if it doesn't already exist.
#
# This script uses `<img-name>:cached` as a temporary tag and so may clobber
# such existing images if present.

if [ $# -lt 2 ] ; then
    echo "usage: $0 <img-name> <tag> ..." >&2
    exit 1
fi

img_name="$1"
shift
tag="$1"
shift

# Keep the old image under a temporary tag so that its layers can seed the
# build cache; this fails silently if the image doesn't exist yet.
docker tag "$img_name:$tag" "$img_name:cached" &>/dev/null
if docker build --tag="$img_name:$tag" "$@" ; then
    docker rmi "$img_name:cached" &>/dev/null
    # We return a success code in case `rmi` failed.
    true
else
    exit_code=$?
    # The build failed, so restore the old image under its original tag.
    docker tag "$img_name:cached" "$img_name:$tag" &>/dev/null
    exit $exit_code
fi

This tags the current copy of the image so that it can be reused for caching purposes, and then kicks off a new build. If the build is successful then the "cache" version is removed, theoretically meaning that only the latest copy of the image you're working on should be present on your system. If the build fails then the old tag is restored. If there are no updates then the cached layers are used to create a "new" image almost instantly to replace the old one.
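
The script takes the image name and tag, followed by any arguments to forward to docker build. For example, using the image name from later in this article:

# Replace `ezanmoto/hello:latest` using the `Dockerfile` in the current
# directory; arguments after the tag are forwarded to `docker build`.
bash docker_rbuild.sh 'ezanmoto/hello' 'latest' .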

With this, local images are automatically "pruned" as newer copies are produced, saving time and disk space.

Idempotency

One benefit of docker_rbuild.sh is that, now that docker build isn't leaving redundant images around with each build, it is more practicable to use it in scripts to rebuild our images before we run them. When a project defines local images, we can rebuild an image every time it's used, so that we're always running the latest version of the image without having to update it manually.
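
As a rough sketch of the pattern, using the example image name from this article:

# Idempotently rebuild the local image, then immediately run it, so that the
# container always reflects the latest image definition.
bash scripts/docker_rbuild.sh 'ezanmoto/hello' 'latest' .
docker run --rm 'ezanmoto/hello:latest'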

An example of where this can be convenient is when you want to use an external program or project written in a language that your project doesn't otherwise use. For example, the build process for this blog's content uses Node.js, but consider the case where I wanted to use a Markdown linter written in Ruby, such as Markdownlint. One option is to add a Ruby installation directly to the definition of the build environment, but this has a few disadvantages:

  • It adds an installation for a full new language to the build environment just to support the running of one program.
  • It isn't clear, at a glance, that Ruby is only being installed to support one tool, and to someone new to the project it can look like the project is a combined Node.js/Ruby project.
  • The above point lends itself to using more Ruby gems "just because" Ruby is available, meaning that removing the Ruby installation later becomes more difficult.

One way to work around this is to encapsulate the usage with a Dockerfile, like markdownlint.Dockerfile, and a script that runs the tool:

markdownlint.Dockerfile:

FROM ruby:3.0.0-alpine3.13

RUN gem install mdl

ENTRYPOINT ["mdl"]

markdownlint.sh:

#!/bin/bash

if [ $# -ne 1 ] ; then
    echo "usage: $0 <md-file>" >&2
    exit 1
fi
md_file="$1"

proj='ezanmoto/hello'
sub_img_name="markdownlint"
sub_img="$proj.$sub_img_name"

# Mount the project directory so that `mdl` can access the file under test.
docker run \
    --rm \
    --volume="$(pwd):/app" \
    --workdir='/app' \
    "$sub_img" \
    "$md_file"

This addresses some of the above issues:

  • Ruby isn't installed directly into the build environment, meaning that the build environment is kept focused and lean.
  • In markdownlint.Dockerfile, the Ruby installation is kept with the program that it's used to run, making the association clear.
  • The entire Ruby installation can be removed easily by deleting markdownlint.Dockerfile. This can be useful if we decide to replace the tool with a different linter, such as one written for Node.js. Another reason to remove markdownlint.Dockerfile is if the external project starts maintaining its own public Docker image that can be used instead of managing a local version.

Despite the benefits, there are two subtle issues with this setup. The first is that ezanmoto/hello.markdownlint needs to be built somehow before markdownlint.sh can be run, which may be a manual process, and a missing image is a surprising error to encounter when all you tried to do was run a script.

The second issue is that if one developer builds the local image, and a second developer updates the image definition, the first developer will need to rebuild their copy of the local image before running markdownlint.sh again or risk unexpected results.

We can solve both of these issues by running docker_rbuild.sh before ezanmoto/hello.markdownlint is used:

markdownlint.sh:

bash scripts/docker_rbuild.sh \
    "$sub_img" \
    'latest' \
    --file="$sub_img_name.Dockerfile" \
    .

docker run \
    --rm \
    --volume="$(pwd):/app" \
    --workdir='/app' \
    "$sub_img" \
    "$md_file"

This causes the image to always be rebuilt before it's used, meaning that we're always working with the latest version of the image. In practice this build step will most often be skipped due to caching, though attention should be paid to the instructions used in the image build, as commands like COPY can limit the effectiveness of the cache.
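
For instance, in a hypothetical variant of markdownlint.Dockerfile that installs dependencies from a Gemfile (the file and paths here are purely illustrative), ordering the instructions so that the frequently-changing source is copied last keeps the expensive steps cached:

FROM ruby:3.0.0-alpine3.13

# Layers are cached top-down, so this expensive step is only re-run when
# `Gemfile` changes.
COPY Gemfile /src/Gemfile
RUN cd /src && bundle install

# The frequently-changing source is copied as late as possible; a change to
# it only invalidates the layers from this point onwards.
COPY . /src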

Use With docker-compose

I find docker-compose particularly useful for modelling deployments. However, as with developing standalone Docker images, getting the docker-compose environment right can require continual fine-tuning of image definitions, especially when aiming for minimal environments. This can again result in a lot of wasted space, especially when used with docker-compose up --build.

With that in mind, I now remove the build property from services defined in docker-compose.yml. This requires the images to be built before docker-compose is called, which I normally handle with a script that builds all of the images used in the docker-compose.yml file before bringing the environment up:

docker-compose.yml:

version: '2.4'

services:
  hello.seankelleher.local:
    image: nginx:1.19.7-alpine
    ports:
      - 8080:8080
    volumes:
      - ./configs/hello.conf:/etc/nginx/conf.d/hello.conf:ro

  hello:
    image: ezanmoto/hello

scripts/docker_compose_up_build.sh:

#!/bin/bash

set -o errexit

proj='ezanmoto/hello'
run_img="$proj"

bash scripts/docker_rbuild.sh \
    "$run_img" \
    'latest' \
    .

docker-compose up "$@"
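
Because the script forwards its arguments to docker-compose up, it can be used as a drop-in replacement for it:

# Rebuild `ezanmoto/hello:latest` and bring the environment up in the
# background; `-d` is forwarded to `docker-compose up`.
bash scripts/docker_compose_up_build.sh -d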

Conclusion

Having an idempotent rebuild for Docker images means that it's more feasible to rebuild before each run, much in the same way that some build tools (e.g. cargo) update any changed dependencies before attempting to rebuild the codebase. While Docker doesn't have native support for this at present, a script that takes advantage of the cache can be used to simulate such behaviour.