Why StableBuild?
Docker containers are now the standard for packaging web applications, offering a clean, standardized way to include all your dependencies. However, they create a false sense of security. At first glance Dockerfiles look deterministic (the same Dockerfile creates the same Docker container), but in reality you're depending on a myriad of mirrors, package registries, and repositories, any of which can change at any moment. So while it seems that you've packaged all your dependencies neatly in a container, the contents of that dependency list can shift underneath you and break your application.
A typical Dockerfile
Let's take a look at a typical Dockerfile (this installs a Python application, but the same applies to other programming languages):
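A minimal sketch of such a Dockerfile, following the steps in the walkthrough. The exact apt package names, the DEBIAN_FRONTEND setting, and the full get-pip.py URL are assumptions chosen to match those steps:

```dockerfile
FROM ubuntu:20.04

# Assumption: suppress interactive prompts (e.g. tzdata) during apt installs
ARG DEBIAN_FRONTEND=noninteractive

# Install some packages from the Ubuntu package registry
# (assumed set: curl for the download below, software-properties-common for add-apt-repository)
RUN apt-get update && apt-get install -y curl software-properties-common

# Add the deadsnakes PPA and install Python 3.9 from it
RUN add-apt-repository -y ppa:deadsnakes/ppa && \
    apt-get update && \
    apt-get install -y python3.9 python3.9-distutils

# Download get-pip.py and run it to install pip
RUN curl -fsSL https://bootstrap.pypa.io/get-pip.py -o get-pip.py && \
    python3.9 get-pip.py

# Install a Python package, pinned to an exact release
RUN python3.9 -m pip install onnx==1.14.0
```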
Let's run through this Dockerfile line by line:
Here you pull the ubuntu base image with tag 20.04 from a Docker registry (most likely Docker Hub). Images and tags in Docker Hub are not immutable: they can be overwritten (here, every time a new Ubuntu 20.04 version is released) or even deleted. Which base image you get back depends on when you build and what's in your build cache. This causes three problems: issues become really hard to debug (you may be running a different base image than what's in production), your base OS version might change all of a sudden when you rebuild the container (e.g. if your build cache is empty), or the base image might be deleted and your container won't build at all.
Next you install some packages from the Ubuntu package registry. This installs whatever versions of these packages are the latest at build time, and the Ubuntu package registry is constantly updating. Pinning to specific versions does not reliably help either, as the registry actively deletes older versions, so a pinned version can simply disappear. This means that core dependencies of your container might suddenly be updated or become unavailable. As above, the packages you end up with depend on when you built the container (and on what was in your build cache), and your build can break at any time.
Next you need a package that is not in the Ubuntu package registry (Python 3.9), so you add the deadsnakes PPA (an alternative package repository) and install the package from there. This has the exact same issues as the previous step, but you now also have an extra dependency on the PPA, which might go offline or stop working at any time.
Then you need an extra dependency that is not in a package repository. You download the get-pip.py file and then run it to install the pip package manager. The contents of this URL can (and will!) change. The URL might resolve to a different script, or it might even be removed. And the script itself most likely makes additional HTTP calls to download extra resources. Which version of pip ends up in the container thus again depends on when you build it, and your application can break any time an update is pushed.
Lastly, you install a Python package. It looks like you're pinning this package to the exact 1.14.0 release. However, that's only partly true, because onnx itself depends on unpinned dependencies (e.g. it specifies protobuf>=3.20.2). When one of the sub-dependencies of onnx is updated (e.g. protobuf 5 is released with a breaking change) and the container is rebuilt, your application breaks. Because pip just takes the latest versions of these sub-dependencies that satisfy the constraints, you again end up with potentially wildly different Python packages installed depending on when you built the container (and on your build cache).
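To see which versions pip actually resolved, a build step can print them. The sketch below is a standalone illustration that assumes the official python:3.9-slim image as a stand-in base to keep it short; that base image is an assumption and not part of the setup described here:

```dockerfile
# Assumption: python:3.9-slim is used only as a convenient base with pip preinstalled
FROM python:3.9-slim

# Only onnx is pinned; its sub-dependencies are resolved at build time
RUN pip install onnx==1.14.0

# Print every installed package with its resolved version
RUN pip freeze
```

Building this with docker build --no-cache --progress=plain . on different dates can show different versions of protobuf and other sub-dependencies in the pip freeze output, even though the Dockerfile itself is unchanged.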
One Dockerfile, many things that can break
As you've seen, this very simple Dockerfile depends on five services (Docker Hub, the Ubuntu package registry, the deadsnakes PPA, pypa.io and the PyPI registry) that are all unstable. Any dependency can be updated, modified or deleted at any time, breaking your application.
In addition, the contents of the container depend on when you built the container, and on what was in your build cache. This is problematic because the same Dockerfile can yield wildly different containers, making it very hard to debug issues that arise because dependencies are updated ("works on my machine, but not on my coworker's").
Together, this is a massive maintenance burden, especially because it's all reactive. Someone makes a seemingly small change to your codebase, the build server builds the container from scratch, and the build breaks with a completely unrelated error. Now you need to immediately go and update your application to work with the new dependency (the build is broken!).
So, StableBuild!
To deal with this madness we've created StableBuild. StableBuild is a set of mirrors and package registries aimed at making container builds reliable and deterministic. In essence this means that you can freeze any dependency using StableBuild, so the same Dockerfile will yield the same Docker container. We currently do this for Docker base images, the complete Ubuntu, Debian and Alpine package registries, the most popular PPAs, and the PyPI Python package registry, and we are actively working on more (like freezing arbitrary files on the internet).
Excited? You can learn how to make the Dockerfile above deterministic in Pinning your first container with StableBuild. 🎉