The Challenges of Uniquely Identifying Your Images
One of the challenges of container security is ensuring that the image you’re getting is exactly what you expect it to be. Both from a security and consistency perspective, it’s important to ensure there are no surprises in what you’re downloading. Docker image tags, whilst convenient, can’t always be relied on to point to a consistent specific image, so a common piece of advice is to use SHA-256 hashes to identify your images. However, that’s not always as easy as it might seem.
Comparing Docker Hub and local hashes
Looking at Docker Hub for a sample image, this seems like it should be a straightforward task. As an example, navigating to the “centos7” tag of the popular jenkins/jenkins image shows a SHA-256 digest for that tag.
We can use that hash (8d28034275002fb438766e90b95e9c0f99a7568b8c5645949c1ba25045ed63ce) to pull this specific image, safe in the knowledge that it’ll be the same image every time we pull it.
After pulling the image using:
docker pull jenkins/jenkins@sha256:8d28034275002fb438766e90b95e9c0f99a7568b8c5645949c1ba25045ed63ce
We start to see where the problems might lie. Say we want to check that the image in our local Docker install is the same one we looked at on Docker Hub. The logical step would be to confirm that the local hash matches the one we pulled. However, what we get back is something like this:
docker images jenkins/jenkins
REPOSITORY        TAG       IMAGE ID       CREATED      SIZE
jenkins/jenkins   <none>    bac2f5fed373   6 days ago   693MB
Looking at the Image ID, we can see that it’s entirely different from the digest we used to pull the image. Also, if we try to start a container using the bare hash from Docker Hub as if it were an image ID, we’ll find that doesn’t work either:
docker run -it 8d28034275002fb438766e90b95e9c0f99a7568b8c5645949c1ba25045ed63ce /bin/sh
docker: Error response from daemon: No such image: sha256:8d28034275002fb438766e90b95e9c0f99a7568b8c5645949c1ba25045ed63ce
Behind the scenes
So why do we see this kind of discrepancy? Like many things in container land, probably the best place to go looking for answers is GitHub issues. This Docker Hub issue has some good history on the discrepancies between the hash displayed on Docker Hub and the one shown locally, and this post about local image IDs not matching the registry manifest digest goes into more of the thinking from the Docker side of things.
The essence of the issue is that the hash used in the Docker registry is a digest of the image manifest, whereas the local image ID is a digest of the image’s configuration JSON. The two are computed over different data, so they won’t match.
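To make the distinction concrete, here is a toy sketch. The two JSON documents are made-up stand-ins that are far simpler than real image configs and manifests, and it assumes a Linux system with sha256sum available. It only illustrates that hashing two different documents gives two different digests, even when one document refers to the other.

```shell
#!/bin/sh
# Toy illustration only: these JSON documents are hypothetical stand-ins.
workdir=$(mktemp -d)

# Stand-in for an image config. The local image ID is the SHA-256 of this file.
printf '{"architecture":"amd64","os":"linux"}' > "$workdir/config.json"
image_id=$(sha256sum "$workdir/config.json" | cut -d' ' -f1)

# Stand-in for a registry manifest. It refers to the config by its digest,
# but it is a separate document, so its own digest is different.
printf '{"schemaVersion":2,"config":{"digest":"sha256:%s"}}' "$image_id" \
    > "$workdir/manifest.json"
manifest_digest=$(sha256sum "$workdir/manifest.json" | cut -d' ' -f1)

echo "image ID:        sha256:$image_id"
echo "manifest digest: sha256:$manifest_digest"
```

The two printed values differ, which mirrors the mismatch between the hash on Docker Hub and the local Image ID.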
You can actually see where the local ImageID comes from by looking at the contents of our sample image.
Using this command, we can create a tar file of the specific image ID we want to look at:
docker save bac2f5fed373 -o jenkins.tar
Then, looking at the list of files in the archive, you’ll see a file called bac2f5fed3730d7c6857f47f773e52ac3bada754e2a769f1c088c92e04015a4d.json, whose name is the full SHA-256 form of the image ID. In a somewhat self-referential style, this filename is actually the hash of the file’s contents, which we can check with the sha256sum command.
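A minimal sketch of that check is shown below. The function name verify_content_address is made up for illustration, and it assumes a Linux system with sha256sum. It compares a file’s name (with any .json suffix removed) against the SHA-256 of its contents.

```shell
#!/bin/sh
# Hypothetical helper: check that a content-addressed file's name matches
# the SHA-256 of its contents. Works with or without a .json suffix.
verify_content_address() {
    file=$1
    expected=$(basename "$file" .json)
    actual=$(sha256sum "$file" | cut -d' ' -f1)
    if [ "$expected" = "$actual" ]; then
        echo "match: $actual"
    else
        echo "MISMATCH: name says $expected, contents hash to $actual"
        return 1
    fi
}
```

After extracting the archive with tar -xf jenkins.tar, calling verify_content_address on the .json file named above should report a match. Some newer Docker releases lay the archive out differently (for example, under a blobs/sha256/ directory), so the path to the config file may vary.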
Addressing the issue
So now that we know about the problem, what can we do about it? At a practical level, there are ways we can check a local Docker image to see what the registry hash is, as Docker records this information for each image it pulls. Running the following command:
docker inspect [IMAGE_NAME] --format='{{.RepoDigests}}'
will return the repository hashes that correspond to that image ID.
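As a rough sketch of how this could be scripted, the hypothetical helper below (digest_matches is not part of Docker) takes one repository digest entry in the repo@sha256:hash form and compares it with the hash copied from the registry page. The docker inspect call is left as a comment because it needs a running Docker daemon.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a RepoDigests entry of the form
# repo@sha256:<hex> carries the expected hex digest.
digest_matches() {
    repo_digest=$1   # e.g. jenkins/jenkins@sha256:8d28...
    expected=$2      # hex digest copied from the registry page
    actual=${repo_digest##*@sha256:}
    [ "$actual" = "$expected" ]
}

# Example usage against a local image (requires Docker):
#   entry=$(docker inspect --format='{{index .RepoDigests 0}}' bac2f5fed373)
#   digest_matches "$entry" 8d28034275002fb438766e90b95e9c0f99a7568b8c5645949c1ba25045ed63ce \
#       && echo "local image matches the registry digest"
```

An image can carry more than one repository digest, so a fuller check would loop over every entry rather than only the first.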
Whilst this approach is a useful tactic, checking image IDs alone is not an ideal way to manage image integrity and provenance. It doesn’t address the risk that the image was tampered with after it was created but before you first pulled it to a local Docker instance. For that we need another approach.
At a high level, cryptographically signing container images is a better way to prove image provenance and integrity. Docker has supported image signing via Notary for some time; however, uptake has been fairly limited because of challenges around ease of implementation and use in common deployment scenarios. As a result, the Notary version 2 project has been kicked off to provide a new version designed to take account of lessons learned from the first project.
There’s also another new image signing tool which is rapidly developing into a useful option to address this issue. The cosign project (which is part of the overall sigstore initiative) can be used to sign container images as well as other artifacts. While it’s still early days for this project, it already provides an easy-to-use signing tool which works well with common container image registries.
Image hashes are a useful feature of container runtimes and can be helpful to uniquely identify an image, but there are some nuances to be aware of when using them. If you need strong controls over image provenance, container image signing is the way to go, or if you’re an Aqua customer, image fingerprinting and dev-to-prod integrity assurance are built into the product.