moby: Docker build cache does not ignore uid/gid on ADD or COPY
The uid/gid of a file (on the build host’s filesystem) appear to affect the hash of that file when it is added through ADD or COPY. This seems superfluous, since the ownership of the added file is reset to uid=gid=0 in the resulting image.
This behaviour can bust the Docker build cache in continuous-integration environments where uid/gid differ across build nodes.
As a minimal working example, consider the following, where an image is built twice, the second time with different ownership of the COPY’d file.
Please note:
- The hashes of the added file are different, as per docker history
- The ownership is reset to uid=gid=0 inside the image, as stated in the documentation for ADD (“All new files and directories are created with a UID and GID of 0.”)
$ ls -lah
total 8,0K
drwxrwxr-x 1 jprobst jprobst 36 Apr 25 18:08 .
drwxrwxrwx 1 jprobst jprobst 6,4K Apr 25 17:15 ..
-rw-rw-r-- 1 jprobst jprobst 44 Apr 25 18:08 Dockerfile
-rwxrwxrwx 1 jprobst jprobst 5 Apr 25 17:15 test.txt
$ cat Dockerfile
FROM ubuntu:16.04
COPY test.txt /test.txt
$ docker build -t demo .
Sending build context to Docker daemon 3.072kB
Step 1/2 : FROM ubuntu:16.04
---> 104bec311bcd
Step 2/2 : COPY test.txt /test.txt
---> ea497fe942a0
Removing intermediate container 16753a7ab82b
Successfully built ea497fe942a0
$ docker history demo
IMAGE CREATED CREATED BY SIZE COMMENT
ea497fe942a0 17 seconds ago /bin/sh -c #(nop) COPY file:db60da90f31b3b... 5B
104bec311bcd 4 months ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0B
<missing> 4 months ago /bin/sh -c mkdir -p /run/systemd && echo '... 7B
<missing> 4 months ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\... 1.9kB
<missing> 4 months ago /bin/sh -c rm -rf /var/lib/apt/lists/* 0B
<missing> 4 months ago /bin/sh -c set -xe && echo '#!/bin/sh' >... 745B
<missing> 4 months ago /bin/sh -c #(nop) ADD file:7529d28035b43a2... 129MB
$ sudo chown 1001:1001 test.txt
$ ls -lah
total 8,0K
drwxrwxr-x 1 jprobst jprobst 36 Apr 25 18:08 .
drwxrwxrwx 1 jprobst jprobst 6,4K Apr 25 17:15 ..
-rw-rw-r-- 1 jprobst jprobst 44 Apr 25 18:08 Dockerfile
-rwxrwxrwx 1 1001 1001 5 Apr 25 17:15 test.txt
$ docker build -t demo .
Sending build context to Docker daemon 3.072kB
Step 1/2 : FROM ubuntu:16.04
---> 104bec311bcd
Step 2/2 : COPY test.txt /test.txt
---> 43cb126c18bb
Removing intermediate container 4e729efe09bb
Successfully built 43cb126c18bb
$ docker history demo
IMAGE CREATED CREATED BY SIZE COMMENT
43cb126c18bb 14 seconds ago /bin/sh -c #(nop) COPY file:f9b82392428347... 5B
104bec311bcd 4 months ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0B
<missing> 4 months ago /bin/sh -c mkdir -p /run/systemd && echo '... 7B
<missing> 4 months ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\... 1.9kB
<missing> 4 months ago /bin/sh -c rm -rf /var/lib/apt/lists/* 0B
<missing> 4 months ago /bin/sh -c set -xe && echo '#!/bin/sh' >... 745B
<missing> 4 months ago /bin/sh -c #(nop) ADD file:7529d28035b43a2... 129MB
$ docker run -t demo ls -lah /test.txt
-rwxrwxrwx 1 root root 5 Apr 25 15:15 /test.txt
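The same effect can be reproduced with plain tar, since the build context is sent to the daemon as a tar stream and uid/gid are part of every tar header. This is only an illustration (assuming GNU tar and sha256sum), not output from the builder itself:
$ sudo chown "$USER" test.txt
$ tar --mtime='2020-01-01' -cf - test.txt | sha256sum
<digest A>
$ sudo chown 1001:1001 test.txt
$ tar --mtime='2020-01-01' -cf - test.txt | sha256sum
<digest B, different, although the file content is unchanged>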
I spent an entire day on this same problem too, so here are some notes for the next person who finds themselves looking at this bug with the same scenario in mind.
Our centralised builders (Kubernetes buildkit pods) would consistently hash/cache ADDs one way when our CI system (GitlabCI) initiated builds, but another way (consistently, but different to GitlabCI builds) when I initiated builds from local machines. This was while using the exact same git commit, the same buildkit builder instance, and I even ran the docker client/buildx/bake in a common docker container, in an attempt to isolate/exclude any host-machine differences.
I still experienced a cache break each time I alternated between CI and local builds.
My solution was to run two permission-normalising commands prior to the docker buildx bake command, on each run.
I learned that the only part of file permissions which git actually stores is the “execute bit”. When git creates files, it uses the operating system’s umask to fill in all of the other octets of the permission mode. On our CI system, the git checkout created files with the execute bit on and 777 permissions, whereas my local machine was producing 775 files. Both are “executable” as far as git is concerned, but when Docker hashes each file during ADD operations, the difference in mode was triggering cache breaks.
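For illustration, the normalisation can look something like this (a sketch of the idea, assuming GNU find and chmod and a checkout rooted in the current directory, not necessarily the verbatim commands):
$ find . -type f -perm /111 -exec chmod 755 {} +   # keep git's execute bit, pin the rest of the mode
$ find . -type f ! -perm /111 -exec chmod 644 {} +  # non-executable files get one fixed mode everywhere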
By “normalising” the permissions with the commands above, I now get consistent caching across both CI and local environments. It’s true that this isn’t a “Docker bug”, but WOW was this tricky to track down! It would be really helpful if some kind of debug mode would print why the hashing implementation decided that there was a miss.
I hope this helps someone else 😃
And yes… it would also be really great to have some diagnostics/logging feature you could enable on demand when you build an image and can’t figure out why your cache keeps getting invalidated.