moby: Buildkit: "docker build --cache-from" doesn't use cache from local tagged images (multi-stage builds)

I’m using docker build with --cache-from on a multi-stage build to allow caching in a gitlab-ci (docker in docker) environment. Notably, this means that the images named in the --cache-from option may not exist in the registry when docker build is run, as it might be the first build on a new branch or repo fork.

This worked in the classic docker builder (cache could load local tagged images not pushed to a registry), but doesn’t work with buildkit.

Steps to reproduce the issue:

  1. Create a simple Dockerfile, e.g.
FROM alpine:3.9 AS stage1
RUN touch /file1
FROM stage1 AS stage2
RUN touch /file2
  2. Ensure no image exists matching the --cache-from references:
docker rmi local.registry/docker-test/stage1:latest local.registry/docker-test/stage2:latest
  3. Run the docker build commands with --cache-from and --tag:
docker build \
    --cache-from local.registry/docker-test/stage1:latest \
    --target stage1 \
    --tag local.registry/docker-test/stage1:latest \
    -f Dockerfile .
docker build \
    --cache-from local.registry/docker-test/stage1:latest \
    --cache-from local.registry/docker-test/stage2:latest \
    --target stage2 \
    --tag local.registry/docker-test/stage2:latest \
    -f Dockerfile .

Describe the results you received:

The first stage is not used as a cache when building the second stage:

$ DOCKER_BUILDKIT=1 docker build --cache-from local.registry/docker-test/stage1:latest --target stage1 --tag local.registry/docker-test/stage1:latest -f Dockerfile .
[+] Building 1.2s (7/7) FINISHED                                                                                                                                      
 => [internal] load build definition from Dockerfile                                                                                                             0.2s
 => => transferring dockerfile: 185B                                                                                                                             0.0s
 => [internal] load .dockerignore                                                                                                                                0.2s
 => => transferring context: 2B                                                                                                                                  0.0s
 => [internal] load metadata for docker.io/library/alpine:3.9                                                                                                    0.0s
 => ERROR importing cache manifest from local.registry/docker-test/stage1:latest                                                                                 0.0s
 => [stage1 1/2] FROM docker.io/library/alpine:3.9                                                                                                               0.0s
 => => resolve docker.io/library/alpine:3.9                                                                                                                      0.0s
 => [stage1 2/2] RUN touch /file1                                                                                                                                0.8s
 => exporting to image                                                                                                                                           0.1s
 => => exporting layers                                                                                                                                          0.1s
 => => writing image sha256:674f0a872a085bfbc17e5e8b094d103d9ce23d6d04c8ce3ba4eb980bb3a9846e                                                                     0.0s
 => => naming to local.registry/docker-test/stage1:latest                                                                                                        0.0s
------
 > importing cache manifest from local.registry/docker-test/stage1:latest:
------

$ DOCKER_BUILDKIT=1 docker build --cache-from local.registry/docker-test/stage1:latest --cache-from local.registry/docker-test/stage2:latest --target stage2 --tag local.registry/docker-test/stage2:latest -f Dockerfile .
[+] Building 1.9s (9/9) FINISHED                                                                                                                                      
 => [internal] load build definition from Dockerfile                                                                                                             0.2s
 => => transferring dockerfile: 96B                                                                                                                              0.0s
 => [internal] load .dockerignore                                                                                                                                0.2s
 => => transferring context: 2B                                                                                                                                  0.0s
 => [internal] load metadata for docker.io/library/alpine:3.9                                                                                                    0.0s
 => ERROR importing cache manifest from local.registry/docker-test/stage1:latest                                                                                 0.0s
 => ERROR importing cache manifest from local.registry/docker-test/stage2:latest                                                                                 0.0s
 => [stage1 1/2] FROM docker.io/library/alpine:3.9                                                                                                               0.0s
 => => resolve docker.io/library/alpine:3.9                                                                                                                      0.0s
 => [stage1 2/2] RUN touch /file1                                                                                                                                0.7s
 => [stage2 1/1] RUN touch /file2                                                                                                                                0.8s
 => exporting to image                                                                                                                                           0.1s
 => => exporting layers                                                                                                                                          0.1s
 => => writing image sha256:f6526297588e4c4365b520dbd5d382e22a20bc8cb56fef1f6e992e137ba6a15b                                                                     0.0s
 => => naming to local.registry/docker-test/stage2:latest                                                                                                        0.0s
------
 > importing cache manifest from local.registry/docker-test/stage1:latest:
------
------
 > importing cache manifest from local.registry/docker-test/stage2:latest:
------

Note how [stage1 2/2] is not cached when building the stage2 image.

If you re-run both builds, still nothing is cached.

Describe the results you expected:

The first stage is used as a cache when building the second stage:

$ docker build --cache-from local.registry/docker-test/stage1:latest --target stage1 --tag local.registry/docker-test/stage1:latest -f Dockerfile .
Sending build context to Docker daemon  2.048kB
Step 1/2 : FROM alpine:3.9 as stage1
 ---> 5cb3aa00f899
Step 2/2 : RUN touch /file1
 ---> Running in 60e6093cc777
Removing intermediate container 60e6093cc777
 ---> cb8d39fdacdc
Successfully built cb8d39fdacdc
Successfully tagged local.registry/docker-test/stage1:latest

$ docker build --cache-from local.registry/docker-test/stage1:latest --cache-from local.registry/docker-test/stage2:latest --target stage2 --tag local.registry/docker-test/stage2:latest -f Dockerfile .
Sending build context to Docker daemon  2.048kB
Step 1/4 : FROM alpine:3.9 as stage1
 ---> 5cb3aa00f899
Step 2/4 : RUN touch /file1
 ---> Using cache
 ---> cb8d39fdacdc
Step 3/4 : FROM stage1 as stage2
 ---> cb8d39fdacdc
Step 4/4 : RUN touch /file2
 ---> Running in 8179df339c2e
Removing intermediate container 8179df339c2e
 ---> c1ab7c10a04b
Successfully built c1ab7c10a04b
Successfully tagged local.registry/docker-test/stage2:latest

Note how “Step 2/4” in the second command is cached.

Also, if you re-run both builds, everything is cached.

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:

Client:
 Version:           18.09.4
 API version:       1.39
 Go version:        go1.10.8
 Git commit:        d14af54
 Built:             Wed Mar 27 18:36:04 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.4
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.8
  Git commit:       d14af54
  Built:            Wed Mar 27 18:04:46 2019
  OS/Arch:          linux/amd64
  Experimental:     false

Output of docker info:

Containers: 52
 Running: 0
 Paused: 0
 Stopped: 52
Images: 132
Server Version: 18.09.4
Storage Driver: overlay2
 Backing Filesystem: btrfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 5.0.4-200.fc29.x86_64
Operating System: Fedora 29 (Workstation Edition)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.56GiB
Name: rocky
ID: EZ5T:7HUA:5BNM:CT23:O7MU:HHXB:AMME:37UC:Y3P3:TR3U:X6GN:GBTX
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

Additional environment details (AWS, VirtualBox, physical, etc.):

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 33
  • Comments: 32 (8 by maintainers)

Most upvoted comments

The problem: in CI using docker-in-docker, the docker engine is started anew for each build, so it is “empty”. To speed up the builds with the current docker build we can pull an older image and make the docker builder use it as a layer cache.

Is this possible to achieve with the cache-efficient, Dockerfile-agnostic buildkit? If yes, how? If no, please say so, so we can give up.

What is --build-arg BUILDKIT_INLINE_CACHE=1 and where is it documented? What does “make sure to inline the cache metadata into the image you are importing” mean?

In case this helps anyone: to use --cache-from you need to enable BuildKit, and the image you’re caching from must be built with BUILDKIT_INLINE_CACHE as a build argument. Here’s what I’m doing:

$ docker pull my-remote-app || true
$ DOCKER_BUILDKIT=1 docker build . --tag my-app --build-arg BUILDKIT_INLINE_CACHE=1 --cache-from my-remote-app
$ docker tag my-app my-remote-app
$ docker push my-remote-app

After the first build, subsequent builds will use the cache from that image. It’s working for me; the time it takes to build subsequent images is reduced significantly.

A little bit of my context: I’m running self-hosted GitHub Runners on my K8s cluster, where each runner runs on a different node, so builds don’t usually hit the cache of locally tagged images. I tried to organize my Dockerfile (to leverage Docker layer caching), but that only works if all Docker builds run on a single machine. Instead, I pull the image from the remote registry before the build and pass it with --cache-from, so no matter which node my Docker build job runs on, it always uses the remote image as a cache source.

The integration of buildkit in docker (with DOCKER_BUILDKIT) is so confusing. It seems --cache-from cannot work for technical reasons that I don’t entirely grasp. As an end-user of docker/moby I don’t think I’m supposed to even know about buildctl or how a container is designed internally.

If there are limitations, it would be good to have them properly documented. Also, invalid options should not be allowed in the CLI. In general, if something fails there should be a very clear error; otherwise this is only misleading, because it is indistinguishable from a regular cache miss.

The relevant commands from our CI:

  1. export DOCKER_BUILDKIT=1
  2. pull the latest built image from the repository: docker pull $BUILD_IMAGE_LATEST
  3. build the new image as: docker build . --cache-from $BUILD_IMAGE_LATEST --build-arg BUILDKIT_INLINE_CACHE=1 -t $BUILD_IMAGE_LATEST
  4. replace the latest built image in the repository: docker push $BUILD_IMAGE_LATEST

As far as I understand now, the BUILDKIT_INLINE_CACHE build arg is necessary for the image to be usable as a layer cache by BuildKit.

The image you point to with --cache-from needs to be built with BUILDKIT_INLINE_CACHE. --cache-from imports cache, BUILDKIT_INLINE_CACHE exports to the built image. Setting BUILDKIT_INLINE_CACHE while --cache-from points to some image built without buildkit/inline-cache has no effect.
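For illustration, a minimal sketch of that pairing (registry.example.com/app is a hypothetical image reference): the producer build must export inline cache metadata into the image it pushes, and only then can a later build on a clean daemon import it with --cache-from.

# producer: build and push an image that carries inline cache metadata
DOCKER_BUILDKIT=1 docker build . \
    --tag registry.example.com/app:latest \
    --build-arg BUILDKIT_INLINE_CACHE=1
docker push registry.example.com/app:latest

# consumer (e.g. a fresh CI runner): import that image as a cache source
docker pull registry.example.com/app:latest || true
DOCKER_BUILDKIT=1 docker build . \
    --tag registry.example.com/app:latest \
    --build-arg BUILDKIT_INLINE_CACHE=1 \
    --cache-from registry.example.com/app:latest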

Use 19.03 and make sure to inline the cache metadata into the image you are importing from with DOCKER_BUILDKIT=1 docker build --build-arg BUILDKIT_INLINE_CACHE=1 .. Everything should work properly then.

Hi,

I’m following the exact same procedure explained by @artm and the cache still doesn’t work on Docker 19.03.5. All layers are rebuilt again and again.

I start with a clean state with no docker images. Then my script pulls the “latest” image, then builds using BUILDKIT_INLINE_CACHE and --cache-from. However it still builds all the layers again. Is there something else I need to do?

EDIT: the image I pull from is stored in a private remote repository.

Is there any news about this issue? This looks like quite an essential feature, and it had been annoying me for quite some time before I found this ticket. I tried all different ways to name the images to avoid this but I can’t make it work 😢

It seems that buildkit always tries to resolve to docker.io, whether you are using a local registry (such as described here with local.registry) or simply local images (served by the local daemon, I guess)! I am connected to a private registry, but I’d just like to build local images in a multi-stage CI setup, where I definitely don’t want to push those images all the time, only under certain conditions. Actually I see two different problems:

  • standard cache import with --cache-from (as described above). Maybe this can be solved with buildctl --import-cache (see the sketch after this list).
  • build with a Dockerfile having FROM local-image! In a multi-stage case this is even more problematic than just the cache. I tried this last point both with DOCKER_BUILDKIT=1 docker build and buildctl. It’s the same issue and it looks like the same reason as for the cache 😒
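
As a rough sketch of that buildctl idea (untested here; it assumes a running buildkitd daemon that buildctl can reach, and registry.example.com/app is a hypothetical reference), buildctl lets you export and import the cache explicitly instead of relying on inline metadata:

buildctl build \
    --frontend dockerfile.v0 \
    --local context=. \
    --local dockerfile=. \
    --output type=image,name=registry.example.com/app:latest,push=true \
    --export-cache type=registry,ref=registry.example.com/app:buildcache \
    --import-cache type=registry,ref=registry.example.com/app:buildcache

Whether this also helps with FROM local-image I can’t say; it only addresses the cache import/export part.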

Furthermore, I want to build an image locally with no image name, simply keeping the SHA ID using the --iidfile option. I don’t really want to name the images because it would create some conflicts with other processes. I think the proper naming is name:tag@sha-id, but I don’t know the exact spec here. In the worst case I could generate a temporary fake name, but it is sad if it doesn’t behave like the standard docker build. If you give a FROM <SHA-id> without any name it works. But maybe I won’t have the need with buildctl… if I can finally use it.

All of this worked perfectly well with the standard docker build, without the need for a local registry, just using the local images directly. First of all I would like to use the new buildkit for the --secret feature. But now I feel buildkit has a lot of very powerful features to do more advanced stuff, especially to play with the cache! But at least the impossibility of using FROM local-image prevents me from doing so 😞

After googling around I noticed that the file permissions are indeed different between CI and my local computer, so that layer misses the cache. Mystery solved! Thanks for the help.

I wanted to share cache between Windows and Linux machines and had a similar problem to @carlosgalvezp. The cache for COPY was always invalidated, because Windows has no Linux file permissions and I also had different timestamps (created at, modified at).

I solved this by adding a new build stage that cleans up all the file permissions, so that they are always the same in the build stage I wanted to be cached. See “Dockerfile: Remote build cache optimization for COPY (on Windows)” for a detailed description of what I did.
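
A minimal sketch of that idea (stage names and paths are made up; the chmod mode depends on what the build actually needs): copy the context into a throwaway stage, normalize the permissions there, and COPY --from that stage in the part you want cached.

FROM alpine:3.9 AS normalized
COPY . /src
# reset permissions so the checksum of these files no longer depends on
# the host the build runs on (Windows vs Linux, different umasks, etc.)
RUN chmod -R u=rwX,go=rX /src

FROM alpine:3.9 AS build
# copying from the normalized stage instead of the build context keeps the
# cache key of this layer stable across machines
COPY --from=normalized /src /src
# stand-in for the expensive build step you want cached
RUN touch /file-built-from-normalized-sources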

Thank you @kekru and @maitrungduc1410, I was able to get it working using both of your recommendations 🎉

There is a gotcha on file permissions that might not be immediately obvious - SELinux permissions. If you build your image on a machine without SELinux and try to use the cache on a machine with SELinux (or vice-versa), your file hashes won’t match so the COPY cache is invalidated.

@Chili-Man thanks for your investigation!

We have managed to make docker build with buildkit use previous images as layer cache. If I don’t forget, I’ll provide more details when I’m back at work. I also want to link that BUILDKIT_INLINE_CACHE documentation from our internal docs, so thanks once more.

What if you don’t want to use inline caching, but want to export and import the cache locally? It’s unclear what the correct usage is when using DOCKER_BUILDKIT=1 docker build.
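
As far as I can tell, exporting the cache to a local directory is not supported by plain DOCKER_BUILDKIT=1 docker build; it seems to need buildx. A rough sketch, assuming buildx is installed and a docker-container builder is used (the default docker driver may not support the local cache exporter); my-app and ./buildcache are made-up names:

docker buildx create --use
mkdir -p ./buildcache
docker buildx build . \
    --tag my-app \
    --load \
    --cache-to type=local,dest=./buildcache \
    --cache-from type=local,src=./buildcache
# on the very first run ./buildcache is empty, so nothing is imported yet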

If you have issues with this, please make a runnable reproducer and post it as a new issue in moby/buildkit.

I think the issue is closable now.

BUILDKIT_INLINE_CACHE is documented in https://github.com/moby/buildkit#inline-push-image-and-cache-together, but maybe it should be also documented in Docker docs (not scope of this repo).

@artm I found the documentation for --build-arg BUILDKIT_INLINE_CACHE=1 here: https://github.com/docker/buildx#--cache-tonametypetypekeyvalue ; however, I tried using that and it didn’t work as expected; I ended up having to use docker buildx --cache-to=type=inline,mode=all for it to work. And if you’re using the official docker DIND images, they don’t come with buildx installed, according to @thaJeztah.

The linked issue comment https://github.com/moby/buildkit/issues/723#issuecomment-440490796 says that the issue is that the built image is not suitable for cache re-use in buildkit, but I’m seeing that even if the image was built without buildkit, buildkit is still not capable of taking advantage of the cache. So this isn’t just an export bug, it’s also missing on import.