buildx: New docker-container builders fail first bake using local cache
Behavior
Using a fresh buildx docker-container builder, a bake that uses a (populated) local cache and a build context (i.e. COPY, RUN --mount, etc.) will fail with either ERROR: failed to solve: Canceled: grpc: the client connection is closing or ERROR: failed to solve: Unavailable: error reading from server: EOF.
Desired behavior
The first build with a fresh builder must succeed against a local cache for the local cache to be usable in CI applications. With a builder that has already baked the images, the failure becomes intermittent; that case should also succeed consistently.
Environment
docker info:
Client:
Context: desktop-linux
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc., v0.9.1)
compose: Docker Compose (Docker Inc., v2.10.2)
extension: Manages Docker extensions (Docker Inc., v0.2.9)
sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc., 0.6.0)
scan: Docker Scan (Docker Inc., v0.19.0)
Server:
Containers: 13
Running: 4
Paused: 0
Stopped: 9
Images: 59
Server Version: 20.10.17
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
runc version: v1.1.4-0-g5fd4c4d
init version: de40ad0
Security Options:
seccomp
Profile: default
cgroupns
Kernel Version: 5.10.124-linuxkit
Operating System: Docker Desktop
OSType: linux
Architecture: aarch64
CPUs: 5
Total Memory: 7.667GiB
Name: docker-desktop
ID: P2BC:5HXV:5ELQ:YK6I:LRNJ:PVRL:FJ76:EZ7P:H2QB:QVXD:ON2C:AUVO
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
No Proxy: hubproxy.docker.internal
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
hubproxy.docker.internal:5000
127.0.0.0/8
Live Restore Enabled: false
Steps to reproduce
Prepare the files (unzip this to skip):
$ mkdir base
$ mkdir layer
$ touch base/Dockerfile
$ touch base/file
$ touch layer/Dockerfile
$ touch images.json
base/Dockerfile:
FROM ubuntu as base
RUN sleep 2
COPY file file
layer/Dockerfile:
FROM base_target as layer
RUN sleep 5
images.json:
{
  "target": {
    "common": {
      "platforms": [
        "linux/amd64"
      ]
    },
    "base": {
      "context": "base",
      "cache-from": [
        "type=local,src=../cache/base"
      ],
      "cache-to": [
        "type=local,mode=max,dest=../cache/base"
      ],
      "inherits": ["common"],
      "tags": [
        "base"
      ]
    },
    "layer": {
      "context": "layer",
      "cache-from": [
        "type=local,src=../cache/layer"
      ],
      "cache-to": [
        "type=local,mode=max,dest=../cache/layer"
      ],
      "contexts": {
        "base_target": "target:base"
      },
      "inherits": ["common"],
      "tags": [
        "layer"
      ]
    }
  }
}
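For readers more familiar with bake’s HCL syntax, an equivalent docker-bake.hcl would look roughly like this (a sketch mirroring the JSON above; it is not part of the original report, and the commands below continue to use images.json):
# Sketch of an equivalent HCL bake file; field values copied from images.json above.
target "common" {
  platforms = ["linux/amd64"]
}
target "base" {
  inherits   = ["common"]
  context    = "base"
  cache-from = ["type=local,src=../cache/base"]
  cache-to   = ["type=local,mode=max,dest=../cache/base"]
  tags       = ["base"]
}
target "layer" {
  inherits   = ["common"]
  context    = "layer"
  contexts   = { base_target = "target:base" }
  cache-from = ["type=local,src=../cache/layer"]
  cache-to   = ["type=local,mode=max,dest=../cache/layer"]
  tags       = ["layer"]
}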
Create the builder:
docker buildx create --name container_driver_builder --driver docker-container
Populate the cache:
docker buildx bake --builder container_driver_builder -f images.json layer
For each subsequent test, remove the builder, recreate it, and rebuild the bake targets:
docker buildx rm container_driver_builder \
&& docker buildx create --name container_driver_builder --driver docker-container \
&& docker buildx bake --builder container_driver_builder -f images.json layer
Each such test fails with ERROR: failed to solve: Canceled: grpc: the client connection is closing or ERROR: failed to solve: Unavailable: error reading from server: EOF.
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Reactions: 3
- Comments: 15 (4 by maintainers)
This issue completely breaks my ability to build and run any docker devcontainer in VS Code on Ubuntu 22.04. I’ve been able to block it in the
devcontainer.json using the args: {} section; it seems to listen when I add BUILDKIT_INLINE_CACHE=0. This was a new install, the first time in a while. This will probably seriously disrupt a lot of people who use devcontainers in Docker.
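For reference, a minimal sketch of what that workaround likely looks like in devcontainer.json; the name and Dockerfile path below are placeholders, and only the args entry reflects the commenter’s report:
{
  // Hypothetical devcontainer.json fragment; only the BUILDKIT_INLINE_CACHE
  // build arg comes from the comment above.
  "name": "example",
  "build": {
    "dockerfile": "Dockerfile",
    "args": {
      "BUILDKIT_INLINE_CACHE": "0"
    }
  }
}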
Confirmed same error here.
I can confirm this new error. It happens on GitLab CI/CD with the docker:dind service when building a docker container. It started happening today, with no changes to any docker/CI or related files.
Not sure if it’s related to this issue, but since the upgrade to Docker version 23.0.0, which embeds buildx 0.10.2 as its default builder, some people are encountering issues when building a devcontainer (a VS Code feature).
The error when failing to build a devcontainer, ERROR: failed to receive status: rpc error: code = Unavailable desc = error reading from server: EOF, seems to be the same as the one shown in the description. Could someone confirm whether it’s the same issue or a completely different one?
@crazy-max I understand. I’m using the official docker images from Docker Hub in my CI/CD pipeline, and from what I can tell 23.0.1 has not been released yet; the latest one pushed seems to be 23.0.0.
I got a stack trace to better understand this issue.
So the case is that the first build loads the cache, but it remains only a lazy ref (https://github.com/moby/buildkit/blob/v0.10.4/cache/remote.go#L336) created with a provider from the session. Then a second build comes in after the first session has already been dropped and matches against the previous lazy ref. Unlazy then gets called and fails because the session is already gone.
@sipsma @ktock
I guess the simplest fix is to disable lazy behavior for local cache imports from the session, because it seems fragile.
More proper fixes would be to make sure a lazy ref is not matched if it belongs to a different session, or to add the current session to the group (not sure if this is quite safe actually).
On bake we might need a fix as well to keep the original session alive until all builds have completed. I’m thinking of the case where a “local source” would need to be pulled in by a subsequent build (not sure how practical). But I think this cache issue could appear just by doing two individual builds with the same cache source from two different terminals.
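As an illustration of that last point, a sketch of two standalone builds sharing one local cache source (the builder name and paths are hypothetical, reusing the layout from the reproduction steps above):
# Two plain builds against the same local cache directory, run back to back
# (or from two terminals). Per the analysis above, the second build can match
# a lazy ref created with the first build's already-closed session.
docker buildx build --builder container_driver_builder \
  --cache-from type=local,src=./cache/base \
  --cache-to type=local,mode=max,dest=./cache/base \
  --tag base ./base
docker buildx build --builder container_driver_builder \
  --cache-from type=local,src=./cache/base \
  --cache-to type=local,mode=max,dest=./cache/base \
  --tag base ./base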
@jaudiger @bwenzel2 @m-melis @antonioconselheiro Same as https://github.com/moby/buildkit/issues/3576, will be fixed with https://github.com/moby/moby/pull/44920.
Closing this issue since it has been fixed in BuildKit 0.11.2 (https://github.com/moby/buildkit/pull/3493)
We’re also seeing the same error suddenly appear despite not having changed anything CI/CD- or Docker-related on our end for months: GitLab CI/CD running dind with Docker 20.10.13 and BUILDKIT_INLINE_CACHE=1. Following @jaudiger’s suggestion above, removing the BUILDKIT_INLINE_CACHE=1 build arg does seem to fix the issue, but I’m curious why this would suddenly break without us having upgraded or changed anything on our end. Wondering if something changed on the Docker Hub side, since that’s where all our repos are.
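For context, a sketch of the kind of CI build step being described (the image variable is GitLab’s predefined CI_REGISTRY_IMAGE; the exact commands are placeholders, not taken from the report):
# Build that passes the inline-cache build arg, which the comments above
# report as triggering the error:
docker build \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  --cache-from "$CI_REGISTRY_IMAGE:latest" \
  --tag "$CI_REGISTRY_IMAGE:latest" .
# Workaround described above: the same build without the build arg.
docker build \
  --cache-from "$CI_REGISTRY_IMAGE:latest" \
  --tag "$CI_REGISTRY_IMAGE:latest" .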