moby: This multistage Dockerfile almost always results in `layer does not exist` error

Description

The following Dockerfile almost always results in a `layer does not exist` error:

FROM golang:1.8-alpine AS go-build-base
ENV PATH=/usr/local/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
ENV GOPATH=/go
RUN apk add --no-cache g++ linux-headers
RUN apk add --no-cache git make

FROM alpine:latest AS runc-git
RUN apk add --no-cache git
RUN git clone https://github.com/opencontainers/runc.git /go/src/github.com/opencontainers/runc
WORKDIR /go/src/github.com/opencontainers/runc
RUN git checkout -q v1.0.0-rc3

FROM go-build-base AS runc
COPY --from=runc-git /go /go
WORKDIR /go/src/github.com/opencontainers/runc
RUN go build -o /usr/bin/runc ./

FROM alpine:latest AS buildkit-git
RUN apk add --no-cache git
RUN git clone https://github.com/moby/buildkit.git /go/src/github.com/moby/buildkit
WORKDIR /go/src/github.com/moby/buildkit
# Jul 5, 2017
RUN git checkout -q 8e2267320e0fb2b6ae3d22b3d999f6377b7b8758

FROM go-build-base AS buildkit-src
COPY --from=buildkit-git /go /go
WORKDIR /go/src/github.com/moby/buildkit

FROM buildkit-src AS buildd-standalone
RUN go build -o /bin/buildd-standalone -tags standalone ./cmd/buildd

FROM buildkit-src AS buildctl
RUN go build -o /bin/buildctl ./cmd/buildctl

FROM alpine:latest
COPY --from=buildctl /bin/buildctl /bin
COPY --from=runc /usr/bin/runc /bin
COPY --from=buildd-standalone /bin/buildd-standalone /bin
RUN ls -l /bin

Steps to reproduce the issue:

  1. (Optional, to ensure a clean environment) Stop the daemon, run rm -rf /var/lib/docker, and start the daemon
  2. Run docker build --no-cache .

Describe the results you received:

...
Step 20/31 : FROM go-build-base AS buildkit-src
 ---> f234db884287
Step 21/31 : COPY --from=buildkit-git /go /go
failed to export image: failed to create image: failed to get layer sha256:102ac1be15dcaef8e5bdade6e75c71b6ec76a67eb9555987aaa5460dd2b7d3b2: layer does not exist

Describe the results you expected:

No error

Additional information you deem important (e.g. issue happens only occasionally):

  • When it fails, it always seems to fail at Step 21/31 : COPY --from=buildkit-git /go /go
  • Happens on both overlay2 and aufs, so it should not be related to the graph driver.
  • docker build . sometimes succeeds, but I have never seen docker build --no-cache succeed.
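
Since the failure is intermittent, it can help to put a number on "almost always". The following is a minimal sketch, not part of the original report: `count_layer_failures` is a hypothetical helper name, and it only looks for the error text quoted above in the combined output of whatever command it is given.

```shell
# Hypothetical helper (not from the original report): run a command
# several times and count the runs whose output contains the
# "layer does not exist" error text.
# Usage: count_layer_failures RUNS COMMAND [ARGS...]
count_layer_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Capture stdout and stderr together. The exit status of the
        # command is ignored on purpose: only the error text matters.
        out=$("$@" 2>&1) || true
        case $out in
            *'layer does not exist'*) failures=$((failures + 1)) ;;
        esac
        i=$((i + 1))
    done
    # Print the result as failures/runs.
    echo "$failures/$runs"
}
```

For the report above, the call would be `count_layer_failures 10 docker build --no-cache .`, run from the directory that holds the Dockerfile. The output is the number of failing runs over the total number of runs.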

Output of docker version:

Client:
 Version:      unknown-version
 API version:  1.31
 Go version:   go1.8
 Git commit:   e672589e
 Built:        Thu Jul  6 06:23:30 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.06.0-dev
 API version:  1.31 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   9d95740db
 Built:        Thu Jul  6 05:14:08 2017
 OS/Arch:      linux/amd64
 Experimental: true

Output of docker info:

Containers: 1
 Running: 0
 Paused: 0
 Stopped: 1
Images: 35
Server Version: 17.06.0-dev
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 3addd840653146c90a254301d6c3a663c7fd6429
runc version: 2d41c047c83e09a6d61d464906feb2a2f3c52aa4
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.10.0-26-generic
Operating System: Ubuntu 17.04
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 29.45GiB
Name: ws01
ID: SN3T:CNK6:JKQD:54CY:XKF2:BRX3:CIRU:DQBT:6DVZ:VWQ2:Q5ET:F23L
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 26
 Goroutines: 43
 System Time: 2017-07-06T06:52:16.093390156Z
 EventsListeners: 0
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support

Additional environment details (AWS, VirtualBox, physical, etc.):

cc @tonistiigi

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 48 (24 by maintainers)

Most upvoted comments

Problem still exists in Docker version 18.03.1-ce, build 9ee9f40 running on Ubuntu 18.04:

Step 22/27 : ADD nextcloud.conf /nextcloud.conf
failed to export image: failed to create image: failed to get layer sha256:560e97685aeff18a6dbcdc2294a89c24ed286f0eeafd60274e3f71a922cec013: layer does not exist

It happened with this version of my nextcloud. The build on Docker Hub succeeded.

It’s not enough to change just the last line. The issue reproduces if you revert the entire commit that fixed it, so that all the lines have a destination of /cncf/. With that change, the bug is reproducible on master.

It’s worth mentioning that the Dockerfile is wrong: copying directories to the same destination replaces the directory instead of copying into it, so the previous COPY lines are useless. There is still an issue here, but it’s unlikely to break anyone with a correct Dockerfile (from what I can tell right now).

I was able to reproduce the issue with this:

#!/usr/bin/env bash
set -eu

mkdir 1
echo "absdf" > 1/a
cp -r 1 2; cp -r 1 3

cat << EOF > Dockerfile
FROM    alpine:3.6
COPY    1/ /target/
COPY    2/ /target/
COPY    3/ /target/
EOF

docker build --no-cache .

Empty directories, or directories containing only empty files, do not reproduce the issue; the directories must contain files with some content.
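
To check that a source directory meets that condition before trying the reproduction, something like the following could be used. This is a sketch only: `has_nonempty_file` is a hypothetical name, and it relies on the POSIX `find` primary `-size +0c` (more than zero bytes).

```shell
# Hypothetical helper: succeed if DIR contains at least one regular
# file with more than zero bytes of content, fail otherwise.
# Usage: has_nonempty_file DIR
has_nonempty_file() {
    # find lists regular files larger than 0 bytes; head keeps only
    # the first match, and the test checks that anything was printed.
    [ -n "$(find "$1" -type f -size +0c 2>/dev/null | head -n 1)" ]
}
```

With the script above, `has_nonempty_file 1 && has_nonempty_file 2 && has_nonempty_file 3` should succeed, because each directory holds a copy of the file `a` with content.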

I noticed in the logs that the copies say this:

Applied tar x to foo1, size: 6
Applied tar y to foo2, size: 0
Applied tar z to foo3, size: 6

The second copy creates the invalid state, and the final one fails. I’m looking into the exact cause now and will open a PR with the fix.
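
For anyone who wants to look for the same pattern in their own daemon logs, a small filter along these lines may help. It is a sketch that assumes the log lines have the shape quoted above ("Applied tar ... to ..., size: N"); `zero_size_applies` is a hypothetical name, and real log lines may carry extra prefixes such as timestamps.

```shell
# Hypothetical filter: read log lines on stdin and print only the
# "Applied tar" lines that report a size of exactly 0.
zero_size_applies() {
    # "size: 0" must be followed by a non-digit or the end of the line,
    # so that a line reporting a zero size with trailing characters
    # (for example a closing quote) still matches.
    grep -E 'Applied tar .* to .*, size: 0([^0-9]|$)'
}
```

On a systemd host with the daemon in debug mode, one way to feed it would be `journalctl -u docker.service | zero_size_applies`. The unit name is an assumption and may differ between installations.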

@jcberthon could you open a new issue with those details, but also include your docker version and docker info there?

At Quay.io we got several reports of this error during user builds after updating our build cluster to 17.09.0-ce. The issue was resolved by ensuring that the build machines use overlay2 rather than overlay as the storage driver. The original reporter is already on overlay2, though, so this is probably not a full fix.
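
To see which storage driver a build machine is using, the "Storage Driver:" line of docker info (as in the output earlier in this thread) can be pulled out with a small filter. This is a sketch: `storage_driver` is a hypothetical helper that only parses text on stdin.

```shell
# Hypothetical filter: read `docker info` text on stdin and print the
# value of the first "Storage Driver:" line.
storage_driver() {
    # sed strips the label and surrounding whitespace from matching
    # lines and prints only those; head keeps the first one.
    sed -n 's/^[[:space:]]*Storage Driver:[[:space:]]*//p' | head -n 1
}
```

Typical use would be `docker info | storage_driver`. `docker info --format '{{.Driver}}'` should give the same value directly.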

I’m running into the same problem with Docker version 17.10.0-ce, build f4ffd25

“failed to export image: failed to create image: failed to get layer sha256:xxx: layer does not exist”

Always happens with a COPY directive in the Dockerfile.

yup, I will look into it today