moby: error creating zfs mount: no such file or directory
Description
docker build reports "no such file or directory" when /var/lib/docker is on a ZFS filesystem.
Steps to reproduce the issue:
Execute docker build when /var/lib/docker is mounted on a ZFS filesystem.
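A minimal reproduction sketch (the pool name kpool and the dataset layout are assumptions taken from the environment reported further down in this issue; any Dockerfile will do):

# Assumes a zpool named kpool already exists and Docker is stopped.
systemctl stop docker
zfs create -o mountpoint=/var/lib/docker kpool/docker
systemctl start docker

# Docker picks the zfs storage driver automatically when its root is on ZFS.
# Repeat the build until the intermittent error appears.
docker build -t zfs-repro .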
Describe the results you received:
Step 7/7 : COPY ${APP_BINARY_FILENAME} ${TOMCAT_HOME}/webapps/${APP_BINARY_FILENAME}
error creating zfs mount of kpool/docker/192a72289b6040ea928e8873c5f2695029bf856ada2c232df312a17775ddae9a to /var/lib/docker/zfs/graph/192a72289b6040ea928e8873c5f2695029bf856ada2c232df312a17775ddae9a: no such file or directory
ERROR: Job failed: error executing remote command: command terminated with non-zero exit code: Error executing in Docker Container: 1
Describe the results you expected:
No errors.
Additional information you deem important (e.g. issue happens only occasionally):
Intermittent error. Usually docker build succeeds after three to four retries. Probably a race condition.
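Since retries usually succeed, a simple retry loop can serve as a stopgap; this is only a sketch, and the retry count of 5 and the image tag are arbitrary assumptions:

# Retry the build a few times because the failure is intermittent.
for attempt in 1 2 3 4 5; do
    echo "build attempt ${attempt}"
    docker build -t myimage . && break
done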
# df -h /var/lib/docker
Filesystem      Size  Used  Avail  Use%  Mounted on
kpool/docker     45G   14M    45G    1%  /var/lib/docker
# rpm -q zfs
zfs-0.7.8-1.el7_4.x86_64
Output of docker version:
Client:
Version: 18.03.1-ce
API version: 1.37
Go version: go1.9.5
Git commit: 9ee9f40
Built: Thu Apr 26 07:20:16 2018
OS/Arch: linux/amd64
Experimental: false
Orchestrator: swarm
Server:
Engine:
Version: 18.03.1-ce
API version: 1.37 (minimum version 1.12)
Go version: go1.9.5
Git commit: 9ee9f40
Built: Thu Apr 26 07:23:58 2018
OS/Arch: linux/amd64
Experimental: false
Output of docker info:
Containers: 14
Running: 14
Paused: 0
Stopped: 0
Images: 49
Server Version: 18.03.1-ce
Storage Driver: zfs
Zpool: kpool
Zpool Health: ONLINE
Parent Dataset: kpool/docker
Space Used By Parent: 3222265856
Space Available: 48431497216
Parent Quota: no
Compression: lz4
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 773c489c9c1b21a6d78b5c538cd395416ec50f88
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 3.10.0-862.2.3.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.5 (Maipo)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 15.51GiB
Name: hseeckm01
ID: VE37:YIDL:4NAH:TOQR:SUED:MTR5:MZBK:ZJI6:BVQV:CGMR:TVRX:XA2X
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
HTTP Proxy: http://proxy.company.com:8080
HTTPS Proxy: http://proxy.company.com:8080
No Proxy: localhost,127.0.0.1,.internal.company.com
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: bridge-nf-call-ip6tables is disabled
Additional environment details (AWS, VirtualBox, physical, etc.):
About this issue
- Original URL
- State: open
- Created 6 years ago
- Reactions: 6
- Comments: 17
PS. The only workaround I found for this issue is not to use the zfs storage driver at all (explicitly set "storage-driver": "aufs" in /etc/docker/daemon.json). Of course, if you want to do this on an existing server, you will have to back up your volumes, destroy your data root, and re-create it.

@stephan2012 I can confirm that this is still happening with containerd.io 1.4.3 and docker-ce 20.10.3 (from Docker's official repo for Debian buster).
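A sketch of that workaround, assuming aufs is available on the host (on newer kernels a different driver such as overlay2 would be the usual choice) and that /etc/docker/daemon.json contains no other settings you need to preserve:

# Stop Docker before switching storage drivers.
systemctl stop docker

# Point the daemon at a non-zfs storage driver.
# WARNING: this overwrites an existing daemon.json, and images/containers
# created under the old driver become invisible; back up volumes and
# recreate the data root as described above.
cat > /etc/docker/daemon.json <<'EOF'
{
  "storage-driver": "aufs"
}
EOF

systemctl start docker
docker info | grep -i 'storage driver'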
This seems to be a race condition that happens when Docker's root is on a ZFS volume and the build is multi-stage, containing a COPY --from=... step. If the source stage takes more time to build than the destination stage (and is not already in cache), this triggers the bug. The reason it seems to work every n-th time is that the race condition is not triggered if the previous build step is already in cache. The way to reproduce this (see the sketch after this list) is:
- data-root is on a ZFS volume (even if daemon.json does not explicitly set "storage-driver": "zfs")
- a Dockerfile with COPY --from=... commands (any such build should trigger the bug, as long as the source stage takes more time to build than the destination stage)
- build with --no-cache
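A minimal sketch of such a multi-stage build; the sleep is only a stand-in to make the source stage slower than the destination stage, and the base image and tag are assumptions:

# Multi-stage Dockerfile whose source stage ("builder") finishes after the
# destination stage starts, so the COPY --from step can hit the race on zfs.
cat > Dockerfile <<'EOF'
FROM alpine:3.12 AS builder
# Simulate a slow build step in the source stage.
RUN sleep 30 && echo "artifact" > /artifact

FROM alpine:3.12
COPY --from=builder /artifact /artifact
EOF

# --no-cache keeps the source stage out of the build cache, which per the
# comment above is what allows the race to trigger.
docker build --no-cache -t zfs-race-repro .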
See also https://github.com/moby/buildkit/issues/1758; it is the same bug, IMHO.