moby: overlay2 + linux v4.13: error creating overlay mount to /var/lib/docker/overlay2/ID/merged: device or resource busy

Description

The overlay driver in the kernel, starting with 4.13, will return an error for overlay mounts that re-use the upper dir. This error was introduced in this patch.

Using docker 17.06.1-ce on the 4.13-rc6 kernel I can intermittently reproduce this error message. I’ve only ever observed it on the first container run, and only infrequently. I assume that there are two mounts that race and sometimes clash.

Steps to reproduce the issue:

Install the 4.13 kernel
Boot the machine with an empty /var/lib/docker directory. Start dockerd, and as soon as possible, run a few dozen containers in parallel.
Occasionally (perhaps 1 out of 30 runs), get the error “error creating overlay mount to /var/lib/docker/overlay2/ID/merged: device or resource busy”
Note that the dmesg output includes “overlayfs: upperdir is in-use by another mount”

Note that this only impacts running multiple containers at once. Serializing all container runs avoids it.
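
The parallel-run step can be sketched in bash as below. This is only an illustration, not the reporter's actual test harness: the busybox image and the RUN_CMD override are assumptions added here so the loop can be tried without docker.

```bash
# Sketch: start COUNT container runs in parallel and print how many of them
# failed with the EBUSY message from this issue.
# Assumptions: the "busybox" image and the RUN_CMD override are illustrative
# and do not come from the original report.
run_parallel() {
    local count=$1 logdir i f failures=0
    logdir=$(mktemp -d)
    for ((i = 1; i <= count; i++)); do
        # Left unquoted on purpose so the default splits into command words
        ${RUN_CMD:-docker run --rm busybox true} >"$logdir/$i.log" 2>&1 &
    done
    wait    # block until every background run has exited
    for f in "$logdir"/*.log; do
        [ -e "$f" ] || continue
        if grep -q "device or resource busy" "$f"; then
            failures=$((failures + 1))
        fi
    done
    rm -rf "$logdir"
    echo "$failures"
}
```

Calling run_parallel 30 right after dockerd starts on a fresh /var/lib/docker approximates the steps above. A non-zero result means at least one run hit the error.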

Output of docker version:

$ docker version
Client:
 Version:      17.06.1-ce
 API version:  1.30
 Go version:   go1.8.2
 Git commit:   874a737
 Built:        Sat Aug 26 01:07:04 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.06.1-ce
 API version:  1.30 (minimum version 1.12)
 Go version:   go1.8.2
 Git commit:   874a737
 Built:        Fri Aug 25 18:06:27 2017
 OS/Arch:      linux/amd64
 Experimental: false

Output of docker info:

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 17.06.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 6e23458c129b551d5c9871e5174f6b1b7f6d1170
runc version: 810190ceaa507aa2727d7ae6f4790c76ec150bd2
init version: v0.13.2 (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
 seccomp
  Profile: default
 selinux
Kernel Version: 4.13.0-rc6-coreos
Operating System: Container Linux by CoreOS 1506.0.0+2017-08-25-1813 (Ladybug)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 996.3MiB
Name: localhost
ID: ZBQY:PD55:UTX2:K2N4:CPQJ:HWIY:SOIQ:IC6P:NNXT:YKUZ:XFNP:ESWJ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):

I’ve seen it on AWS and QEMU; it presumably happens everywhere.

I’ve also reported this issue over here on the CoreOS bug tracker.

About this issue

State: closed
Created 7 years ago
Reactions: 8
Comments: 37 (17 by maintainers)

Commits related to this issue

libcontainer: default mount propagation correctly The code in prepareRoot (https://github.com/opencontainers/runc/blob/e385f67a0e45fa1d8ef8154e2aea5128ea1d331b/libcontainer/rootfs_linux.go#L599-L605)... — committed to euank/runc by euank 7 years ago
app-emulation/docker: apply ebusy overlayfs patch See https://github.com/coreos/bugs/issues/2127 and https://github.com/moby/moby/issues/34672 for discussion. Patch files have been split into more f... — committed to euank/coreos-overlay by euank 7 years ago
ovl: fix regression caused by exclusive upper/work dir protection Enforcing exclusive ownership on upper/work dirs caused a docker regression: https://github.com/moby/moby/issues/34672. Euan spotted... — committed to amir73il/linux by amir73il 7 years ago
ovl: fix regression caused by exclusive upper/work dir protection Enforcing exclusive ownership on upper/work dirs caused a docker regression: https://github.com/moby/moby/issues/34672. Euan spotted... — committed to euank/linux by amir73il 7 years ago
ovl: fix regression caused by exclusive upper/work dir protection commit 85fdee1eef1a9e48ad5716916677e0c5fbc781e3 upstream. Enforcing exclusive ownership on upper/work dirs caused a docker regressio... — committed to Whissi/linux-stable by amir73il 7 years ago
ovl: fix regression caused by exclusive upper/work dir protection BugLink: http://bugs.launchpad.net/bugs/1723145 commit 85fdee1eef1a9e48ad5716916677e0c5fbc781e3 upstream. Enforcing exclusive owner... — committed to M-Bab/linux-kernel-amdgpu by amir73il 7 years ago
ovl: fix regression caused by exclusive upper/work dir protection BugLink: http://bugs.launchpad.net/bugs/1723145 commit 85fdee1eef1a9e48ad5716916677e0c5fbc781e3 upstream. Enforcing exclusive owner... — committed to walbon/ubuntu-artful by amir73il 7 years ago

Most upvoted comments

I just experienced this myself on Arch - thanks for your help resolving it!

For other Arch users looking for the tl;dr version of what people above are suggesting, create a file /etc/modprobe.d/overlay.conf with the following contents:

options overlay index=off

and reboot (or rmmod overlay && modprobe overlay). That resolved the issue for me.
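
To confirm the option took effect after reloading the module, the current value can be read back. The bash sketch below assumes the parameter is exposed at /sys/module/overlay/parameters/index, the usual sysfs location for module parameters. The optional path argument is a hypothetical hook for testing only.

```bash
# Sketch: report whether the overlay "index" feature is on for the loaded module.
# Assumption: the parameter appears at the default sysfs path below; the
# optional argument exists only so the function can be pointed at another file.
overlay_index_state() {
    local param=${1:-/sys/module/overlay/parameters/index}
    if [ ! -r "$param" ]; then
        echo "unknown"
        return 1
    fi
    # Boolean module parameters normally read back as Y or N
    case "$(cat "$param")" in
        Y|y|1|on)  echo "on" ;;
        N|n|0|off) echo "off" ;;
        *)         echo "unknown" ;;
    esac
}
```

If this prints on after the reboot or module reload, the modprobe.d file was probably not picked up.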

+41

pedantic-git on Dec 11, 2017

I’ve encountered this issue every time I do a system upgrade (Arch, Linux kernel 4.14.6-1). A way to fix this without having to reboot your machine is the following chain of commands (restarting docker and regenerating the dependency tree for system configuration services).

Bear in mind that this will only work if you have docker managed as a service rather than running it manually.

systemctl stop docker.service && systemctl start docker.service && systemctl daemon-reload

As a sidenote, I haven’t come across this issue during runtime, only after upgrading systemwide dependencies.

+11

peterver on Dec 20, 2017

This looks like another variant of #34573 (effectively mount leaks are causing EBUSY in a variety of places). Honestly we should re-architect Docker/containerd so that it spawns runC in different mount namespaces (with the rootfs mount only done privately in each mount namespace) so that the mounts won’t be leakable to other runcs.

cyphar on Sep 22, 2017

@huegelc Yes, as of 4.13.6. There are two possible scenarios which will result in this specific issue (based on my understanding):

Kernel between 4.13.0 and 4.13.6 — update your kernel or wait for docker to include #34948
Kernel >= 4.13.6, but with CONFIG_OVERLAY_FS_INDEX=y — disable that feature (either at module load time or by setting it to n) or wait for docker to include #34948

Without #34948, there will still be warnings in dmesg in those two scenarios, but they can be safely ignored.
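
The two scenarios can be written down as a small bash check. This is a sketch of the rule as stated above, not an official compatibility test. The function name and output labels are made up for illustration, and release candidates are treated like their base version.

```bash
# Sketch: map a kernel release string and an index state ("on"/"off") to the
# scenarios listed above. Names and output labels are illustrative only.
classify_kernel() {
    local release=$1 index=$2 base major minor patch n
    base=${release%%-*}                 # "4.13.0-rc6-coreos" -> "4.13.0"
    IFS=. read -r major minor patch <<<"$base"
    minor=${minor:-0}
    patch=${patch%%[!0-9]*}             # drop any trailing non-digit suffix
    patch=${patch:-0}
    # Pack into a single integer for comparison, e.g. 4.13.6 -> 4013006
    n=$(( major * 1000000 + minor * 1000 + patch ))
    if (( n < 4013000 )); then
        echo "unaffected"               # predates the 4.13 in-use check
    elif (( n < 4013006 )); then
        echo "update-kernel"            # scenario 1: 4.13.0 up to, not including, 4.13.6
    elif [ "$index" = "on" ]; then
        echo "disable-index"            # scenario 2: >= 4.13.6 with index enabled
    else
        echo "ok"                       # neither scenario applies
    fi
}
```

For example, classify_kernel "$(uname -r)" on classifies the running kernel on the assumption that the index feature is enabled.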

Because this is not a harmful issue on an up-to-date kernel and because the moby codebase (if not docker) has been updated to handle this issue better, I’m closing this bug.

euank on Nov 29, 2017

After a bit more adventuring, I believe I’ve tracked this down to two distinct problems which both cause this issue identically!

First, a helpful trick for reproducing this – add a 1-3 second sleep before pivoting root: https://github.com/opencontainers/runc/blob/593914b8bd5448a93f7c3e4902a03408b6d5c0ce/libcontainer/rootfs_linux.go#L98-L103

And a few hundred ms in the Put code here: https://github.com/moby/moby/blob/ba317637de9b9918cdc2139466dd51c6200bd158/daemon/graphdriver/overlay2/overlay.go#L610

After those changes, it reliably reproduces running just two containers at once, which made it much easier to continue investigating.

Anyways, it turns out that if you look through every mount namespace for references to the overlayfs mount, you’ll find that the runc init process for another container sometimes has a copy of it still mounted, despite it being unmounted and gone from the host mount namespace.

This copy is a private copy of the mount, meaning our host umount won’t get it, and we’re at the mercy of this other container’s runc init to eventually clean it up.
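
One way to look through every mount namespace is to scan each process's mountinfo file. The bash sketch below assumes the standard /proc/PID/mountinfo layout, where field 5 is the mount point as that process sees it. The function name and the optional proc-root argument are hypothetical, and it is not necessarily how the search was done here.

```bash
# Sketch: print the PIDs whose mount namespace still contains TARGET as a
# mount point. Run as root to be able to read other processes' mountinfo.
# Assumption: the second argument is only a testing hook; it defaults to /proc.
find_mount_holders() {
    local target=$1 procroot=${2:-/proc} f pid
    for f in "$procroot"/[0-9]*/mountinfo; do
        [ -r "$f" ] || continue
        # Field 5 of a mountinfo line is the mount point in that namespace
        if awk -v t="$target" '$5 == t { found = 1 } END { exit !found }' "$f" 2>/dev/null; then
            pid=${f%/mountinfo}
            echo "${pid##*/}"
        fi
    done
}
```

Running find_mount_holders /var/lib/docker/overlay2/ID/merged right after the EBUSY should list any runc init process that still holds the leaked copy. Note that a process sees mount points relative to its own root, so a path can differ after pivot_root.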

I’ve created a commented reproduction of what docker and runc are doing namespace and mount wise leading up to this ebusy:

#!/bin/bash
set -x

# c1 and c2 represent two different docker containers starting at once
c1=1
c2=2

function ovlOpts() {
	echo -n "lowerdir=$tmpdir/lower,upperdir=$tmpdir/$1/diff,workdir=$tmpdir/$1/work"
}

tmpdir=$(mktemp -d) # 'overlay2' graphdriver dir

mkdir -p $tmpdir/{$c1,$c2}/{diff,merged,work}
mkdir -p $tmpdir/lower

# overlay2 driver in its setup code does this
# https://github.com/moby/moby/blob/ba317637de9b9918cdc2139466dd51c6200bd158/daemon/graphdriver/overlay2/overlay.go#L178
mount --bind $tmpdir $tmpdir
mount --make-private $tmpdir

# Container 2 sets up 
# https://github.com/moby/moby/blob/ba317637de9b9918cdc2139466dd51c6200bd158/daemon/graphdriver/overlay2/overlay.go#L589
mount -t overlay overlay -o "$(ovlOpts $c2)" $tmpdir/$c2/merged 

# Container 1 starts setting up
mount -t overlay overlay -o "$(ovlOpts $c1)" $tmpdir/$c1/merged 

# Container 2 runs 'runc init' code in parallel
(
  # https://github.com/opencontainers/runc/blob/8b47a242a9aebdfe1c0c2b6513368f736d505bf0/libcontainer/nsenter/nsexec.c#L823
  unshare -m --propagation unchanged -- bash <<EOF
  # Now runc init remounts /
  # https://github.com/opencontainers/runc/blob/e385f67a0e45fa1d8ef8154e2aea5128ea1d331b/libcontainer/rootfs_linux.go#L599-L605
  # Due to how the config conversion works, config.RootPropagation is never 0,
  # and defaults instead to MS_PRIVATE | MS_REC. I'll PR a fix
  mount --make-rprivate /
  # Now a bunch of init stuff happens, including premount cmds and hooks
  sleep 1
  # .. and then pivot root happens which cleans up our old root
  # It's hard to do in shell, so we'll just pretend an umount of / is close enough
  # https://github.com/opencontainers/runc/blob/e385f67a0e45fa1d8ef8154e2aea5128ea1d331b/libcontainer/rootfs_linux.go#L676
  cd /
  umount -l .
EOF
) &
   
sleep 0.5

# While container2 is doing its init, container 1 unmounts and remounts its overlay
umount $tmpdir/$c1/merged
mount -t overlay overlay -o "$(ovlOpts $c1)" $tmpdir/$c1/merged 
# Boom, EBUSY on 4.13+ because `unshare -m` above has a private copy of the mount

sleep 1
# Now that the runc init code has pivoted and umounted its old root, we're able to mount without EBUSY
mount -t overlay overlay -o "$(ovlOpts $c1)" $tmpdir/$c1/merged 
umount $tmpdir/$c1/merged

# Cleanup
umount $tmpdir/$c2/merged
umount $tmpdir
rm -rf $tmpdir

Changing the runc-init mount to be rslave (as I think it was meant to be) and removing the MakePrivate call for the overlay2 graphdriver directory fixes the race. Even with the addition of the above-mentioned sleeps, I no longer am able to get EBUSY with those changes.

I’ll PR each of those changes shortly with suitable commit messages.

euank on Sep 22, 2017

@jpalczewski the index option can be set to default off as a module option, as described in the kconfig entry for it.

euank on Nov 13, 2017

@adambro Can you check your kernel options for the OVERLAY_FS_INDEX config option (e.g. with zgrep OVERLAY_FS_INDEX /proc/config.gz)?

If that option is set to yes, it’s expected that the kernel will still exhibit the same behaviour that leads to this failure, even with the above referenced patch.
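
The same check can be wrapped in a bash function that gives a single answer. This is a sketch that assumes the kernel exposes its config at /proc/config.gz, which not every distribution enables. The function name is made up, and the path argument lets it read a plain or gzipped config file instead.

```bash
# Sketch: print y, n, or unknown for CONFIG_OVERLAY_FS_INDEX.
# Assumption: /proc/config.gz exists (requires CONFIG_IKCONFIG_PROC); pass a
# different path, e.g. a file under /boot, if it does not.
overlay_index_kconfig() {
    local cfg=${1:-/proc/config.gz} line
    # gzip -cdf decompresses gzip input and passes plain text through unchanged
    line=$(gzip -cdf "$cfg" 2>/dev/null \
        | grep -E '^(# )?CONFIG_OVERLAY_FS_INDEX[= ]' | head -n1)
    case "$line" in
        CONFIG_OVERLAY_FS_INDEX=y) echo "y" ;;
        CONFIG_OVERLAY_FS_INDEX=n|"# CONFIG_OVERLAY_FS_INDEX is not set") echo "n" ;;
        *) echo "unknown" ;;
    esac
}
```

A result of y means the index feature defaults to on unless it is overridden when the module is loaded.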

As @banuchka indicates, this error shouldn’t occur on newer kernels so long as they don’t have that option set. That being said, there’s still a mount being leaked and dmesg will still show warnings.

euank on Oct 27, 2017

I reboot the computer when this happens 😃

sulliwane on Oct 24, 2017

I’ve dug into it more and I think the invalid argument messages are actually red herrings.

The EBUSY is what actually matters. With some stracing and added logging, it appears to me that the overlay2 driver’s locking mechanism is working just fine. I see the EBUSY on a mount call for a given directory even though a umount call to it had returned 0.

On a hunch, I removed the MNT_DETACH flag, but that didn’t make a difference.

At this point I suspect that this is a kernel bug. My next step is to try and write a reproduction that doesn’t involve dockerd.

euank on Sep 20, 2017