moby: Race condition in container creation using devicemapper "no such file or directory"

This would be a duplicate of #4036, but I was told to open a new bug report.

Using CentOS7:

$ docker info | grep Storage
Storage Driver: devicemapper
$ docker info |grep -i "udev.*sync"
 Udev Sync Supported: true
$ docker info | grep -i loop
(no response)

This is an example of the error I receive:

docker run -d -v /opt/Xilinx/:/opt/Xilinx/:ro -v /data/jenkins_workspaces/dds:/build --name jenkins-dds-460 jenkins/build:v2-C6
5eadc30da8d1f60a13db3392ba751b197079c579a3dbba43f6a092269a475320
Error response from daemon: Cannot start container 5eadc30da8d1f60a13db3392ba751b197079c579a3dbba43f6a092269a475320: Error getting container 5eadc30da8d1f60a13db3392ba751b197079c579a3dbba43f6a092269a475320 from driver devicemapper: open /dev/mapper/docker-253:6-101737821-5eadc30da8d1f60a13db3392ba751b197079c579a3dbba43f6a092269a475320: no such file or directory

A few days ago, Docker totally trashed all of its containers. After some research, I learned that on CentOS 7 hosts the preferred setup is a “raw” LVM thin-provisioned pool, so that’s what I did:

$ cat /etc/systemd/system/docker.service.d/0-move-library.conf
[Service]
ExecStart=
ExecStart=/usr/bin/docker daemon --exec-opt native.cgroupdriver=cgroupfs -H fd:// --storage-driver devicemapper --storage-opt dm.fs=xfs --storage-opt dm.thinpooldev=/dev/mapper/vg_ex-docker--pool --storage-opt dm.use_deferred_removal=true

# Needs newer systemd than CentOS uses: https://bugzilla.redhat.com/show_bug.cgi?id=1200946
# --storage-opt dm.use_deferred_deletion=true

I’m not listing all the LVM steps here, but it was working for a few days…

docker info

$ cat /etc/redhat-release 
CentOS Linux release 7.1.1503 (Core) 

$ uname -a
Linux redacted.example.com 3.10.0-229.20.1.el7.x86_64 #1 SMP Tue Nov 3 19:10:07 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

$ docker --version
Docker version 1.9.1, build a34a1d5

$ docker info
Containers: 9
Images: 35
Server Version: 1.9.1
Storage Driver: devicemapper
 Pool Name: vg_ex-docker--pool
 Pool Blocksize: 65.54 kB
 Base Device Size: 107.4 GB
 Backing Filesystem: xfs
 Data file: 
 Metadata file: 
 Data Space Used: 20.51 GB
 Data Space Total: 322.1 GB
 Data Space Available: 301.6 GB
 Metadata Space Used: 14.26 MB
 Metadata Space Total: 4.001 GB
 Metadata Space Available: 3.987 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Library Version: 1.02.107-RHEL7 (2015-10-14)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.10.0-229.20.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)
CPUs: 56
Total Memory: 251.6 GiB

It definitely seems to be a race condition, triggered when Jenkins tries to spawn a handful of containers at once. I tried this manually on the server:

for i in $(seq 1 25); do
  docker run -d -v /opt/Xilinx/:/opt/Xilinx/:ro -v /data/jenkins_workspaces/examples:/build --name jenkins-examples-485 jenkins/ocpibuild:v2-C6; 
  echo $?; 
  docker rm -f jenkins-examples-485; 
done

It ran flawlessly. However, if I background the runs instead, forcing all 25 containers to be created at once:

for i in $(seq 1 25); do 
  docker run -d -v /opt/Xilinx/:/opt/Xilinx/:ro -v /data/jenkins_workspaces/examples:/build --name jenkins-ctests-474-${i} jenkins/ocpibuild:v2-C6 & 
  echo $?; 
done
b66c5c2276b40854fc9f7b21e6bf153d861a2493638fb277195ad6a9376f8d32
621dc1b143f1a8728fb93a8227231a7b4488df0d96368e65d21956a6d6272004
6c6d1b417cd343469804c16f4d8479775f3aba905014d72cdb061f141f2ba816
127fdfa18b8d332da7a26e9cd9cb2f49ee4d4a000cf7ea2fe20858fffe1d7a09
66c0100da826fb6a94c084b3f4d9c53583b3e699e8a26076035c0f09bf61ec30
d16237d791da495d711483dc02c7021ab42781c001eb46128a064b3f1ddb245e
72176e1075e4a837ae7abeee9bfaee30b2f08f27cfa3542a66ebcd6a51115326
7d10de31239df16bca6c69d31d151d5ad7f5e44036f98e394e14fa9853408824
a582799ff489507fd13b02b3e2f2032a93c7f1bfbc88d443fee4526b8bddf84d
84de8f525c76ca1b64ea0c25f63a885a0e35ffa0cf2493361c56270635149023
0f104d795cb386542f0850bf964a5f426fe01476d8b56a414fdd98c6f145a973
1ad7c18178d54a53fa4fa4cda67de75923af1c965527a713542f73ac8a58cc52
ce153bb43b1fdb4ad193e72d36e8451baedf97c8c81bbe18f19818f7492eaf74
29ccbf74238de02c96eeee1eabd2f45e9163ab0633d96465b76d10a83e81f947
f178c90fb051303180bfcd0cde370526bd4a96f37ed94b3bcf56350d989e267f
46e91895d8c5c475bee7db2dc1a4c0592d4a715628dce807eff353739c7aeb4e
4c36736d89d06d62509a70203c66b74dd6ce37e7de6e6c5a2ffb3174525eed0d
86a913665de1522922726e901a54800414cc9858679168228fa626a255704d4c
c38da55732a1ff5df62238391b042a2129ad701ab2ddb826f99c81326bf26f74
57b6e0356ca25b3746c680bd6a535c514d64f4bcb50b900b9a93018207034b72
7651b72be21fb358b3fc01bd006b6e0c8fa8eb958e1cb1719e9ac223e214770e
521cc795b9e9f65cc5651a457593b540369188e17281657488490e36b62f370a
Error response from daemon: Cannot start container 7651b72be21fb358b3fc01bd006b6e0c8fa8eb958e1cb1719e9ac223e214770e: Error getting container 7651b72be21fb358b3fc01bd006b6e0c8fa8eb958e1cb1719e9ac223e214770e from driver devicemapper: open /dev/mapper/docker-253:6-101737821-7651b72be21fb358b3fc01bd006b6e0c8fa8eb958e1cb1719e9ac223e214770e: no such file or directory
Error response from daemon: Cannot start container 521cc795b9e9f65cc5651a457593b540369188e17281657488490e36b62f370a: Error getting container 521cc795b9e9f65cc5651a457593b540369188e17281657488490e36b62f370a from driver devicemapper: open /dev/mapper/docker-253:6-101737821-521cc795b9e9f65cc5651a457593b540369188e17281657488490e36b62f370a: no such file or directory
ac57455b5030d56f163b45e7d1f05856c0429823fc708b37aa21b5925f70da37

So 2 of 25 failed.
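One caveat about the backgrounded loop: `echo $?` placed right after `&` only reports that the job was launched, so it prints 0 even for runs that later fail. A minimal sketch of a variant that collects each job's real exit status with `wait` follows. The helper names `run_parallel` and `start_container` are hypothetical, the container name and image are placeholders, and the sketch assumes `docker run -d` exits non-zero when the daemon cannot start the container.

```shell
#!/bin/sh
# run_parallel N CMD: start "CMD 1" .. "CMD N" in the background,
# then collect every job's real exit status with wait.
run_parallel() {
  rp_n=$1
  rp_cmd=$2
  rp_pids=""
  rp_failed=0
  rp_i=1
  while [ "$rp_i" -le "$rp_n" ]; do
    "$rp_cmd" "$rp_i" &
    rp_pids="$rp_pids $!"      # remember the PID of the job just launched
    rp_i=$((rp_i + 1))
  done
  for rp_pid in $rp_pids; do
    # wait PID returns the exit status of that specific background job.
    wait "$rp_pid" || rp_failed=$((rp_failed + 1))
  done
  echo "failed: $rp_failed of $rp_n"
}

# Hypothetical job: name and image are placeholders, not the reporter's exact setup.
start_container() {
  docker run -d --name "race-$1" fedora bash
}
```

Calling `run_parallel 25 start_container` would then print a single failure count once all 25 jobs have finished.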

And they “stay dead”: I can never start them, even though their status is “Created”:

$ docker ps -a | grep 521cc795b9
521cc795b9e9        jenkins/ocpibuild:v2-C6   "/bin/sleep 1h"     2 minutes ago       Created                                       jenkins-ctests-474-3
$ docker start jenkins-ctests-474-3
Error response from daemon: Cannot start container jenkins-ctests-474-3: Error getting container 521cc795b9e9f65cc5651a457593b540369188e17281657488490e36b62f370a from driver devicemapper: open /dev/mapper/docker-253:6-101737821-521cc795b9e9f65cc5651a457593b540369188e17281657488490e36b62f370a: no such file or directory
Error: failed to start containers: [jenkins-ctests-474-3]
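To confirm that the device node named in the error is really absent, and not just briefly missing during the race, the path can be checked directly. This is a rough sketch: `check_dm_node` is a hypothetical helper, and the `<dir>/<prefix>-<container id>` layout is taken only from the error messages above, not from any documented interface.

```shell
#!/bin/sh
# check_dm_node DIR PREFIX ID: report whether DIR/PREFIX-ID exists.
# The path layout is an assumption based on the daemon's error message.
check_dm_node() {
  cdn_node="$1/$2-$3"
  if [ -e "$cdn_node" ]; then
    echo "present: $cdn_node"
  else
    echo "missing: $cdn_node"
    return 1
  fi
}
```

For the failing container above, the call would be `check_dm_node /dev/mapper docker-253:6-101737821 521cc795b9e9f65cc5651a457593b540369188e17281657488490e36b62f370a`.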

About this issue

  • State: closed
  • Created 8 years ago
  • Reactions: 2
  • Comments: 42 (19 by maintainers)

Most upvoted comments

I ran the following on Fedora 23 twice (with Docker 1.10-dev) and could not reproduce the problem. Is there any chance you could try running on Fedora 23 and see whether you can still reproduce it?

for i in $(seq 1 25); do docker run -d --name race-${i} fedora bash & echo $?; done