moby: Upgrade to 1.12.6-cs results in dockerd unresponsive
Description
When docker is upgraded from 1.12.3-cs4 to 1.12.6-cs9 (or cs10), dockerd hangs in specific situations.
Steps to reproduce the issue:
- Create an Ubuntu 16.04 box (I used an EC2 r3.large instance, ami-f4cc1de2, us-east-1)
- Install docker-engine 1.12.3-cs4
curl -fsSL 'https://sks-keyservers.net/pks/lookup?op=get&search=0xee6d536cf7dc86e2d7d56f59a178ac6c6238f52e' | sudo apt-key add -
add-apt-repository \
"deb https://packages.docker.com/1.12/apt/repo/ \
ubuntu-$(lsb_release -cs) \
main"
apt-get update
apt-get install --no-install-recommends \
apt-transport-https \
curl \
software-properties-common
apt-get -y install docker-engine=1.12.3~cs4-0~xenial
- Run 100 containers and try to run docker ps a few times. It will run fast.
for i in {1..100}; do docker run -d -it --restart=always --name poc_$i talves/health_poc; done
then
time docker ps -qa | wc -l
- Upgrade to 1.12.6-cs10
apt-get -y install docker-engine
- Try to run one container (the command will run forever)
docker run -d -it --restart=always --name poc_1_12_6 talves/health_poc
- Try to run docker ps (it will take ages)
time docker ps -qa | wc -l
- Downgrade to cs4
apt-get install docker-engine=1.12.3~cs4-0~xenial
- Repeat steps 5 and 6 (run one container, then run docker ps). They will work fine. If you upgrade to 17.03-ce, it will also work fine.
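To tell a stalled daemon from a merely slow one while repeating these steps, the docker commands can be wrapped in a time limit. This is a minimal sketch, not part of the original report: `probe` is a hypothetical helper name, and the 30-second limit in the example is an arbitrary assumption. It relies on coreutils `timeout`, which exits with status 124 when the limit is reached.

```shell
#!/usr/bin/env bash
# probe LIMIT CMD [ARGS...]
# Hypothetical helper: runs CMD under a time limit and prints
#   ok    - the command finished successfully within LIMIT seconds
#   hang  - the command was still running when the limit expired
#   error - the command finished with a non-zero exit status
probe() {
  local limit="$1"; shift
  local status=0
  # timeout(1) exits with 124 when it has to kill the command.
  timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
  case "$status" in
    0)   echo "ok" ;;
    124) echo "hang" ;;
    *)   echo "error" ;;
  esac
}

# Example (assumed threshold of 30 seconds):
#   probe 30 docker ps -qa
```

On the affected versions, `probe 30 docker ps -qa` would be expected to print `hang` if the daemon is unresponsive for longer than the limit.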
Describe the results you received: Docker 1.12.3 and 17.03 are fast, but 1.12.6-cs9 and cs10 are very slow under certain conditions
Describe the results you expected: Fast response of dockerd, regardless of docker version
Additional information you deem important (e.g. issue happens only occasionally): I am using a custom docker image for tests, but you will get similar results if you use other images with healthcheck enabled:
docker run -d -it --restart=always --health-cmd='curl --fail http://localhost/ || exit 1' --health-interval 10s --health-timeout 1s --health-retries 3 --name poc_0 php:7.0-apache
Output of docker version:
Client:
Version: 1.12.6-cs10
API version: 1.24
Go version: go1.6.4
Git commit: 54bb958
Built: Mon Mar 6 03:49:00 2017
OS/Arch: linux/amd64
Server:
Version: 1.12.6-cs10
API version: 1.24
Go version: go1.6.4
Git commit: 54bb958
Built: Mon Mar 6 03:49:00 2017
OS/Arch: linux/amd64
Output of docker info:
Containers: 101
Running: 48
Paused: 0
Stopped: 53
Images: 1
Server Version: 1.12.6-cs10
Storage Driver: devicemapper
Pool Name: docker-202:1-275400-pool
Pool Blocksize: 65.54 kB
Base Device Size: 10.74 GB
Backing Filesystem: xfs
Data file: /dev/loop0
Metadata file: /dev/loop1
Data Space Used: 1.279 GB
Data Space Total: 107.4 GB
Data Space Available: 5.401 GB
Metadata Space Used: 13.53 MB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.134 GB
Thin Pool Minimum Free Space: 10.74 GB
Udev Sync Supported: true
Deferred Removal Enabled: false
Deferred Deletion Enabled: false
Deferred Deleted Device Count: 0
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Library Version: 1.02.110 (2015-10-30)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: host null bridge overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 4.4.0-64-generic
Operating System: Ubuntu 16.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 14.94 GiB
Name: ip-10-69-11-232
ID: NQHR:6JJ6:REF6:6GMR:5PTF:QB4Z:EANI:PYUH:UMFI:Q2TH:ZC4Z:YUVM
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
127.0.0.0/8
Additional environment details (AWS, VirtualBox, physical, etc.): EC2 r3.large instance, ami-f4cc1de2, us-east-1
About this issue
- State: closed
- Created 7 years ago
- Comments: 17 (16 by maintainers)
When you get a hang, can you send SIGUSR1 to the docker daemon? There should be a stack trace in the daemon logs. Thanks!
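The request above could be carried out roughly as follows. This is a sketch under assumptions: `request_stack_dump` is a hypothetical helper name, the daemon process is assumed to be named `dockerd`, and the log location (the systemd journal for `docker.service`, as is typical on Ubuntu 16.04) is an assumption about the reporter's setup. It needs to run as root.

```shell
#!/usr/bin/env bash
# request_stack_dump [PROCESS_NAME]
# Hypothetical helper: sends SIGUSR1 to every process with the given
# name (default: dockerd) so the daemon writes its stack trace to its logs.
request_stack_dump() {
  local name="${1:-dockerd}"
  local pids
  pids="$(pidof "$name")" || {
    echo "no $name process found" >&2
    return 1
  }
  # $pids is intentionally unquoted: pidof may return several PIDs.
  kill -USR1 $pids
}

# Example, run as root while the daemon is hung:
#   request_stack_dump
# Then look for the stack trace in the daemon logs. Assuming systemd
# manages the daemon as docker.service:
#   journalctl -u docker.service --since "5 minutes ago"
```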
+1
I also just got an unresponsive docker daemon (i.e. all docker commands stall indefinitely) after upgrading to 1.12.6.
@thiagoalves can you also provide the daemon logs (on top of the stack trace requested by @cpuguy83)? There may be an issue during startup.
Is live-restore enabled?
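One way to answer this is to look for the `live-restore` key in the daemon configuration file. This is a minimal sketch: `live_restore_in_config` is a hypothetical helper name, and `/etc/docker/daemon.json` is assumed to be the configuration path in use. It uses a plain pattern match rather than a JSON parser, so it is only a rough check.

```shell
#!/usr/bin/env bash
# live_restore_in_config [FILE]
# Hypothetical helper: succeeds (exit 0) if FILE sets "live-restore" to true.
# FILE defaults to /etc/docker/daemon.json (assumed location).
live_restore_in_config() {
  local file="${1:-/etc/docker/daemon.json}"
  [ -r "$file" ] &&
    grep -Eq '"live-restore"[[:space:]]*:[[:space:]]*true' "$file"
}

# Example:
#   if live_restore_in_config; then
#     echo "live-restore is enabled in daemon.json"
#   else
#     echo "live-restore is not set to true in daemon.json"
#   fi
```

The option may also be passed as `--live-restore` on the dockerd command line, which this sketch does not check.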