moby: docker ps hangs. How to diagnose it?
Description
I have 66 containers running on an Ubuntu VM (VMware ESXi):
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.5 LTS"
free -m
             total       used       free     shared    buffers     cached
Mem:         16047       8850       7196         13        370        969
-/+ buffers/cache:       7510       8537
Swap:         4095          0       4095
Docker version 1.12.3, build 6b644ec
Every day I come to work and find that my deploy system cannot connect to Docker:
fatal: [dev6]: FAILED! => {"changed": false, "failed": true, "msg": "ReadTimeout(ReadTimeoutError(\"UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)\",),)"}
The docker ps command hangs. I can't do anything except reboot the server.
How can I diagnose it?
root@docker-01:~# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 64100
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65535
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 65535
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
root@docker-01:~# cat /etc/default/docker
DOCKER_OPTS="-H tcp://10.129.4.103:2375 -H unix:///var/run/docker.sock"
root@docker-01:~# lsof -w | wc -l
167224
root@docker-01:~# docker info
Containers: 65
Running: 62
Paused: 0
Stopped: 3
Images: 601
Server Version: 1.12.3
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 783
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 4.2.0-42-generic
Operating System: Ubuntu 14.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.67 GiB
Name: docker-01.public.exness.local
ID: K6WX:ERB4:NB7X:TDBL:DGMN:OKWQ:ECCZ:N5W3:CLPC:GALB:3O5E:4JUS
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
 127.0.0.0/8
About this issue
- State: closed
- Created 8 years ago
- Comments: 18 (8 by maintainers)
First thought: I don’t like the way srslog’s connect function demands that you must Lock before calling it. This means we’re holding the lock far too long, and putting responsibility for it in the wrong place.
I need to investigate whether it’s possible to make that locking more granular, and more focused on getting access to the conn rather than the use of it. (Because using conn should be threadsafe.)
Meanwhile, have you come up with a repeatable test case or is it only happening occasionally/intermittently?
Reboot away, I’ll check out this stack trace.