moby: failed to get event and rpc error "connect: connection refused"
Description
In my environment, Docker occasionally gets into the following state: the docker service status is reported as running, but Docker cannot run any new container or remove any old container. In effect, the daemon is broken.
It prints the same error log repeatedly, as follows:
nection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=moby
10月 24 17:15:47 slave2 dockerd[3948]: time="2019-10-24T17:15:47.543529097+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=plugins.moby
10月 24 17:15:47 slave2 dockerd[3948]: time="2019-10-24T17:15:47.543585977+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=moby
10月 24 17:15:47 slave2 dockerd[3948]: time="2019-10-24T17:15:47.543616057+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=plugins.moby
10月 24 17:15:47 slave2 dockerd[3948]: time="2019-10-24T17:15:47.543666737+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.543704917+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=plugins.moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.543743937+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.543781397+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=plugins.moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.543820037+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.543858637+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=plugins.moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.543900237+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.543934977+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=plugins.moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.543990177+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.544012357+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=plugins.moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.544066977+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.544089937+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=plugins.moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.544147137+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.544167057+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=plugins.moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.544227537+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.544302097+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.544250097+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=plugins.moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.544379097+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.544451877+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.544402857+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=plugins.moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.544529097+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.544550937+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=plugins.moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.544608457+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=moby
10月 24 17:15:48 slave2 dockerd[3948]: time="2019-10-24T17:15:47.544627677+08:00" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=plugins.moby
When I execute docker run <container>, I get:
docker run 6cf7c80fe444
docker: Error response from daemon: all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused": unavailable.
ERRO[0000] error waiting for container: context canceled
Steps to reproduce the issue: I don’t know how to reproduce it. Docker only returns to normal after I reboot the node it is running on.
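Before rebooting, it may help to record what state the containerd socket is in, since "connection refused" on a unix socket usually means the socket file exists but no process is accepting on it. The following is only a sketch: socket_state is a hypothetical helper name, the default path is the one from the error message, and it assumes the ss tool from iproute2 is installed.

```shell
#!/bin/sh
# Sketch: classify the containerd socket that dockerd dials.
# socket_state is a hypothetical helper, not part of Docker or containerd.
socket_state() {
    sock="$1"
    if [ ! -e "$sock" ]; then
        echo "missing"          # no file at that path
    elif [ ! -S "$sock" ]; then
        echo "not-a-socket"     # something else occupies the path
    elif ! command -v ss >/dev/null 2>&1; then
        echo "unknown"          # cannot inspect listeners without ss
    elif ss -xl | awk -v s="$sock" \
        '{ for (i = 1; i <= NF; i++) if ($i == s) found = 1 } END { exit !found }'; then
        echo "listening"        # some process is accepting on this exact path
    else
        echo "stale"            # file exists but nothing listens on it
    fi
}

# Default path is taken from the error message in this report.
socket_state "${1:-/run/containerd/containerd.sock}"
```

A result of "stale" or "missing" would be consistent with containerd having exited while dockerd kept running; "listening" would point elsewhere.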
Additional information you deem important (e.g. issue happens only occasionally):
Output of docker version:
root@slave2:~# docker version
Client:
Version: 18.09.8
API version: 1.39
Go version: go1.10.8
Git commit: 0dd43dd
Built: Wed Jul 17 17:45:38 2019
OS/Arch: linux/arm64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 18.09.8
API version: 1.39 (minimum version 1.12)
Go version: go1.10.8
Git commit: 0dd43dd
Built: Wed Jul 17 17:07:47 2019
OS/Arch: linux/arm64
Experimental: false
Output of docker info:
root@slave2:~# docker info
Containers: 79
Running: 59
Paused: 0
Stopped: 20
Images: 189
Server Version: 18.09.8
Storage Driver: overlay2
Backing Filesystem: xfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 894b81a4b802e4eb2a91d1ce216b8817763c29fb
runc version: 425e105d5a03fabd737a126ad93d62a9eeede87f
init version: fec3683
Security Options:
seccomp
Profile: default
Kernel Version: 4.4.58-20180615.kylin.server.YUN+-generic
Operating System: Kylin 4.0.2
OSType: linux
Architecture: aarch64
CPUs: 16
Total Memory: 62.89GiB
Name: slave2
ID: RBMJ:WBJ5:VBS5:SKYH:BVJK:6KH6:MIPN:QAWB:2ALL:XGPG:STBD:SMJW
Docker Root Dir: /opt/cke/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
registry.icp.com:5000
registry.inspurspring.com
docker.inspur.com:5000
10.150.0.0/16
127.0.0.0/8
Live Restore Enabled: true
Product License: Community Engine
Additional environment details (AWS, VirtualBox, physical, etc.): physical
About this issue
- State: closed
- Created 5 years ago
- Reactions: 7
- Comments: 27 (2 by maintainers)
Same problem here. Stopped Ubuntu Snap docker and syslog spam stopped:
snap stop docker
snap services
Service                                Startup  Current   Notes
bootstack-elasticsearch.elasticsearch  enabled  inactive  -
docker.dockerd                         enabled  inactive  -
graylog.graylog                        enabled  active    -
Same here: Ubuntu 18.04.3 Server, Docker version 18.09.7, build 2d0083d
Update:
Update 2: Snap tells me I’ve got
docker 18.09.9 418 stable canonical✓ -
but it’s Docker version 18.09.7, build 2d0083d
- reinstall of docker via snap doesn’t help.
Update 3: FIX
systemctl disable containerd
and everything is happier.
Haven’t had a chance to dig deep, but the problem seems to be a conflict between snap.docker.dockerd.service and containerd.service.
Disabling/stopping containerd seems to have solved the issue for me.
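To check whether a host is in the situation suspected above, one could look at whether both units are active at the same time. This is a rough sketch, not a confirmed diagnosis: the unit names are the ones mentioned in this thread and may not exist on every host, and conflict_hint and unit_state are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: report whether both units named in this thread are active.
# conflict_hint and unit_state are hypothetical helpers.
conflict_hint() {
    if [ "$1" = "active" ] && [ "$2" = "active" ]; then
        echo "possible-conflict"
    else
        echo "no-conflict-detected"
    fi
}

unit_state() {
    # systemctl is-active prints the state and exits non-zero when not active.
    state="$(systemctl is-active "$1" 2>/dev/null)" || true
    echo "${state:-unknown}"
}

snap_state="$(unit_state snap.docker.dockerd.service)"
ctrd_state="$(unit_state containerd.service)"
echo "snap.docker.dockerd.service: $snap_state"
echo "containerd.service: $ctrd_state"
conflict_hint "$snap_state" "$ctrd_state"
```

"possible-conflict" only means both units are running, which is the condition the comments above associate with the log spam; it does not prove that one is interfering with the other.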
This is what I did as a temporary fix:
First, stop the error messages. I just ran
kill <pid>
where <pid> is the number in dockerd[12345] in the error message (read it via tail -f /var/log/syslog).
To clear the syslog I ran
cat /dev/null > /var/log/syslog
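Reading the PID out of the log by eye can be replaced with a small filter. The sketch below assumes the syslog lines look like the ones quoted in this issue (a dockerd[<pid>] tag on each line); dockerd_pid is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: pull the dockerd PID out of syslog lines read from stdin.
# dockerd_pid is a hypothetical helper; it prints the PID from the last matching line.
dockerd_pid() {
    sed -n 's/.*dockerd\[\([0-9][0-9]*\)\].*/\1/p' | tail -n 1
}

# Example with a shortened line in the format quoted in this issue.
printf '%s\n' 'slave2 dockerd[3948]: level=error msg="failed to get event"' | dockerd_pid
```

On a real host this might be used as tail -n 200 /var/log/syslog | dockerd_pid, with the printed number then passed to kill as described above.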
It is definitely not a cosmic coincidence that 9 people have reported this problem in the last 4 hours…