moby: docker exec and docker stop give "rpc error: code = 14 desc = grpc: the connection is unavailable" (v17.04.0-ce-rc1)

Description

docker exec and docker stop fail with "rpc error: code = 14 desc = grpc: the connection is unavailable". The docker-containerd daemon somehow gets killed some time after the setup is brought up, while dockerd and the docker-containerd-shim processes keep running. This issue is observed with Docker 1.12.x, 1.13.x, 17.03.0, and the latest version, 17.04.0-ce-rc1.

This issue is similar to https://github.com/docker/docker/issues/31074, which was closed on the grounds that it was fixed by https://github.com/docker/docker/pull/31662. The problem still exists with that PR applied.

Describe the results you received:

A few hours after the setup was brought up, I observed the following:

  • docker ps works.
  • The dockerd daemon is up and running.
  • The docker-containerd-shim processes of the running containers exist.
  • The docker-containerd daemon is not running.

sh-4.2# docker stop ac2a021285f8
Error response from daemon: Cannot stop container ac2a021285f8: Cannot kill container ac2a021285f893262cf308b6eb6d4533c7e3ff06779b8c529ebd914623a06141: rpc error: code = 14 desc = grpc: the connection is unavailable
sh-4.2# docker exec -it ac2a021285f8 sh
rpc error: code = 14 desc = grpc: the connection is unavailable
sh-4.2#
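
The state described above (dockerd and the shims alive, docker-containerd gone) can be checked in one step. The following is only a minimal sketch: the function name docker_proc_report is hypothetical, and it assumes the three processes show up under exactly the names used in this report. It classifies command lines read from stdin by the basename of their first word, which avoids matching docker-containerd-shim when looking for docker-containerd.

```shell
# Read one command line per line on stdin and report which of the three
# Docker processes named in this report are present.
# docker_proc_report is a hypothetical helper, not part of Docker.
docker_proc_report() {
    awk '
        # basename of the first word of the command line
        { n = split($1, parts, "/"); name = parts[n] }
        name == "dockerd"                { d = 1 }
        name == "docker-containerd"      { c = 1 }
        name == "docker-containerd-shim" { s = 1 }
        END {
            printf "dockerd=%s\n", (d ? "up" : "down")
            printf "docker-containerd=%s\n", (c ? "up" : "down")
            printf "docker-containerd-shim=%s\n", (s ? "up" : "down")
        }'
}
```

On a Linux host with procps, it could be fed with: ps -eo args= | docker_proc_report. In the situation reported here, one would expect docker-containerd=down while the other two lines report up.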

Additional information you deem important (e.g. issue happens only occasionally): Running Docker 17.04 on CentOS 7.3 with kernel 4.8.3-1.el7.elrepo.x86_64. The overlay2 driver is used, with /mnt/docker as the mount point for Docker data. /mnt/docker is a separate EBS volume with an XFS filesystem created with ftype=1. Docker daemon start command:

/bin/dockerd -H unix:///var/run/docker.sock -H tcp://0.0.0.0:2375 -g /mnt/docker --bip=172.17.0.1/24 --dns=172.17.0.1 --storage-driver=overlay2 --log-level=info --cluster-store=etcd://10.0.10.10:2379 --insecure-registry docker-registry.service.local:5040 --log-driver=syslog

Output of docker version:

Client:
 Version:      17.04.0-ce-rc1
 API version:  1.28
 Go version:   go1.7.5
 Git commit:   d2532c6
 Built:        Thu Mar 16 06:57:09 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.04.0-ce-rc1
 API version:  1.28 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   d2532c6
 Built:        Thu Mar 16 06:57:09 2017
 OS/Arch:      linux/amd64
 Experimental: false

Output of docker info:

Containers: 16
 Running: 16
 Paused: 0
 Stopped: 0
Images: 16
Server Version: 17.04.0-ce-rc1
Storage Driver: overlay2
 Backing Filesystem: xfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: syslog
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: 
containerd version: N/A (expected: 422e31ce907fd9c3833a38d7b8fdd023e5a76e73)
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.8.3-1.el7.elrepo.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 14.69GiB
Name: mcoverlay2-worker-1
ID: S4ID:SSDI:C6RJ:NZUO:JTBE:4X4O:4364:OTJD:INWW:OCE2:4DBF:H3WM
Docker Root Dir: /mnt/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Cluster Store: etcd://10.0.10.10:2379
Insecure Registries:
 docker-registry.service.local:5040
 127.0.0.0/8
Live Restore Enabled: false

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
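
The docker info output above reports "containerd version: N/A" next to an expected commit. In this report that coincides with docker-containerd not running; treating N/A as a sign that containerd is unreachable is an assumption drawn from this one case, not documented behavior. Under that assumption, a small check could look like the sketch below (the function name containerd_version_unknown is hypothetical).

```shell
# Succeed (exit 0) if the docker info text on stdin contains a
# "containerd version: N/A" line, as seen in this report.
# containerd_version_unknown is a hypothetical helper, not part of Docker.
containerd_version_unknown() {
    grep -Eq '^[[:space:]]*containerd version:[[:space:]]*N/A'
}
```

It could be used as: docker info 2>/dev/null | containerd_version_unknown && echo "containerd version is N/A". Because it only reads text, it also works on a saved copy of the output.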

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 39 (27 by maintainers)

Most upvoted comments

@mlaventure I figured out that the command is a little different:

docker-containerd -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --shim docker-containerd-shim --runtime docker-runc --debug

This is the output:

sh-4.2# docker-containerd -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --shim docker-containerd-shim --runtime docker-runc --debug
DEBU[0000] containerd: grpc api on /var/run/docker/libcontainerd/docker-containerd.sock 
DEBU[0000] containerd: read past events                  count=635
DEBU[0000] containerd: container restored                id=04e13d5fca05ae3e2a856fe961425607698b927d21fe4a57d64eb3c69fd9f10f
ERRO[0000] containerd: notify OOM events                 error="open /proc/27840/cgroup: no such file or directory"
DEBU[0000] containerd: container restored                id=1cd0532e331134414bc90b96559a0574999d32c536586c5e4f4d3da606ec3208
DEBU[0000] containerd: container restored                id=57b92c0a2b58e0047291193992a66b7440e0bec5351e292762983423c199337c
DEBU[0000] containerd: container restored                id=830ae61604e4f71983255957aa78e405f9498137ed8c0c22142eaa7a398ce81a
DEBU[0000] containerd: container restored                id=8338ba04cdba13534c8145c15bc6ce088275dbfc69f14749d3da04b782b3c573
DEBU[0000] containerd: container restored                id=8cae52365d938e5a56d4d96efc3a72721e80dc11be98b17fb18727c2faef7b5e
DEBU[0000] containerd: container restored                id=b49feae95d3715154b51edd348e0e511ea274ffa7eb2a21c780e7b0f3c7d8d0b
DEBU[0000] containerd: container restored                id=b65f153652038b6d7aadfc8849f738071ee973f64bf20e138bc6f3cd11e0f405
DEBU[0000] containerd: container restored                id=b9e9837fb44c5d566b8817b75d4e4101db5e07dd017b8322b1c7c7abcaf68971
DEBU[0000] containerd: container restored                id=cd95305e1610881cdd651658a198c96a696ad0559b40bf2dde3714e25465082e
DEBU[0000] containerd: container restored                id=d4adc2ebc0250f713da301765c7ecb1302a8ad9ba89fe77cc06465e035591676
DEBU[0000] containerd: supervisor running                cpus=4 memory=15037 runtime=docker-runc runtimeArgs=[] stateDir="/var/run/docker/libcontainerd/containerd"
DEBU[0000] containerd: process exited                    id=1cd0532e331134414bc90b96559a0574999d32c536586c5e4f4d3da606ec3208 pid=339f76275ddc3c6ad66f504eadb2de1e0f85ad5caf6d54382088948d7e4d4b1d status=1 systemPid=28164
DEBU[0000] containerd: process exited                    id=1cd0532e331134414bc90b96559a0574999d32c536586c5e4f4d3da606ec3208 pid=init status=1 systemPid=27840
DEBU[0000] truncating event log                         
DEBU[0000] containerd: process exited                    id=b49feae95d3715154b51edd348e0e511ea274ffa7eb2a21c780e7b0f3c7d8d0b pid=8969c843016f2156b09dfaf4354dd15cad374bd456c75717cfb77d02df446849 status=1 systemPid=28119
DEBU[0000] containerd: process exited                    id=b65f153652038b6d7aadfc8849f738071ee973f64bf20e138bc6f3cd11e0f405 pid=c013e4979b056e118568411678b60e3152f0a02d8693c26e0a91a0c02f638932 status=1 systemPid=28050
DEBU[0007] containerd: process exited                    id=2b793bae0e2ead1d4c938d5624e48e47573d9a7707b0ea610f8e0bfee1fc7d6b pid=init status=1 systemPid=7734


DEBU[0042] containerd: process exited                    id=a195800f6c818c2e9827c88c5eef4d7ef2cd093825b9ed0cf147b239a5d0cb12 pid=init status=1 systemPid=8906
^CINFO[0042] stopping containerd after receiving interrupt 
sh-4.2#
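
A debug log like the one above is easier to compare across runs once it is reduced to a few counts. The sketch below is one way to do that, assuming the log keeps the message texts shown here ("container restored", "process exited", and the ERRO level prefix); the function name summarize_containerd_log is hypothetical.

```shell
# Summarize a docker-containerd --debug log read from stdin:
# restored containers, exited processes, exited init processes, error lines.
# summarize_containerd_log is a hypothetical helper, not part of containerd.
summarize_containerd_log() {
    awk '
        /containerd: container restored/ { restored++ }
        /containerd: process exited/ {
            exited++
            # pid=init marks the main process of a container in this log
            if ($0 ~ / pid=init /) init_exits++
        }
        /^ERRO/ { errors++ }
        END {
            printf "restored=%d exited=%d init_exited=%d errors=%d\n",
                   restored, exited, init_exits, errors
        }'
}
```

With the output saved to a file, it could be run as: summarize_containerd_log < containerd-debug.log (the file name is only an example).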