moby: Docker daemon crashes if fluentd daemon is gone and buffer is full
The Docker daemon crashes if all of the following happen:
- a container is configured to log to fluentd
- the fluentd server goes away (network problem or planned shutdown)
- the log buffer overflows
Steps to reproduce the issue:
- Configure Logstash with a fluentd input on some host.
- Run:
docker run -d --log-driver=fluentd --log-opt fluentd-address=172.17.X.X:4000 --log-opt tag="test" --log-opt fluentd-buffer-limit=10KB --log-opt fluentd-max-retries=2 --name test1 --rm busybox /bin/sh -c 'yes "crashme" '
Yes, the limits are deliberately low, to expose the Docker behavior quickly.
- Make sure that Logstash receives messages.
- Stop Logstash.
- On the Docker host, run:
watch 'docker ps'
Within a matter of seconds, this shows that the Docker daemon is no longer available.
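The effect of the low `fluentd-buffer-limit` in the steps above can be modelled with a short sketch. It is a simplified illustration and not the code of fluent-logger-golang: the `boundedBuffer` type and the `fillUntilFull` helper are hypothetical names, and real log entries are encoded with a tag and timestamp, so they are larger than the raw payload used here.

```go
package main

import "fmt"

// boundedBuffer is a hypothetical, simplified stand-in for the pending
// buffer a log client keeps while its server is unreachable.
type boundedBuffer struct {
	pending []byte
	limit   int
}

// appendBuffer queues data, or returns an error if the limit would be exceeded.
func (b *boundedBuffer) appendBuffer(data []byte) error {
	if len(b.pending)+len(data) > b.limit {
		return fmt.Errorf("buffer full, limit %d", b.limit)
	}
	b.pending = append(b.pending, data...)
	return nil
}

// fillUntilFull appends msg repeatedly and reports how many copies were
// accepted before the buffer rejected one.
func fillUntilFull(b *boundedBuffer, msg []byte) (int, error) {
	if len(msg) == 0 {
		return 0, fmt.Errorf("empty message would never fill the buffer")
	}
	accepted := 0
	for {
		if err := b.appendBuffer(msg); err != nil {
			return accepted, err
		}
		accepted++
	}
}

func main() {
	b := &boundedBuffer{limit: 10 * 1024} // 10KB, as in the docker run command
	n, err := fillUntilFull(b, []byte("crashme\n"))
	fmt.Printf("accepted %d messages, then: %v\n", n, err)
}
```

With nothing draining the buffer, a container that writes as fast as `yes` does reaches the limit almost immediately, which matches how quickly the "Buffer full" errors appear in the daemon log.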
Describe the results you received:
The dockerd process crashed. The log shows a Go panic in the fluentd logger rather than a segfault:
....repeating lines
time="2017-04-12T17:01:47.079303891Z" level=error msg="Failed to log msg \"crashme\" for logger fluentd: fluent#appendBuffer: Buffer full, limit 10240"
time="2017-04-12T17:01:47.079324142Z" level=error msg="Failed to log msg \"crashme\" for logger fluentd: fluent#appendBuffer: Buffer full, limit 10240"
panic: fluent#reconnect: failed to reconnect!
goroutine 29272 [running]:
panic(0x1630320, 0xc421062010)
/usr/local/go/src/runtime/panic.go:500 +0x1a1
github.com/docker/docker/vendor/github.com/fluent/fluent-logger-golang/fluent.(*Fluent).reconnect(0xc42096a790)
/root/rpmbuild/BUILD/docker-ce/.gopath/src/github.com/docker/docker/vendor/github.com/fluent/fluent-logger-golang/fluent/fluent.go:276 +0xf8
created by github.com/docker/docker/vendor/github.com/fluent/fluent-logger-golang/fluent.(*Fluent).send
/root/rpmbuild/BUILD/docker-ce/.gopath/src/github.com/docker/docker/vendor/github.com/fluent/fluent-logger-golang/fluent/fluent.go:290 +0x136
Describe the results you expected:
The container should terminate, but dockerd should stay alive. Otherwise this looks like a very easy DoS: anyone who can run a container on a host can crash Docker on it.
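In Go, a panic that is not recovered in the goroutine where it occurs terminates the whole process, which is why a panic raised in a goroutine started by the vendored logger takes dockerd down with it. The sketch below shows two general ways to keep a process alive in that situation: return an error from the retry loop, or recover inside the goroutine. It is an illustration under assumptions, not the library's or Docker's actual fix; `reconnect`, `safeGo`, and `errReconnectFailed` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// errReconnectFailed is a hypothetical sentinel error for exhausted retries.
var errReconnectFailed = errors.New("failed to reconnect")

// reconnect calls dial up to maxRetries times and returns an error,
// instead of panicking, when every attempt fails.
func reconnect(dial func() error, maxRetries int) error {
	for attempt := 1; attempt <= maxRetries; attempt++ {
		if err := dial(); err == nil {
			return nil
		}
	}
	return errReconnectFailed
}

// safeGo runs fn in a new goroutine and turns a panic inside fn into an
// error on the returned channel, so the process is not terminated.
func safeGo(fn func()) <-chan error {
	done := make(chan error, 1)
	go func() {
		defer func() {
			if r := recover(); r != nil {
				done <- fmt.Errorf("recovered: %v", r)
				return
			}
			done <- nil
		}()
		fn()
	}()
	return done
}

func main() {
	down := func() error { return errors.New("connection refused") }
	fmt.Println(reconnect(down, 2))

	err := <-safeGo(func() { panic("fluent#reconnect: failed to reconnect!") })
	fmt.Println(err)
}
```

With either approach the caller receives an error it can act on, for example by stopping the affected container's logger, while the daemon keeps running.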
Additional information you deem important (e.g. issue happens only occasionally):
100% reproducible with the settings above.
Output of docker version:
Client:
Version: 17.03.1-ce
API version: 1.27
Go version: go1.7.5
Git commit: c6d412e
Built: Mon Mar 27 17:05:44 2017
OS/Arch: linux/amd64
Server:
Version: 17.03.1-ce
API version: 1.27 (minimum version 1.12)
Go version: go1.7.5
Git commit: c6d412e
Built: Mon Mar 27 17:05:44 2017
OS/Arch: linux/amd64
Experimental: false
Output of docker info:
Containers: 17
Running: 1
Paused: 0
Stopped: 16
Images: 3
Server Version: 17.03.1-ce
Storage Driver: overlay
Backing Filesystem: xfs
Supports d_type: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 3.10.0-514.6.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 14.69 GiB
ID: 5BXG:QZRZ:L2PI:QVJO:LTFG:JUPJ:ZABK:7LBA:D2G7:WR7K:EO65:SPSH
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Additional environment details (AWS, VirtualBox, physical, etc.):
GCE n1-standard-4 instance
About this issue
- State: closed
- Created 7 years ago
- Reactions: 1
- Comments: 16 (7 by maintainers)
Commits related to this issue
- vendor: github.com/fluent/fluent-logger-golang 1.6.1 Updates the fluent logger library. Namely this fixes a couple places where the library could panic when closing and writing to channels. see http... — committed to sparrc/moby by sparrc 3 years ago
- vendor: github.com/fluent/fluent-logger-golang 1.6.1 Updates the fluent logger library. Namely this fixes a couple places where the library could panic when closing and writing to channels. see http... — committed to PettitWesley/moby by sparrc 3 years ago
I’ve run into this on Docker 18+ and had to use these options to stop the docker daemon crashing:
Is there a better recommended solution so far?
Hi, I am currently facing this problem on a multi-tenant swarm cluster I’m operating. How has your thinking on this issue evolved? IMHO, the “no requested logging” => “no running software” approach cannot apply in my case: each tenant sends its logs to services external to the cluster, and I just cannot have the Docker daemon crash because of that. Until a satisfying approach is found, would it be possible to accept a value of -1 for the fluentd-max-retries parameter (which is compliant with fluent-logger-golang)? That would let the operator decide how the Docker daemon behaves in that case.
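The proposal above could be sketched as an option parser that treats -1 as "retry without a practical limit". This is a hypothetical illustration, not Docker's actual option validation: `parseMaxRetries` is an invented name, and mapping -1 to a very large retry count is an assumption about how "unlimited" might be represented.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// parseMaxRetries is a hypothetical validator for a max-retries option.
// It accepts non-negative integers, and -1 as a request for (effectively)
// unlimited retries; any other negative value is rejected.
func parseMaxRetries(value string) (int, error) {
	n, err := strconv.Atoi(value)
	if err != nil {
		return 0, fmt.Errorf("invalid max-retries %q: %v", value, err)
	}
	switch {
	case n == -1:
		// Assumption: "unlimited" is modelled as a very large retry count.
		return math.MaxInt32, nil
	case n < 0:
		return 0, fmt.Errorf("invalid max-retries %d: must be -1 or non-negative", n)
	}
	return n, nil
}

func main() {
	for _, v := range []string{"2", "-1", "-5", "abc"} {
		n, err := parseMaxRetries(v)
		fmt.Printf("%q -> %d, %v\n", v, n, err)
	}
}
```

A design like this leaves the trade-off to the operator: a finite value gives up after a bounded time, while -1 keeps retrying for as long as the log destination is unreachable.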