moby: Docker daemon is leaking file descriptors (with reproduction)
Description
Not sure this is related, but in production, after a few days or weeks, dockerd starts leaking so many file descriptors of dead connections that it simply stops responding at all (not even to SIGUSR1).
While trying to understand how this can happen (since when it happens I can't do anything but kill it), I tried to figure out how a connection ends up dead, and reached some conclusions on how to get dockerd to leak such file descriptors, although I'm not sure that what I'm about to describe is the same issue that happens in production.
What do those leaked file descriptors look like? On Linux I run:
ss | grep docker.sock
and I see tons of entries, most of which look like this:
u_str ESTAB 0 0 /var/run/docker.sock 272245 * 0
u_str ESTAB 0 0 /var/run/docker.sock 279780 * 0
u_str ESTAB 0 0 /var/run/docker.sock 279118 * 0
u_str ESTAB 0 0 /var/run/docker.sock 272201 * 0
u_str ESTAB 0 0 /var/run/docker.sock 272217 * 0
You can tell these are dead connections because the peer info (the * 0 at the end) points to nothing, which means the other side closed the unix domain socket but the docker daemon did not.
These sockets also take up file descriptors (as expected):
ls -l /proc/$(pidof dockerd)/fd
…
lrwx------ 1 root root 64 May 31 08:37 33 -> socket:[272217]
…
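If you want to watch the leak grow over time, here is a minimal sketch of mine (not part of the original report) that counts the daemon's open fds via /proc; it assumes pidof is available and that it runs as root:

import os
import subprocess

# Count dockerd's open fds and how many of them are sockets.
# Reading /proc/<pid>/fd of the daemon needs root.
pid = subprocess.check_output(['pidof', 'dockerd']).split()[0].decode()
fd_dir = '/proc/%s/fd' % pid
links = [os.readlink(os.path.join(fd_dir, fd)) for fd in os.listdir(fd_dir)]
sockets = [l for l in links if l.startswith('socket:')]
print('%d fds total, %d of them sockets' % (len(links), len(sockets)))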
An alive connection looks like this (the peer info is present):
u_str ESTAB 0 0 /var/run/docker.sock 280489 * 279381
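Based purely on the ss output format shown above (the last field is the peer inode, and 0 means no peer), here is a small sketch to list the dead connections programmatically:

import subprocess

# List docker.sock connections with no peer, relying only on the ss
# output format shown above (last field is the peer inode; 0 means
# the other side is gone).
out = subprocess.check_output(['ss', '-x']).decode()
dead = [line for line in out.splitlines()
        if '/var/run/docker.sock' in line and line.split()[-1] == '0']
print('%d dead docker.sock connections' % len(dead))
for line in dead:
    print(line)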
I'm now going to show you a convoluted way to cause this resource leak, but keep in mind that dockerd still shouldn't leak resources, and that in the wild there might be similar code that really needs to do something like this…
This also seems to happen on a variety of Docker versions, from old to new.
Steps to reproduce the issue:
- docker create -it --rm python:2.7 python -c "while True: print 1;"
  (prints <container id>)
- docker start <container id>
  (echoes <container id> back)
- python2 (run the following in the interactive interpreter):
import socket
# connect to the docker daemon's unix socket directly
s = socket.socket(socket.AF_UNIX)
s.connect('/var/run/docker.sock')
# issue an attach request by hand
s.send('POST /v1.24/containers/<container id>/attach?logs=1&stream=0&stdout=1 HTTP/1.1\r\nHost: localhost\r\n\r\n')
# read part of the response, then close our side of the connection
s.recv(1024)
s.close()
You can also try replacing logs=1&stream=0 with logs=0&stream=1. I don't think the v1.24 is of particular importance. You can also kill the python process instead of calling s.close() (sometimes it behaves differently).
- ss | grep docker.sock
  u_str ESTAB 0 0 /var/run/docker.sock 280489 * 0
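For convenience, here is a rough script that automates the steps above end to end; it assumes the docker CLI, ss, and root access to /var/run/docker.sock, and uses the same attach request as the step above:

import socket
import subprocess

# 1. create and start a container that prints forever (same as above)
cid = subprocess.check_output(
    ['docker', 'create', '-it', '--rm', 'python:2.7',
     'python', '-c', 'while True: print 1']).strip().decode()
subprocess.check_call(['docker', 'start', cid])

# 2. attach over the unix socket by hand, read a bit, then hang up
s = socket.socket(socket.AF_UNIX)
s.connect('/var/run/docker.sock')
req = ('POST /v1.24/containers/%s/attach?logs=1&stream=0&stdout=1 HTTP/1.1\r\n'
       'Host: localhost\r\n\r\n' % cid)
s.send(req.encode())
s.recv(1024)
s.close()

# 3. look for a docker.sock entry with no peer ("* 0")
# (note: grep exits non-zero if nothing matches)
print(subprocess.check_output('ss -x | grep docker.sock', shell=True).decode())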
Describe the results you received:
There is a dead unclosed unix domain socket on the docker daemon side.
Describe the results you expected:
The docker daemon should notice that the socket has been closed on the other side and close its own side as well.
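For comparison, this is the usual way a unix-socket server notices a half-closed peer: once the client closes, recv() returns an empty string on the server side, and that is the cue to close the server's fd too. A minimal standalone sketch (hypothetical /tmp path, not Docker code):

import os
import socket

PATH = '/tmp/leak-demo.sock'   # hypothetical path, just for the demo

if os.path.exists(PATH):
    os.unlink(PATH)

srv = socket.socket(socket.AF_UNIX)
srv.bind(PATH)
srv.listen(1)

conn, _ = srv.accept()
while True:
    data = conn.recv(4096)
    if not data:       # peer closed its end: recv() returns ''
        conn.close()   # ...so close ours too, otherwise the fd leaks
        break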
Versions
Tried on multiple versions; currently both client and server are 17.12.1-ce, on Ubuntu 16.04.4 with kernel 4.13.0-36-generic and default settings.
About this issue
- Original URL
- State: open
- Created 6 years ago
- Comments: 38 (23 by maintainers)
dockerd version: 18.09.1
containerd version: 9754871865f7fe2f4e74d43e2fc7ccd237edcbce
Found fds leaking on a running server; the stack dump shows all goroutines waiting for containers to stop, yet most of the containers being waited on are already stopped.
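For anyone trying to collect the same evidence: sending SIGUSR1 asks dockerd to dump its goroutine stacks (as mentioned at the top of this issue); check the daemon log for where the dump ends up. A trivial sketch, assuming pidof and root:

import os
import signal
import subprocess

# Ask dockerd to dump its goroutine stacks. A wedged daemon may not
# react at all, as described at the top of this issue.
pid = int(subprocess.check_output(['pidof', 'dockerd']).split()[0])
os.kill(pid, signal.SIGUSR1)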
Server Version: 19.03.6
containerd version: b34a5c8af56e510852c35414db4c1f4fa6172339
runc version: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
init version: fec3683
After reproducing, the daemon hangs when trying to force-remove the container or restart the docker service.
Only fixed by
As the stack shows, a deadlock has occurred.
Two newly created goroutines are trying to lock (0xc000dc9000).
This goroutine already existed before the reproduction; the address (0xc000dc9000) is used in pool.Copy.
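Not the actual moby code, just a minimal Python analogue of the pattern I read from those stacks: one worker blocks forever on a copy while holding the lock, so everything that later needs the same lock (force remove, restart) piles up behind it:

import threading
import time

lock = threading.Lock()            # stands in for the mutex at 0xc000dc9000

def holder():
    with lock:                     # held by the pre-existing goroutine
        time.sleep(3600)           # stands in for a Copy that never returns

def waiter(name):
    with lock:                     # the two new goroutines trying to lock
        print('%s got the lock' % name)

threading.Thread(target=holder).start()
time.sleep(0.1)
for name in ('force remove', 'restart'):
    threading.Thread(target=waiter, args=(name,)).start()
# the waiters never print anything: the script hangs, much like the daemon does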
@dannyk81 it looks like https://github.com/moby/moby/issues/37182#issuecomment-509681877 covers most of what I was about to say; hope it helps, and please feel free to follow up with any questions. That repo's "overlay-runner" does not suffer from this defect on any version of Docker.