docker-py: docker-py running container blocks access to "docker stats"
I am using docker-py to run a short-running (1m-2h) container in the background and I’d like to monitor that container’s stats to record and characterize its I/O, CPU, and memory usage every second or two. But it appears that while containers are running under docker-py I can’t get stats, either from my code through docker-py (c.stats()) or the command line docker stats. Streaming or no streaming, both seem to just hang until they time out, or the container just happens to finish and unblock docker.
I should note that I can call other functions on the docker-py Client, such as containers(). Just that the stats() call with no streaming or attempting to iterate over the generator with streaming just hangs until the container exits. Similar story with the command line. docker ps works, docker stats hangs.
I’m using Ubuntu 14.04, docker-py version 1.9.0, and docker version 1.12.1, plain vanilla from the repos, connecting over a local domain socket as configured by default (/var/run/docker.sock).
About this issue
- State: closed
- Created 8 years ago
- Reactions: 1
- Comments: 19 (7 by maintainers)
So this is actually fixed in upstream: https://github.com/docker/docker/pull/25905
The problem is that the container has networking disabled (SandboxID is set to "") and the stats collector fails to pick up networking stats (obviously) and won’t publish any stats to the channel, meaning the server will never respond with any stats.
To work around this issue: enable networking. To fix this issue: wait for upstream to release a new version of docker (> 1.12.1).
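For illustration, the "enable networking" workaround could look something like the sketch below. It assumes a docker-py 1.x style client whose create_container() accepts a network_disabled keyword and returns a dict with an "Id" key; the helper name run_with_networking is made up for this example and is not part of docker-py.

```python
def run_with_networking(client, image, command):
    """Create and start a container with networking explicitly left enabled.

    `client` is assumed to behave like a docker-py 1.x docker.Client.
    The function name is hypothetical, chosen only for this sketch.
    """
    container = client.create_container(
        image=image,
        command=command,
        # The workaround from this thread: per the comment above, stats are
        # never published on docker 1.12.1 when networking is disabled.
        network_disabled=False,
    )
    client.start(container=container["Id"])
    return container["Id"]
```

With a real daemon this would presumably be called as run_with_networking(docker.Client(base_url='unix://var/run/docker.sock'), 'busybox', 'sleep 60'), where the image and command are placeholders.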
Wait! {'network_disabled' : False} did fix my problem! But it was masked by another important problem that might be related to the source of the bug. In my code I iterate until the container disappears from the list of running containers.
With {'network_disabled' : False} that loops correctly until the container is no longer running. Then, and I suspect there’s a race condition here, what should be the final call to stats() hangs. My solution is to set the client timeout to something smaller and less annoying, like 15 seconds, and catch the timeout on that final call.
@michaelbarton: yeah, definitely in my code there’s a race condition but I’d expect a graceful failure (i.e. stats of 0). I was thinking that I was actually racing against a bug in docker where trying to get stats of a stopped container hung. Which according to @TomasTomecek might be the case!
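The original snippets did not survive in this copy of the thread, so the following is only a rough reconstruction of the polling approach described above: loop while the container is listed by containers(), take one non-streaming stats() sample per pass, and treat a timeout as the sign that the container exited between the check and the call. The names is_running and poll_stats, and the timeout_errors and sleep parameters, are assumptions made for this sketch.

```python
import time


def is_running(client, container_id):
    """Return True if container_id is in the client's list of running containers."""
    return any(c["Id"] == container_id for c in client.containers())


def poll_stats(client, container_id, interval=1.0,
               timeout_errors=(Exception,), sleep=time.sleep):
    """Collect one stats sample per interval until the container stops.

    timeout_errors should be narrowed to whatever your client raises on a
    read timeout; the broad default is only a placeholder.
    """
    samples = []
    while is_running(client, container_id):
        try:
            samples.append(client.stats(container_id, stream=False))
        except timeout_errors:
            # Likely the race from this thread: the container exited after
            # the running check, and the final stats() call hung until the
            # client timeout fired.
            break
        sleep(interval)
    return samples
```

To keep the final hang short, the client would be built with a small timeout, for example docker.Client(base_url='unix://var/run/docker.sock', timeout=15). The exception to pass as timeout_errors is probably requests.exceptions.ReadTimeout, but that is an assumption worth checking against your docker-py version.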
For anyone else tripping over this while we wait for docker to get fixed, I’m using the generator, streaming version of stats since it’s a bit cleaner (still have to catch the final timeout).
I think there is a chance of a race condition between checking whether a container is still running and then subsequently trying to get the Docker stats for it. An alternative might be to break if the container is no longer running, and just pass in the except block. Otherwise, I think, there is a chance that there really is a network timeout error and you stop collecting metrics from the container while it is still running.
I mention this because I spent all of yesterday struggling with the same problem. https://github.com/bioboxes/bioboxes-py/blob/master/biobox/cgroup.py
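One way to combine the streaming generator with the suggestion above (only stop when the container is really gone, and otherwise ignore the timeout) is sketched below. This is not the code from the thread or from the linked cgroup.py. The function name stream_stats, the handle callback, and the timeout_errors parameter are all hypothetical, and the client is assumed to expose stats(container, decode=True, stream=True) and containers() as in docker-py 1.x.

```python
def stream_stats(client, container_id, handle, timeout_errors=(Exception,)):
    """Pass each streamed stats sample to handle() until the container stops.

    On a timeout, stop only if the container is no longer listed as running;
    otherwise treat it as a transient error and reopen the stream.
    """
    while True:
        try:
            for sample in client.stats(container_id, decode=True, stream=True):
                handle(sample)
            return  # the daemon closed the stream on its own
        except timeout_errors:
            still_running = any(
                c["Id"] == container_id for c in client.containers()
            )
            if not still_running:
                return
            # Container is still up, so this was a genuine timeout:
            # fall through and open a new stream.
```

A call such as stream_stats(client, container_id, samples.append, timeout_errors=(ReadTimeout, ConnectionError)) would collect samples into a list. Which requests exception a streaming read timeout surfaces as is an assumption here, so it is worth confirming before narrowing the tuple.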