moby: docker container won't stop
I cannot force a container to stop:
$ sudo docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e732663decf0 sif-gpu "./train.sh" About an hour ago Up About an hour gifted_brown
3c2d52a9dbae aa51bb346558 "./train.sh" About an hour ago Up About an hour romantic_haibt
9263207753ef sockeye-gpu "./train.sh" 2 hours ago Up 2 hours jovial_northcutt
994875c39514 sockeye-gpu "train.sh" 16 hours ago Created wizardly_borg
fbbad3f7140c 3ae77fec5f41 "train.sh" 16 hours ago Created relaxed_darwin
$ sudo docker stop 3c2d52a9dbae
3c2d52a9dbae
$ sudo docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e732663decf0 sif-gpu "./train.sh" About an hour ago Up About an hour gifted_brown
3c2d52a9dbae aa51bb346558 "./train.sh" About an hour ago Up About an hour romantic_haibt
9263207753ef sockeye-gpu "./train.sh" 2 hours ago Up 2 hours jovial_northcutt
994875c39514 sockeye-gpu "train.sh" 16 hours ago Created wizardly_borg
fbbad3f7140c 3ae77fec5f41 "train.sh" 16 hours ago Created relaxed_darwin
and
$ sudo docker rm 3c2d52a9dbae
Error response from daemon: You cannot remove a running container 3c2d52a9dbaefe37233d0ac411955f7f37ccc5ee16843e34dd42074c98417441. Stop the container before attempting removal or use -f
and the image cannot be removed either, because it is still in use by the hanging container:
$ sudo docker rmi aa51bb346558
Error response from daemon: conflict: unable to delete aa51bb346558 (cannot be forced) - image is being used by running container 3c2d52a9dbae
After several attempts I get a `device or resource busy` error.
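For reference, the usual escalation path when `docker stop` has no effect looks like the sketch below (the container ID is the hung one from the report; each step is allowed to fail, since in this issue the daemon has lost track of the process, which is exactly what the rest of the thread investigates):

```shell
# Hypothetical: the hung container from the report above.
CID=3c2d52a9dbae

# Only attempt this where a docker CLI is actually available.
if command -v docker >/dev/null 2>&1; then
    docker stop "$CID" || true   # polite: SIGTERM, then SIGKILL after the grace period
    docker kill "$CID" || true   # direct SIGKILL to the container's init process
    docker rm -f "$CID" || true  # force-remove, as the daemon error message suggests
fi
```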
About this issue
- State: closed
- Created 7 years ago
- Comments: 43 (14 by maintainers)
The more recent reports of this are new issues related to #35933. I’m going to close this as it’s stale; the newer occurrences are tracked in the mentioned issue.
Thanks! 👼 🙇
Experiencing the same issue on Arch Linux; `sudo systemctl stop docker` does not kill the `docker-containerd-shim` process.

Hard to tell; I don’t think there has been a way to reproduce the issue, so without that, it’s not possible to say if things changed (or not). Docker 17.11 and 17.12 now use the containerd 1.0 runtime, with lots of enhancements, so if you have a reproducible case (and an environment for testing), it could be worth checking whether it still reproduces for you on 17.12.
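One blunt workaround for the leftover `docker-containerd-shim` process mentioned above is to find and kill it by hand. A hedged sketch (the process name is taken from this report; newer releases name the shim `containerd-shim` or `containerd-shim-runc-v2`):

```shell
# Process name as reported in this thread; adjust for your docker version.
SHIM_NAME=docker-containerd-shim

if command -v pgrep >/dev/null 2>&1; then
    # List candidate shim processes before killing anything.
    pgrep -f "$SHIM_NAME" || true
    # Last resort: uncomment to SIGKILL the stranded shims.
    # pkill -9 -f "$SHIM_NAME"
fi
```

Killing a shim can leave mounts and network namespaces behind, so a daemon restart afterwards is usually still needed.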
Same issue as @KramKroc. When a process inside my container (not even the entry point) is killed by the kernel (`memory cgroup out of memory`, which is what I was trying to test), the daemon responds with that error when trying to stop the container.

EDIT: Hmm, my entry point was also killed, so the “not found” makes sense actually. I don’t know why it was killed, however, or why it was not trying to restart.
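Whether the kernel OOM killer was involved is recorded by the daemon and can be checked directly from the container state; a sketch (the container ID is hypothetical, substitute the affected one):

```shell
# Hypothetical container ID: substitute the container you were testing.
CID=3c2d52a9dbae

if command -v docker >/dev/null 2>&1; then
    # .State.OOMKilled is true when the memory-cgroup OOM killer terminated
    # the container's init process; the exit code is reported alongside it.
    docker inspect --format '{{.State.OOMKilled}} exit={{.State.ExitCode}}' "$CID" || true
fi
```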
We’re seeing the same issue as well with 17.05-ce:
Chiming in here to say I’m currently running into this exact same issue. When I `strace` the stranded container process, it’s stuck in a loop that looks like the following:

@loretoparisi no, it can be enabled/disabled without restarting the daemon, e.g. (assuming you don’t have a `daemon.json` yet;

@loretoparisi if you want to use tini for a container, it’s included with docker; if you start a container with `--init`, then tini is automatically inserted in the container. Start two containers, one with and one without the `--init` option, and check the output of `docker container top` for each container. You can also set the `"init": true` option in the daemon configuration file, in which case it will be added by default for every container that’s started; see the daemon configuration file documentation.

I run the container again, and dockerd rises up to 200% CPU again.
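The two-container `--init` experiment described above, as a runnable sketch (the image and container names are placeholders; each step tolerates failure so it can run unattended):

```shell
# Placeholder image for the experiment; any small image with a long-running
# command will do.
IMAGE=alpine

if command -v docker >/dev/null 2>&1; then
    # Without --init: the command itself is PID 1 inside the container.
    docker run -d --name no-init "$IMAGE" sleep 300 || true
    # With --init: docker inserts tini (mounted as /sbin/docker-init) as PID 1,
    # which reaps zombies and forwards signals to the command.
    docker run -d --init --name with-init "$IMAGE" sleep 300 || true

    # Compare: the second listing shows the extra docker-init process.
    docker container top no-init || true
    docker container top with-init || true

    # Clean up the experiment.
    docker rm -f no-init with-init || true
fi
```

To get the same behavior daemon-wide, put `{"init": true}` in `/etc/docker/daemon.json` and reload the daemon.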
I called `docker stats` at 16:44:31 and started seeing errors in the docker log immediately; it didn’t manage to display any stats.
Unfortunately I had to stop and start the docker service because I couldn’t afford to have that computer locked, I didn’t check the status of containerd. I’ll keep an eye and if it happens again I’ll update this.
Some additional info on the issue we spotted and a possible pointer to the root cause (well, for our system anyway). We’re running a number of containers, the majority of which run a java process. I was checking the message log and occasionally spotted an entry like this:
I checked with ps to see if that process was running anymore:
I then did a check on the PIDs from a docker perspective:
If you do a `docker ps`, you see all processes listed as running, with an uptime of hours. When I try to restart one of the docker containers associated with the killed process, it appears to work until you view `docker ps`, where it shows as running, again with an uptime of hours.
More worryingly, when you try to restart a process that actually is running, it too doesn’t show any downtime and shows itself as running, but in fact the process has been killed:
And the message log shows something like the following:
We had to restart the docker daemon, which killed and restarted the docker container processes too.
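The mismatch described above (docker reports “Up” while the kernel says the process is gone) can be checked per container by comparing the PID the daemon recorded against the host process table; a sketch with a hypothetical container ID:

```shell
# Hypothetical: one of the containers that docker ps lists as "Up".
CID=3c2d52a9dbae

if command -v docker >/dev/null 2>&1; then
    # The PID the daemon recorded for the container's init process.
    PID=$(docker inspect --format '{{.State.Pid}}' "$CID" 2>/dev/null || true)
    # If docker believes the container is running, this PID must exist on the host.
    if [ -n "$PID" ] && [ "$PID" != "0" ]; then
        ps -p "$PID" -o pid,etime,comm || echo "PID $PID recorded by docker no longer exists"
    fi
fi
```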
We are experiencing probably the same problem. After stopping a bunch of containers at the same time, we noticed a week later that on some hosts there were containers still “running”.
Other symptoms are similar:
- `docker stop` does not return an error (however, it writes “container not found” in the log),
- `docker top` says “container not found”,
- `docker inspect` states that the container is still running, however the process ID does not exist.

After the container is stopped:
You could also do a `sudo docker inspect 3c2d52a9dbae` to get the current state of the container. That may shed more light on whether the process is actually running.
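Concretely, the state section alone can be dumped with a format string, which keeps the output short; a sketch using the container ID from this thread:

```shell
# Format string that prints only the .State object of docker inspect:
# Status, Running, Pid, OOMKilled, StartedAt, FinishedAt, etc.
FMT='{{json .State}}'

if command -v docker >/dev/null 2>&1; then
    docker inspect --format "$FMT" 3c2d52a9dbae || true
fi
```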