colima: v0.6.x: "API error (500): Could not kill running container" "container 8d67c83751e6 PID 19180 is zombie and can not be killed"

Description

With DDEV, I never saw this in previous versions of Colima, but I have now seen it myself and others have reported it. On ddev stop:

API error (500): Could not kill running container 8d67c83751e67ee98571cb7c2b64794c30419ef9c9496791d34876dbd040dda3, cannot remove - container 8d67c83751e6 PID 19180 is zombie and can not be killed. Use the --init option when creating containers to run an init inside the container that forwards signals and reaps processes
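The error text itself points at docker's --init flag, which runs a small init process as PID 1 inside the container to forward signals and reap child processes. A minimal sketch of starting a container that way follows. The function name, the container name, the image, and the DOCKER override variable are illustrative assumptions, and nothing in this thread establishes that --init avoids the failure reported here.

```shell
# Assumption: DOCKER can be overridden (e.g. with a stub) for dry runs.
DOCKER="${DOCKER:-docker}"

# Start a detached container with an init process as PID 1.
# $1 = container name (hypothetical), $2 = image (hypothetical)
run_with_init() {
  "$DOCKER" run -d --init --name "$1" "$2" sleep 3600
}
```

For example, run_with_init demo busybox starts a busybox container named demo whose sleep process runs under the init.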

Version

Colima Version: 0.6.1
Lima Version: 0.18.0
Qemu Version:

Operating System

  • macOS Intel <= 12 (Monterey)
  • macOS Intel >= 13 (Ventura)
  • macOS M1 <= 12 (Monterey)
  • macOS M1 >= 13 (Ventura)
  • Linux

Output of colima status

INFO[0000] colima is running using QEMU
INFO[0000] arch: aarch64
INFO[0000] runtime: docker
INFO[0000] mountType: sshfs
INFO[0000] socket: unix:///Users/rfay/.colima/default/docker.sock

Reproduction Steps

It doesn’t happen every time, but it does happen periodically on ddev stop, which is roughly equivalent to docker stop followed by docker rm.
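Because the failure is intermittent, one way to look for it is to repeat the stop-then-remove sequence many times. The following is only a rough sketch of that idea, not the breakit.sh script mentioned later in this thread and not what ddev stop actually runs. The function names, the container name zombie-test, the image argument, and the DOCKER override variable are all assumptions made for illustration.

```shell
# Assumption: DOCKER can be overridden (e.g. with a stub) for dry runs.
DOCKER="${DOCKER:-docker}"

# Stop a container, then remove it; fails if either step fails.
stop_and_remove() {
  "$DOCKER" stop "$1" >/dev/null && "$DOCKER" rm "$1" >/dev/null
}

# Start a throwaway container, then stop and remove it, $1 times.
# $2 = image to run (hypothetical). Reports the first failing iteration.
cycle() {
  count="$1"
  image="$2"
  i=1
  while [ "$i" -le "$count" ]; do
    "$DOCKER" run -d --name zombie-test "$image" sleep 3600 >/dev/null || return 1
    if ! stop_and_remove zombie-test; then
      echo "failed on iteration $i"
      return 1
    fi
    i=$((i + 1))
  done
  echo "completed $count iterations"
}
```

Calling cycle 100 busybox would attempt 100 start/stop/remove rounds. Whether a plain sleep container triggers the zombie error at all is untested here.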

Expected behaviour

I never saw this in previous versions of Colima.

Additional context

No response

About this issue

  • State: closed
  • Created 8 months ago
  • Comments: 23 (19 by maintainers)

Most upvoted comments

I ran colima HEAD through 110 iterations of the breakit.sh script without failure. It was a new colima profile, which makes the test perhaps a little questionable, but it sure looked good.

Running the full DDEV test suite now in https://github.com/ddev/ddev/pull/5549, which wasn’t able to succeed on v0.6.*, so fingers crossed now!

Also no freezing on my end, and yes, the message was displayed. Even though it was called a zombie, it didn’t behave like one (except that the container was still listed by docker ps -a). Here is the terminal output from yesterday. First, the last few lines of ddev stop:

2023-11-13T01:42:16.117 Paused Mutagen sync session 'tests' 
 Container ddev-tests-web  Stopped 
 Container ddev-tests-db  Stopped 
 Container ddev-tests-web  Stopped 
 Container ddev-tests-db  Stopped 
 Container ddev-tests-db  Removed 
 Container ddev-tests-web  Removed 
Failed to stop project tests: 
API error (500): Could not kill running container 104d003aba7d2319b995d504276301a76125c6242a82e053c46b67bdc8daa950, cannot remove - container 104d003aba7d PID 20554 is zombie and can not be killed. Use the --init option when creating containers to run an init inside the container that forwards signals and reaps processes 

After the error occurred, I ran:

$> docker ps -a
CONTAINER ID   IMAGE                               COMMAND                  CREATED          STATUS                            PORTS     NAMES
104d003aba7d   ddev/ddev-traefik-router:v1.22.4    "/entrypoint.sh --co…"   11 minutes ago   Exited (137) About a minute ago             ddev-router
188539ad3944   ddev/ddev-ssh-agent:v1.22.4-built   "/entry.sh ssh-agent"    12 minutes ago   Up 12 minutes (healthy)                     ddev-ssh-agent

I then started the same project again, and after the successful start I ran it once more:

$> docker ps -a
CONTAINER ID   IMAGE                                                 COMMAND                  CREATED              STATUS                        PORTS                                                                                                          NAMES
ddc7e1101c11   ddev/ddev-traefik-router:v1.22.4                      "/entrypoint.sh --co…"   About a minute ago   Up About a minute (healthy)   127.0.0.1:80->80/tcp, 127.0.0.1:443->443/tcp, 127.0.0.1:8025-8026->8025-8026/tcp, 127.0.0.1:10999->10999/tcp   ddev-router
971ef26ae07f   ddev/ddev-dbserver-mariadb-10.5:v1.22.4-tests-built   "/docker-entrypoint.…"   2 minutes ago        Up 2 minutes (healthy)        127.0.0.1:32789->3306/tcp                                                                                      ddev-tests-db
929e4f3722ec   ddev/ddev-webserver:20231108_new_lagoon-tests-built   "/pre-start.sh"          2 minutes ago        Up 2 minutes (healthy)        8025/tcp, 127.0.0.1:32788->80/tcp, 127.0.0.1:32787->443/tcp                                                    ddev-tests-web
188539ad3944   ddev/ddev-ssh-agent:v1.22.4-built                     "/entry.sh ssh-agent"    25 minutes ago       Up 25 minutes (healthy)                                                                                                                      ddev-ssh-agent

The exited container had been removed, and a new ddev-traefik-router container was running in its place.
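The check above can be narrowed to a single container instead of reading the whole docker ps -a table. A minimal sketch, assuming the container name is known (ddev-router in the output above); the function name and the DOCKER override variable are illustrative assumptions.

```shell
# Assumption: DOCKER can be overridden (e.g. with a stub) for dry runs.
DOCKER="${DOCKER:-docker}"

# Print the ID and status of any container, running or exited,
# whose name matches $1. Prints nothing if no such container exists.
container_status() {
  "$DOCKER" ps -a --filter "name=$1" --format '{{.ID}} {{.Status}}'
}
```

Running container_status ddev-router right after the failed stop, and again after the next start, would show whether the exited container is still present and whether its ID has changed.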

Thanks for reporting this.

Does it freeze, or does it only display the message before stopping the container?