podman: Rootless 'podman rm --force' fails with 'given PIDs did not die within timeout'
/kind bug
Description
Sometimes podman rm --force <container> fails to remove a running container that previously had an active exec session which has since ended. After the first failure, the container is marked as Exited, but subsequent podman rm --force attempts keep failing.
$ podman ps --all
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
025abd4217ba registry.fedoraproject.org/f30/fedora-toolbox:30 toolbox --verbose... 27 minutes ago Exited (143) 5 minutes ago fedora-toolbox-30
$ podman --log-level debug rm --force fedora-toolbox-30
INFO[0000] running as rootless
DEBU[0000] using conmon: "/usr/libexec/podman/conmon"
DEBU[0000] Initializing boltdb state at /var/home/rishi/.local/share/containers/storage/libpod/bolt_state.db
DEBU[0000] Using graph driver overlay
DEBU[0000] Using graph root /var/home/rishi/.local/share/containers/storage
DEBU[0000] Using run root /tmp/1000
DEBU[0000] Using static dir /var/home/rishi/.local/share/containers/storage/libpod
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp
DEBU[0000] Using volume path /var/home/rishi/.local/share/containers/storage/volumes
DEBU[0000] Set libpod namespace to ""
DEBU[0000] [graphdriver] trying provided driver "overlay"
DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs
DEBU[0000] backingFs=extfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false
DEBU[0000] Initializing event backend journald
DEBU[0000] using runtime "/usr/bin/runc"
DEBU[0000] Setting maximum rm workers to 16
DEBU[0000] Killing all processes in container 025abd4217ba14e95ceea799cee9e29a0d446b76d08765b724371c3cc3ed67d7 with SIGTERM
WARN[0000] no such directory for freezer.state
WARN[0000] no such directory for freezer.state
WARN[0010] Timed out stopping container 025abd4217ba14e95ceea799cee9e29a0d446b76d08765b724371c3cc3ed67d7 exec sessions
DEBU[0010] Killing all processes in container 025abd4217ba14e95ceea799cee9e29a0d446b76d08765b724371c3cc3ed67d7 with SIGKILL
WARN[0000] no such directory for freezer.state
WARN[0000] no such directory for freezer.state
DEBU[0015] Failed to remove container 025abd4217ba14e95ceea799cee9e29a0d446b76d08765b724371c3cc3ed67d7: failed to kill container 025abd4217ba14e95ceea799cee9e29a0d446b76d08765b724371c3cc3ed67d7 exec sessions: given PIDs did not die within timeout
DEBU[0015] Worker#0 finished job [(*LocalRuntime) RemoveContainers func1]/025abd4217ba14e95ceea799cee9e29a0d446b76d08765b724371c3cc3ed67d7 (failed to kill container 025abd4217ba14e95ceea799cee9e29a0d446b76d08765b724371c3cc3ed67d7 exec sessions: given PIDs did not die within timeout)
DEBU[0015] Pool[rm, 025abd4217ba14e95ceea799cee9e29a0d446b76d08765b724371c3cc3ed67d7: failed to kill container 025abd4217ba14e95ceea799cee9e29a0d446b76d08765b724371c3cc3ed67d7 exec sessions: given PIDs did not die within timeout]
ERRO[0015] failed to kill container 025abd4217ba14e95ceea799cee9e29a0d446b76d08765b724371c3cc3ed67d7 exec sessions: given PIDs did not die within timeout
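The log above suggests a two-stage stop: SIGTERM, a wait of about 10 seconds, then SIGKILL and a further wait of about 5 seconds before giving up. The following is a minimal Python sketch of that pattern, not podman's actual implementation. The function names (pid_alive, wait_for_exit, stop_pids) are hypothetical, and the 10 s and 5 s defaults are only read off the log timestamps.

```python
import os
import signal
import time


def pid_alive(pid):
    """Return True if a process with this PID still exists."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def wait_for_exit(pids, timeout, interval=0.05):
    """Poll until none of the PIDs exist; return False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not any(pid_alive(p) for p in pids):
            return True
        time.sleep(interval)
    return not any(pid_alive(p) for p in pids)


def stop_pids(pids, term_timeout=10.0, kill_timeout=5.0):
    """Send SIGTERM, wait, then escalate to SIGKILL (timeouts are assumptions)."""
    stages = ((signal.SIGTERM, term_timeout), (signal.SIGKILL, kill_timeout))
    for sig, timeout in stages:
        for pid in pids:
            try:
                os.kill(pid, sig)
            except ProcessLookupError:
                pass  # already gone
        if wait_for_exit(pids, timeout):
            return
    raise TimeoutError("given PIDs did not die within timeout")
```

One caveat of this sketch: os.kill(pid, 0) also succeeds for a process that has exited but has not yet been reaped by its parent, so such a process would still count as alive here. Whether that is related to the failure reported in this issue is not established by the log.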
Additional information you deem important (e.g. issue happens only occasionally):
This doesn’t happen reliably, only every once in a while, and I believe I only started seeing it with podman-1.5.0.
Output of podman version:
Version: 1.5.0
RemoteAPI Version: 1
Go Version: go1.12.7
OS/Arch: linux/amd64
Output of podman info --debug:
debug:
  compiler: gc
  git commit: ""
  go version: go1.12.7
  podman version: 1.5.0
host:
  BuildahVersion: 1.10.1
  Conmon:
    package: podman-1.5.0-2.fc30.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.0, commit: 7e8f10c28723d67281b1dd11d5dac8edf29ca3d0-dirty'
  Distribution:
    distribution: fedora
    version: "30"
  MemFree: 9611743232
  MemTotal: 16530231296
  OCIRuntime:
    package: runc-1.0.0-93.dev.gitb9b6cc6.fc30.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc8+dev
      commit: e3b4c1108f7d1bf0d09ab612ea09927d9b59b4e3
      spec: 1.0.1-dev
  SwapFree: 8414818304
  SwapTotal: 8414818304
  arch: amd64
  cpus: 4
  eventlogger: journald
  hostname: bollard
  kernel: 5.2.7-200.fc30.x86_64
  os: linux
  rootless: true
  uptime: 36m 35.57s
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  ConfigFile: /var/home/rishi/.config/containers/storage.conf
  ContainerStore:
    number: 1
  GraphDriverName: overlay
  GraphOptions:
  - overlay.mount_program=/usr/bin/fuse-overlayfs
  GraphRoot: /var/home/rishi/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 2
  RunRoot: /tmp/1000
  VolumePath: /var/home/rishi/.local/share/containers/storage/volumes
Additional environment details (AWS, VirtualBox, physical, etc.):
I have only seen this happen on Fedora 30 hosts, possibly because I have only tried podman-1.5.0 on Fedora 30.
About this issue
- State: closed
- Created 5 years ago
- Comments: 24 (17 by maintainers)
I can pick this one up along with https://github.com/containers/libpod/issues/5014. I believe I have a fix.