podman: `podman stop` returns before removing a container started with `--rm`

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

After podman stop returns after removing a container, there are still processes running in the background responsible for removing the container if it was started with --rm. In combination with systemd, this can cause iptables NAT rules to be left behind. This is especially bad when ports are published/forwarded (not demonstrated below), which causes DNAT rules to be added. When these are left behind, the host port is then unusable for future containers because the original rule takes precedence.

Steps to reproduce the issue:

# iptables -t nat -S >clean.iptables
# cat >/etc/systemd/system/nattest.service
[Unit]
Description=Podman NAT Test

[Service]
ExecStart=podman run --rm --name nattest busybox nc -l
ExecStop=podman stop nattest
# systemctl start nattest
# diff clean.iptbales <(iptables -t nat -S) | grep '^[<>]'
> -N CNI-f891eda7ae68318dc3584dd8
> -A POSTROUTING -s 10.88.0.70/32 -m comment --comment "name: \"podman\" id: \"86cc7521c68b20a7c9f97dd4cd176c41e13008f986736be8056aca8cbb28dab8\"" -j CNI-f891eda7ae68318dc3584dd8
> -A CNI-f891eda7ae68318dc3584dd8 -d 10.88.0.0/16 -m comment --comment "name: \"podman\" id: \"86cc7521c68b20a7c9f97dd4cd176c41e13008f986736be8056aca8cbb28dab8\"" -j ACCEPT
> -A CNI-f891eda7ae68318dc3584dd8 ! -d 224.0.0.0/4 -m comment --comment "name: \"podman\" id: \"86cc7521c68b20a7c9f97dd4cd176c41e13008f986736be8056aca8cbb28dab8\"" -j MASQUERADE
# systemctl stop nattest
# diff clean.iptbales <(iptables -t nat -S) | grep '^[<>]'
> -N CNI-3b9afa83adb3440af4a15557
> -A POSTROUTING -s 10.88.0.71/32 -m comment --comment "name: \"podman\" id: \"74ff59b9b635231ebd12c12ae0f318e83166a6de8b653ad5e965dd7bdec7e72e\"" -j CNI-3b9afa83adb3440af4a15557
> -A CNI-3b9afa83adb3440af4a15557 -d 10.88.0.0/16 -m comment --comment "name: \"podman\" id: \"74ff59b9b635231ebd12c12ae0f318e83166a6de8b653ad5e965dd7bdec7e72e\"" -j ACCEPT
> -A CNI-3b9afa83adb3440af4a15557 ! -d 224.0.0.0/4 -m comment --comment "name: \"podman\" id: \"74ff59b9b635231ebd12c12ae0f318e83166a6de8b653ad5e965dd7bdec7e72e\"" -j MASQUERADE
# journalctl -u nattest
Sep 01 06:41:42 systemd[1]: Started Podman NAT Test.
Sep 01 06:41:43 podman[50324]: 2021-09-01 06:41:43.027964497 +0200 CEST m=+0.161186034 image pull  busybox
Sep 01 06:41:43 podman[50324]: 2021-09-01 06:41:43.251304873 +0200 CEST m=+0.384526468 container create 74ff59b9b635231ebd12c12ae0f318e83166a6de8b653ad5e965dd7bdec7e72e (image=docker.io/library/busybox:latest, name=nattest)
Sep 01 06:41:43 podman[50324]: 2021-09-01 06:41:43.746840867 +0200 CEST m=+0.880062381 container init 74ff59b9b635231ebd12c12ae0f318e83166a6de8b653ad5e965dd7bdec7e72e (image=docker.io/library/busybox:latest, name=nattest)
Sep 01 06:41:43 podman[50324]: 2021-09-01 06:41:43.796867219 +0200 CEST m=+0.930088726 container start 74ff59b9b635231ebd12c12ae0f318e83166a6de8b653ad5e965dd7bdec7e72e (image=docker.io/library/busybox:latest, name=nattest)
Sep 01 06:41:43 podman[50324]: 2021-09-01 06:41:43.797241532 +0200 CEST m=+0.930463052 container attach 74ff59b9b635231ebd12c12ae0f318e83166a6de8b653ad5e965dd7bdec7e72e (image=docker.io/library/busybox:latest, name=nattest)
Sep 01 06:41:47 systemd[1]: Stopping Podman NAT Test...
Sep 01 06:41:47 podman[50324]: punt!
Sep 01 06:41:57 podman[50440]: 2021-09-01 06:41:57.658659241 +0200 CEST m=+10.271143781 container stop 74ff59b9b635231ebd12c12ae0f318e83166a6de8b653ad5e965dd7bdec7e72e (image=docker.io/library/busybox:latest, name=nattest)
Sep 01 06:41:57 podman[50440]: 2021-09-01 06:41:57.728147729 +0200 CEST m=+10.340632303 container died 74ff59b9b635231ebd12c12ae0f318e83166a6de8b653ad5e965dd7bdec7e72e (image=docker.io/library/busybox:latest, name=nattest)
Sep 01 06:41:57 podman[50440]: nattest
Sep 01 06:41:57 podman[50324]: time="2021-09-01T06:41:57+02:00" level=error msg="unable to cleanup network for container 74ff59b9b635231ebd12c12ae0f318e83166a6de8b653ad5e965dd7bdec7e72e: \"error tearing down CNI namespace configuration for container 74ff59b9b635231ebd12c12ae0f318e83166a6de8b653ad5e965dd7bdec7e72e: error removing pod nattest_nattest from CNI network \\\"podman\\\": netplugin failed with no error message: signal: terminated\""
Sep 01 06:41:58 podman[50324]: 2021-09-01 06:41:58.160361638 +0200 CEST m=+15.293583198 container remove 74ff59b9b635231ebd12c12ae0f318e83166a6de8b653ad5e965dd7bdec7e72e (image=docker.io/library/busybox:latest, name=nattest)
Sep 01 06:41:58 podman[50324]: time="2021-09-01T06:41:58+02:00" level=error msg="Error forwarding signal 15 to container 74ff59b9b635231ebd12c12ae0f318e83166a6de8b653ad5e965dd7bdec7e72e: container has already been removed"
Sep 01 06:41:58 systemd[1]: nattest.service: Deactivated successfully.
Sep 01 06:41:58 systemd[1]: Stopped Podman NAT Test.

Describe the results you received:

Podman failed to remove the iptables NAT rules.

There is an error message in the log:

unable to cleanup network for container 74ff59b9b635231ebd12c12ae0f318e83166a6de8b653ad5e965dd7bdec7e72e: “error tearing down CNI namespace configuration for container 74ff59b9b635231ebd12c12ae0f318e83166a6de8b653ad5e965dd7bdec7e72e: error removing pod nattest_nattest from CNI network \“podman\”: netplugin failed with no error message: signal: terminated”

The process responsible for the clean-up was presumably killed by systemd after docker stop returned.

Describe the results you expected:

I expected no error message and the NAT rules to be back to the same state as they were before starting the service.

Additional information you deem important (e.g. issue happens only occasionally):

At first I thought this was related to #11324 but after that one was fixed in Podman 3.3, I determined this is a separate issue. The symptoms are the same in my case.

The manifestation of the bug is a little random because it is based on a race condition. It is reproducible most of the time using the steps above.

Two possible work-arounds are to add either another ExecStop=sleep 1 or KillMode=none to the service definition. The first one gives Podman another second before systemd starts killing remaining processes. The latter disables killing altogether but is not a recommended option.

# cat >/etc/systemd/system/nattest-nokill.service
[Unit]
Description=Podman NAT Test

[Service]
ExecStart=podman run --rm --name nattest busybox nc -l
ExecStop=podman stop nattest
KillMode=none
# systemctl start nattest-nokill
# diff softshell.iptables <(iptables -t nat -S) | grep '^[<>]'
> -N CNI-170407091e6f9d431e977274
> -A POSTROUTING -s 10.88.0.72/32 -m comment --comment "name: \"podman\" id: \"72222b2a1ef3da3e7195d7e12d5fa30581961f1d92758906ede4503dced4f61f\"" -j CNI-170407091e6f9d431e977274
> -A CNI-170407091e6f9d431e977274 -d 10.88.0.0/16 -m comment --comment "name: \"podman\" id: \"72222b2a1ef3da3e7195d7e12d5fa30581961f1d92758906ede4503dced4f61f\"" -j ACCEPT
> -A CNI-170407091e6f9d431e977274 ! -d 224.0.0.0/4 -m comment --comment "name: \"podman\" id: \"72222b2a1ef3da3e7195d7e12d5fa30581961f1d92758906ede4503dced4f61f\"" -j MASQUERADE
# systemctl stop nattest-nokill
# diff softshell.iptables <(iptables -t nat -S) | grep '^[<>]'
# journalctl -u nattest-nokill
Sep 01 07:09:18 systemd[1]: /etc/systemd/system/nattest-nokill.service:7: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
Sep 01 07:09:18 systemd[1]: Started Podman NAT Test.
Sep 01 07:09:18 podman[50733]: 2021-09-01 07:09:18.921899269 +0200 CEST m=+0.109981145 image pull  busybox
Sep 01 07:09:19 podman[50733]: 2021-09-01 07:09:19.169311311 +0200 CEST m=+0.357393275 container create 72222b2a1ef3da3e7195d7e12d5fa30581961f1d92758906ede4503dced4f61f (image=docker.io/library/busybox:latest, name=nattest)
Sep 01 07:09:19 podman[50733]: 2021-09-01 07:09:19.690867177 +0200 CEST m=+0.878949045 container init 72222b2a1ef3da3e7195d7e12d5fa30581961f1d92758906ede4503dced4f61f (image=docker.io/library/busybox:latest, name=nattest)
Sep 01 07:09:19 podman[50733]: 2021-09-01 07:09:19.740818589 +0200 CEST m=+0.928900457 container start 72222b2a1ef3da3e7195d7e12d5fa30581961f1d92758906ede4503dced4f61f (image=docker.io/library/busybox:latest, name=nattest)
Sep 01 07:09:19 podman[50733]: 2021-09-01 07:09:19.741177664 +0200 CEST m=+0.929259528 container attach 72222b2a1ef3da3e7195d7e12d5fa30581961f1d92758906ede4503dced4f61f (image=docker.io/library/busybox:latest, name=nattest)
Sep 01 07:09:29 systemd[1]: Stopping Podman NAT Test...
Sep 01 07:09:29 podman[50733]: punt!
Sep 01 07:09:39 podman[50851]: 2021-09-01 07:09:39.754371376 +0200 CEST m=+10.316274069 container stop 72222b2a1ef3da3e7195d7e12d5fa30581961f1d92758906ede4503dced4f61f (image=docker.io/library/busybox:latest, name=nattest)
Sep 01 07:09:39 podman[50851]: 2021-09-01 07:09:39.828801293 +0200 CEST m=+10.390704033 container died 72222b2a1ef3da3e7195d7e12d5fa30581961f1d92758906ede4503dced4f61f (image=docker.io/library/busybox:latest, name=nattest)
Sep 01 07:09:39 podman[50851]: nattest
Sep 01 07:09:39 systemd[1]: nattest-nokill.service: Deactivated successfully.
Sep 01 07:09:39 systemd[1]: nattest-nokill.service: Unit process 50733 (podman) remains running after unit stopped.
Sep 01 07:09:39 systemd[1]: nattest-nokill.service: Unit process 50839 (conmon) remains running after unit stopped.
Sep 01 07:09:39 systemd[1]: nattest-nokill.service: Unit process 50907 (podman) remains running after unit stopped.
Sep 01 07:09:39 systemd[1]: nattest-nokill.service: Unit process 50919 (firewall) remains running after unit stopped.
Sep 01 07:09:39 systemd[1]: nattest-nokill.service: Unit process 50924 (iptables) remains running after unit stopped.
Sep 01 07:09:39 systemd[1]: nattest-nokill.service: Unit process 50925 (bridge) remains running after unit stopped.
Sep 01 07:09:39 systemd[1]: Stopped Podman NAT Test.
Sep 01 07:09:40 podman[50733]: 2021-09-01 07:09:40.502850998 +0200 CEST m=+21.690932941 container remove 72222b2a1ef3da3e7195d7e12d5fa30581961f1d92758906ede4503dced4f61f (image=docker.io/library/busybox:latest, name=nattest)

Here systemd itself warns about remaining processes and the message about container removal appears only after the service has been stopped.

Output of podman version:

Version:      3.3.0
API Version:  3.3.0
Go Version:   go1.16.6
Built:        Fri Aug 20 21:36:14 2021
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.22.3
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.29-2.fc34.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.29, commit: '
  cpus: 1
  distribution:
    distribution: fedora
    version: "34"
  eventLogger: journald
  hostname: <redacted>
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.13.12-200.fc34.x86_64
  linkmode: dynamic
  memFree: 490217472
  memTotal: 2061930496
  ociRuntime:
    name: crun
    package: crun-0.21-1.fc34.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.21
      commit: c4c3cdf2ce408ed44a9e027c618473e6485c635b
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.12-2.fc34.x86_64
    version: |-
      slirp4netns version 1.1.12
      commit: 7a104a101aa3278a2152351a082a6df71f57c9a3
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.0
  swapFree: 0
  swapTotal: 0
  uptime: 32h 51m 47.87s (Approximately 1.33 days)
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 2
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageStore:
    number: 28
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 3.3.0
  Built: 1629488174
  BuiltTime: Fri Aug 20 21:36:14 2021
  GitCommit: ""
  GoVersion: go1.16.6
  OsArch: linux/amd64
  Version: 3.3.0

Package info (e.g. output of rpm -q podman or apt list podman):

# rpm -q podman{,-plugins} container{s-common,networking-plugins} crun
podman-3.3.0-1.fc34.x86_64
podman-plugins-3.3.0-1.fc34.x86_64
containers-common-1-21.fc34.noarch
containernetworking-plugins-1.0.0-1.fc34.x86_64
crun-0.21-1.fc34.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)

Yes. Latest Fedora release with latest package from its repository.

Additional environment details (AWS, VirtualBox, physical, etc.):

Virtual machine on VMware ESXi. OS image from https://download.fedoraproject.org/pub/fedora/linux/releases/34/Cloud/x86_64/images/Fedora-Cloud-Base-34-1.2.x86_64.raw.xz .

# cat /etc/os-release 
NAME=Fedora
VERSION="34 (Cloud Edition)"
ID=fedora
VERSION_ID=34
VERSION_CODENAME=""
PLATFORM_ID="platform:f34"
PRETTY_NAME="Fedora 34 (Cloud Edition)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:34"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f34/system-administrators-guide/"
SUPPORT_URL="https://fedoraproject.org/wiki/Communicating_and_getting_help"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=34
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=34
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="Cloud Edition"
VARIANT_ID=cloud

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 19 (11 by maintainers)

Commits related to this issue

Most upvoted comments

Why did we decide that podman stop shouldn’t wait for removal? This sounds like a sane thing to do to prevent race conditions.