podman: rootless podman ERRO[0000] error joining network namespace for container

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description: To manage containers after a system reboot, I am trying to set up a systemd unit for rootless podman containers. The podman run command:

podman run -d \
-p 9200:9200 \
-p 9300:9300 \
--name paperkraft \
-v $CONTAINER_NAME:/usr/share/cluster/data \
-e "ES_JAVA_OPTS=-Xms2g -Xmx2g" \
-e "path.data=/usr/share/cluster/data" \
-e "http.host=0.0.0.0" \
$CONTAINER_IMAGE

cat /etc/systemd/system/paperkraft.service

[Unit]
Description=paperkraft  podman container
Wants=syslog.service

[Service]
User=podman
Restart=always
ExecStartPre=/usr/bin/podman system migrate
ExecStart=/usr/bin/podman start -a paperkraft
ExecStop=/usr/bin/podman stop -t 10 paperkraft

[Install]
WantedBy=multi-user.target

Steps to reproduce the issue:

  1. Run the container

  2. systemctl daemon-reload

  3. systemctl enable paperkraft.service
     Output: Created symlink from /etc/systemd/system/multi-user.target.wants/paperkraft.service to /etc/systemd/system/paperkraft.service.

Describe the results you received: After the system reboot, the containers are not started, and podman ps or podman ps -a gives the output below:

ERRO[0000] error joining network namespace for container 3164432a60af4b2320b404c44676edd87846b2ac7fcd5cc464a2f94184161ea0: error retrieving network namespace at /tmp/run-600000/netns/cni-1a003ffc-4779-8bd9-273e-19325859e8a1: unknown FS magic on "/tmp/run-600000/netns/cni-1a003ffc-4779-8bd9-273e-19325859e8a1": 58465342
ERRO[0000] unable to get container info: "container 3164432a60af4b2320b404c44676edd87846b2ac7fcd5cc464a2f94184161ea0 is not valid: container has already been removed"
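A note for readers decoding the trailing number in the first error: it appears to be the hexadecimal filesystem magic of whatever the netns path sits on. 58465342 matches XFS_SUPER_MAGIC in the Linux statfs(2) man page, which suggests the path is a plain file left on XFS rather than a mounted namespace (as far as I know, a live namespace file reports nsfs or proc). Below is a minimal lookup sketch; fs_magic_name is a hypothetical helper name, and the table covers only a few magics from statfs(2), so it is an illustration, not a complete decoder.

```shell
#!/bin/sh
# Hypothetical helper: map a hex filesystem magic (as printed at the end of
# the podman error) to a filesystem name. Values come from the Linux
# statfs(2) man page; the table is deliberately incomplete.
fs_magic_name() {
  # Normalise the input: lowercase, drop an optional 0x prefix and leading zeros.
  magic=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed -e 's/^0x//' -e 's/^0*//')
  case "$magic" in
    58465342) echo "xfs" ;;
    ef53)     echo "ext2/ext3/ext4" ;;
    1021994)  echo "tmpfs" ;;
    6e736673) echo "nsfs" ;;
    9fa0)     echo "proc" ;;
    *)        echo "unknown" ;;
  esac
}

# With GNU coreutils, the same hex value can be read from a live path:
#   stat -f -c '%t' /path/to/netns/file
```

Calling fs_magic_name with the value from the error message prints the filesystem the stale file lives on; anything other than nsfs or proc hints that the namespace mount did not survive the reboot.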

Describe the results you expected: The paperkraft container should be in the running state.

Additional information you deem important (e.g. issue happens only occasionally): System reboot

Output of podman version:

Version:            1.6.4
RemoteAPI Version:  1
Go Version:         go1.13.4
OS/Arch:            linux/amd64

Output of podman info --debug:

debug:
  compiler: gc
  git commit: ""
  go version: go1.13.4
  podman version: 1.6.4
host:
  BuildahVersion: 1.12.0-dev
  CgroupVersion: v1
  Conmon:
    package: conmon-2.0.6-1.module+el8.2.0+6368+cf16aa14.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.6, commit: 9adfe850ef954416ea5dd0438d428a60f2139473'
  Distribution:
    distribution: '"rhel"'
    version: "8.2"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 600000
      size: 1
    - container_id: 1
      host_id: 666666
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 600000
      size: 1
    - container_id: 1
      host_id: 666666
      size: 65536
  MemFree: 9740709888
  MemTotal: 16496082944
  OCIRuntime:
    name: runc
    package: runc-1.0.0-65.rc10.module+el8.2.0+6368+cf16aa14.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 4
  eventlogger: file
  hostname: ip-10-198-154-92.cloud.dev.net
  kernel: 4.18.0-193.1.2.el8_2.x86_64
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: slirp4netns-0.4.2-3.git21fdece.module+el8.2.0+6368+cf16aa14.x86_64
    Version: |-
      slirp4netns version 0.4.2+dev
      commit: 21fdece2737dc24ffa3f01a341b8a6854f8b13b4
  uptime: 1h 35m 11.67s (Approximately 0.04 days)
registries:
  blocked:
  - all
  insecure: null
  search:
  - test.artifactory.global.com
  - artifactory.global.com
store:
  ConfigFile: /home/podman/.config/containers/storage.conf
  ContainerStore:
    number: 5
  GraphDriverName: overlay
  GraphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-0.7.2-5.module+el8.2.0+6368+cf16aa14.x86_64
      Version: |-
        fuse-overlayfs: version 0.7.2
        FUSE library version 3.2.1
        using FUSE kernel interface version 7.26
  GraphRoot: /home/podman/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 16
  RunRoot: /tmp/run-600000
  VolumePath: /home/podman/.local/share/containers/storage/volumes

Package info (e.g. output of rpm -q podman or apt list podman):

podman-1.6.4-11.module+el8.2.0+6368+cf16aa14.x86_64

Additional environment details (AWS, VirtualBox, physical, etc.): AWS & VM

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 36 (12 by maintainers)

Most upvoted comments

Does podman rm --force 470b76724bcff05db55806baeb376b9951e8a4db66412937ed827e3d18677e4a work?

The below command worked for me (running Podman 3.4.2 on WSL 2):

rm -rf /tmp/podman-run-$(id -u)/libpod/tmp

For rootless podman users on WSL2: as far as I can tell, this happens because podman needs to refresh its state after a reboot but cannot detect that the system has rebooted under WSL2.

Solution: rm -rf /tmp/run-$(id -u)/libpod/tmp (podman expects this folder to vanish after a reboot to detect the reboot)
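Two runroot layouts show up in this thread (/tmp/run-UID and /tmp/podman-run-UID), so here is a small sketch that clears libpod/tmp under whichever one exists. The function name clear_libpod_tmp and its two optional arguments are my own invention for illustration; the actual runroot on a given machine is listed as RunRoot in podman info. I am assuming it is only run right after a reboot, before any podman command has started containers.

```shell
#!/bin/sh
# Sketch: remove podman's libpod/tmp directory under each candidate runroot so
# that the next podman command refreshes its state.
# clear_libpod_tmp is a hypothetical name.
# Arguments: UID (default: current user), base directory (default: /tmp).
clear_libpod_tmp() {
  uid=${1:-$(id -u)}
  base=${2:-/tmp}
  # Both runroot layouts below appear in this thread.
  for runroot in "$base/run-$uid" "$base/podman-run-$uid"; do
    if [ -d "$runroot/libpod/tmp" ]; then
      rm -rf "$runroot/libpod/tmp"
      echo "removed $runroot/libpod/tmp"
    fi
  done
}
```

Running clear_libpod_tmp with no arguments targets the current user under /tmp; only the libpod/tmp subdirectory is deleted, and the rest of the runroot is left alone.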

Useful info about this in https://www.redhat.com/sysadmin/sudo-rootless-podman.

Hi,

We are using podman 4.1.1

[awx@exec2 ~]$ podman version
Client:       Podman Engine
Version:      4.1.1
API Version:  4.1.1
Go Version:   go1.17.7
Built:        Mon Jul 11 14:56:53 2022
OS/Arch:      linux/amd64

We are using Ansible Automation Platform, which relies on podman, and I am still seeing the issue.

[awx@exec2 ~]$ podman ps
ERRO[0001] Joining network namespace for container 3aee535c82cd67a79a961a907522fdd375b65a5e462a11f532e033a8ac9b40fc: error retrieving network namespace at /run/user/1005/netns/netns-c3bef77c-a537-dafd-3369-cd567111feb8: failed to Statfs "/run/user/1005/netns/netns-c3bef77c-a537-dafd-3369-cd567111feb8": no such file or directory 
ERRO[0001] Joining network namespace for container 9d618926ff78856b7d18ee9c2fa3ab265061bbdf7626a690143d7ef4e9212fe0: error retrieving network namespace at /run/user/1005/netns/netns-6ee00e54-9900-2427-4da2-446c43870aa2: failed to Statfs "/run/user/1005/netns/netns-6ee00e54-9900-2427-4da2-446c43870aa2": no such file or directory 
ERRO[0001] Joining network namespace for container d445368385c2794b73b0c159cb836329149f6e0287d92647e0bc5f60601df52c: error retrieving network namespace at /run/user/1005/netns/netns-98671a72-c2c8-6080-08e9-fec055587df0: failed to Statfs "/run/user/1005/netns/netns-98671a72-c2c8-6080-08e9-fec055587df0": no such file or directory 
ERRO[0001] Joining network namespace for container e3ef7a74997bea6118bc2b2e8accc8ee2ab253ace6ce8b0a47c6c92d5c12f66f: error retrieving network namespace at /run/user/1005/netns/netns-e729af5a-54b1-c136-d833-4fcdf3d89cb8: failed to Statfs "/run/user/1005/netns/netns-e729af5a-54b1-c136-d833-4fcdf3d89cb8": no such file or directory 
CONTAINER ID  IMAGE                                               COMMAND               CREATED         STATUS             PORTS       NAMES
d1294beacfe2  hub/custom-iso-provisioning:latest  ssh-agent sh -c t...  12 minutes ago  Up 12 minutes ago              ansible_runner_106884

There is no /run/user/1005 directory on my system; the files appear to be in /tmp/podman-run-1005/netns/ instead. However, none of the reported network namespaces can be found in /tmp/podman-run-1005/netns/ either.

[awx@exec2 ~]$ ls /tmp/podman-run-1005/netns/
netns-0b35ca2d-be90-db24-7e97-571b1b5c5bb7  netns-9c14dcb0-650c-5b39-b299-059c75af78e1  netns-c7cdd98f-69e3-f087-cf9a-bf9e1fe42329
netns-26f03598-bce9-6bda-a002-76fd718187ce  netns-a3b618bd-a448-d165-2661-a01fff440f57  netns-fb365b3e-8dd9-9165-b27f-eb02e0c871a9
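To make that comparison repeatable, here is a rough sketch that pulls the namespace paths out of the ERRO lines and prints the ones that do not exist on disk. missing_netns is a hypothetical helper name, and the sed pattern assumes the message wording shown in this thread ("error retrieving network namespace at PATH:"), which may differ in other podman versions.

```shell
#!/bin/sh
# Sketch: read podman error output on stdin, extract every network namespace
# path it mentions, and print the paths that are absent from the filesystem.
# missing_netns is a hypothetical helper name.
missing_netns() {
  # Capture the text between "namespace at " and the next colon.
  sed -n 's/.*error retrieving network namespace at \([^:]*\):.*/\1/p' |
    sort -u |
    while IFS= read -r path; do
      [ -e "$path" ] || printf '%s\n' "$path"
    done
}

# Usage (the ERRO lines are written to stderr, so only stderr is piped):
#   podman ps 2>&1 >/dev/null | missing_netns
```

Any path it prints is one that podman's state still references but that no longer exists, which matches what I see above.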

What is the impact of this error? Can it degrade production workloads on our platform? So far the errors keep appearing, but I do not think they are affecting our workloads.

My understanding is that there is no systemd/init.d at all; it seems we can run things at startup via the Windows scheduler, but that's not very convenient.

In my case I made do with assuming that /proc/1 being more recent than the temp folder indicated a reboot took place. I added this to my .bashrc:

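# Rootless podman detects a reboot by the absence of its libpod/tmp directory.
# WSL2 does not clear /tmp on restart, so remove the directory when /proc/1
# (PID 1) is newer than it, i.e. the system booted after it was created.
function refresh_rootless_podman_after_reboot {
  local libpod_tmp="/tmp/run-$(id -u)/libpod/tmp"
  if [ /proc/1 -nt "${libpod_tmp}" ]; then
    rm -rf "${libpod_tmp}"
  fi
}
refresh_rootless_podman_after_reboot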

But a simpler solution could be to include the boot ID in the temp path so we always get a fresh folder after a reboot, e.g. export XDG_RUNTIME_DIR="/tmp/run-$(id -u)/$(cat /proc/sys/kernel/random/boot_id)"
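A slightly fuller sketch of the boot-ID idea, which also creates the directory with the 0700 mode the XDG spec expects. boot_scoped_runtime_dir is a hypothetical name, and both arguments are optional overrides I added so the function can be exercised outside a real boot. I have not verified that the boot ID changes on every WSL2 restart, so treat that as an assumption.

```shell
#!/bin/sh
# Sketch: build a per-boot runtime directory path, create it, and print it.
# boot_scoped_runtime_dir is a hypothetical name.
# Arguments: base directory (default: /tmp/run-<uid>),
#            boot-id file (default: the kernel's boot_id).
boot_scoped_runtime_dir() {
  base=${1:-/tmp/run-$(id -u)}
  boot_id_file=${2:-/proc/sys/kernel/random/boot_id}
  boot_id=$(cat "$boot_id_file") || return 1
  [ -n "$boot_id" ] || return 1
  dir="$base/$boot_id"
  # The XDG spec expects the runtime dir to be owned by the user with mode 0700.
  mkdir -p "$dir" && chmod 700 "$dir" || return 1
  printf '%s\n' "$dir"
}

# In ~/.bashrc or ~/.profile:
#   XDG_RUNTIME_DIR=$(boot_scoped_runtime_dir) && export XDG_RUNTIME_DIR
```

One trade-off: directories from earlier boots pile up under the base path and would need occasional cleanup.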

WSL2 / Alpine:edge / podman version 3.1.0 no systemd, no openrc

podman ps -a
ERRO[0000] error joining network namespace for container 3fd6086ec181198bf270b6a9e4f660cdaee4a991a8461c0c6045c21f8c6dbb79: error retrieving network namespace at /tmp/podman-run-1000/netns/cni-b61c831a-c43e-2a9d-b848-a27f55048da1: unknown FS magic on "/tmp/podman-run-1000/netns/cni-b61c831a-c43e-2a9d-b848-a27f55048da1": ef53
Error: error joining network namespace of container 3fd6086ec181198bf270b6a9e4f660cdaee4a991a8461c0c6045c21f8c6dbb79: error retrieving network namespace at /tmp/podman-run-1000/netns/cni-b61c831a-c43e-2a9d-b848-a27f55048da1: unknown FS magic on "/tmp/podman-run-1000/netns/cni-b61c831a-c43e-2a9d-b848-a27f55048da1": ef53

podman rm --force 3fd6086ec181198bf270b6a9e4f660cdaee4a991a8461c0c6045c21f8c6dbb79
ERRO[0000] error joining network namespace for container 3fd6086ec181198bf270b6a9e4f660cdaee4a991a8461c0c6045c21f8c6dbb79: error retrieving network namespace at /tmp/podman-run-1000/netns/cni-b61c831a-c43e-2a9d-b848-a27f55048da1: unknown FS magic on "/tmp/podman-run-1000/netns/cni-b61c831a-c43e-2a9d-b848-a27f55048da1": ef53
Error: error freeing lock for container 3fd6086ec181198bf270b6a9e4f660cdaee4a991a8461c0c6045c21f8c6dbb79: no such file or directory

How to resolve?

Since WSL2 neither has systemd nor mounts /tmp as tmpfs, it breaks that assumption. Do you know if there is a similar way to trigger this behaviour in WSL2, i.e. to cause certain files to be deleted? Do we need to drop in an init script?

I just encountered this issue with podman 4.0.2 on AlmaLinux 8, no WSL.

$ podman ps -a
ERRO[0000] Joining network namespace for container 96f881cd6f14891d79bdac918bd5dea8f9923d8282c6807b433ba189cec9ace2: error retrieving network namespace at /run/user/1001/netns/netns-e293186e-0dd0-326c-dbc9-2b263fcc0d1d: unknown FS magic on "/run/user/1001/netns/netns-e293186e-0dd0-326c-dbc9-2b263fcc0d1d": 1021994 
Error: error joining network namespace of container 96f881cd6f14891d79bdac918bd5dea8f9923d8282c6807b433ba189cec9ace2: error retrieving network namespace at /run/user/1001/netns/netns-e293186e-0dd0-326c-dbc9-2b263fcc0d1d: unknown FS magic on "/run/user/1001/netns/netns-e293186e-0dd0-326c-dbc9-2b263fcc0d1d": 1021994

$ podman rm --force 96f881cd6f14891d79bdac918bd5dea8f9923d8282c6807b433ba189cec9ace2
ERRO[0000] Joining network namespace for container 96f881cd6f14891d79bdac918bd5dea8f9923d8282c6807b433ba189cec9ace2: error retrieving network namespace at /run/user/1001/netns/netns-e293186e-0dd0-326c-dbc9-2b263fcc0d1d: unknown FS magic on "/run/user/1001/netns/netns-e293186e-0dd0-326c-dbc9-2b263fcc0d1d": 1021994 
ERRO[0000] container_linux.go:419: signaling init process caused: operation not permitted 
Error: cannot remove container 96f881cd6f14891d79bdac918bd5dea8f9923d8282c6807b433ba189cec9ace2 as it could not be stopped: error sending SIGKILL to container 96f881cd6f14891d79bdac918bd5dea8f9923d8282c6807b433ba189cec9ace2: operation not permitted

The containers were created by testcontainers over the Podman socket, not sure if that did anything special.

Could you update to podman 4.1 and see if this continues to happen? Please open a new issue rather than adding to a closed issue.