podman: Exited pod can not be started after a while

Discussed in https://github.com/containers/podman/discussions/12252

<div type='discussions-op-text'>

Originally posted by dispensable November 10, 2021

Hi, I am currently using podman pod to manage my dev environment. When I am done working, I usually run podman pod stop on my pod to free up resources. But after a while (maybe 3 or 4 days?), when I try to resume my dev environment from the exited pod with podman pod start, all I get is:

Error: error starting container 13e8aeb3a36c5f6a0f89765ce9bffb22081712867e50ee138cfd3b73f00878cd: open `/proc/549277/ns/net`: No such file or directory: OCI runtime attempted to invoke a command that was not found

Starting it with the debug log level gives:

DEBU[0000] Created OCI spec for container 13e8aeb3a36c5f6a0f89765ce9bffb22081712867e50ee138cfd3b73f00878cd at /home/test/.local/share/containers/storage/overlay-containers/13e8aeb3a36c5f6a0f89765ce9bffb22081712867e50ee138cfd3b73f00878cd/userdata/config.json
...
...
INFO[0000] Failed to add conmon to cgroupfs sandbox cgroup: error creating cgroup for cpu: mkdir /sys/fs/cgroup/cpu/conmon: permission denied
[conmon:d]: failed to write to /proc/self/oom_score_adj: Permission denied
DEBU[0000] Received: -1

DEBU[0000] Cleaning up container 13e8aeb3a36c5f6a0f89765ce9bffb22081712867e50ee138cfd3b73f00878cd
DEBU[0000] unmounted container "13e8aeb3a36c5f6a0f89765ce9bffb22081712867e50ee138cfd3b73f00878cd"
Error: error starting container 13e8aeb3a36c5f6a0f89765ce9bffb22081712867e50ee138cfd3b73f00878cd: open `/proc/549277/ns/net`: No such file or directory: OCI runtime attempted to invoke a command that was not found

In the config.json file, the Linux namespaces point to /proc/549277/ns/xx, which does not exist on the system.
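The check described above can be sketched in a few lines of Python. This is only an illustration, not part of podman: it assumes the file follows the OCI runtime spec layout, where linux.namespaces is a list of entries with a "type" and an optional "path". The function names missing_namespace_paths and check_config are hypothetical.

```python
import json
import os


def missing_namespace_paths(spec, exists=os.path.exists):
    """Return (type, path) pairs for namespaces whose path is gone.

    `spec` is a parsed OCI runtime config (the content of config.json).
    `exists` is injectable so the check can be exercised without /proc.
    """
    missing = []
    for ns in spec.get("linux", {}).get("namespaces", []):
        path = ns.get("path")
        # No "path" means the runtime creates a new namespace, so there
        # is nothing that could have disappeared.
        if path and not exists(path):
            missing.append((ns.get("type"), path))
    return missing


def check_config(config_path):
    """Load a config.json from disk and report its missing namespace paths."""
    with open(config_path) as f:
        return missing_namespace_paths(json.load(f))
```

Calling check_config on the config.json path shown in the debug log would list every namespace entry whose /proc/<pid>/ns/... target no longer exists, which is the condition the error message complains about.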

It looks like the user namespace was “garbage collected” (I have no idea what that is) by the system after the pod had been exited for a while? If so, how can I configure podman or the system not to garbage-collect my exited pod? Should I use podman pod pause in this situation?

BTW, I am using rootless podman 3.4.0 with kernel 5.10.27. The network option is --network=slirp4netns:allow_host_loopback=true

Thank you for your time, any advice will be appreciated.</div>

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 15 (5 by maintainers)

Most upvoted comments

Can you get the output of podman pod inspect for a pod that reproduces this (and, ideally, a podman ps -a while there’s a pod that won’t restart)? I have a suspicion something is killing the pause container and its associated conmon, and that’s causing us to believe it’s still running and try to reuse its namespaces.
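The suspicion in this comment, a container recorded as running whose process is actually gone, can be checked with a rough sketch like the one below. It is not how podman tracks state internally. It assumes container records shaped like the output of podman ps -a --format json, and the field names "Id", "State" and "Pid" are assumptions that should be verified against real output. The helper names are hypothetical.

```python
import os


def pid_is_alive(pid):
    """Return True if /proc/<pid> exists (Linux only)."""
    return os.path.isdir(f"/proc/{pid}")


def recorded_running_but_dead(containers, pid_exists=pid_is_alive):
    """Return the Ids of containers marked running whose PID is gone.

    `containers` is a list of dicts; the keys "Id", "State" and "Pid"
    are assumed, not confirmed, field names.
    """
    stale = []
    for c in containers:
        pid = c.get("Pid") or 0
        # Only containers that claim to be running and carry a real PID
        # can be stale in the sense described in the comment.
        if c.get("State") == "running" and pid > 0 and not pid_exists(pid):
            stale.append(c.get("Id"))
    return stale
```

If the pause container shows up in the result, that would be consistent with the theory that its process was killed without podman noticing.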

I mostly wanted the issue so the reporter would have to fill out the issue template (podman info, podman version, environment details) 😄