podman: Exited pod cannot be started after a while
Discussed in https://github.com/containers/podman/discussions/12252
<div type='discussions-op-text'>Originally posted by dispensable November 10, 2021
Hi, I am currently using podman pods to manage my dev environment. When I am done developing, I usually run `podman pod stop` on my pod to reduce resource consumption. But after a while (maybe 3 or 4 days?), when I want to resume my dev environment from the exited pod with `podman pod start`, all I get is:
Error: error starting container 13e8aeb3a36c5f6a0f89765ce9bffb22081712867e50ee138cfd3b73f00878cd: open `/proc/549277/ns/net`: No such file or directory: OCI runtime attempted to invoke a command that was not found
Starting with the debug log level gives:
DEBU[0000] Created OCI spec for container 13e8aeb3a36c5f6a0f89765ce9bffb22081712867e50ee138cfd3b73f00878cd at /home/test/.local/share/containers/storage/overlay-containers/13e8aeb3a36c5f6a0f89765ce9bffb22081712867e50ee138cfd3b73f00878cd/userdata/config.json
...
...
INFO[0000] Failed to add conmon to cgroupfs sandbox cgroup: error creating cgroup for cpu: mkdir /sys/fs/cgroup/cpu/conmon: permission denied
[conmon:d]: failed to write to /proc/self/oom_score_adj: Permission denied
DEBU[0000] Received: -1
DEBU[0000] Cleaning up container 13e8aeb3a36c5f6a0f89765ce9bffb22081712867e50ee138cfd3b73f00878cd
DEBU[0000] unmounted container "13e8aeb3a36c5f6a0f89765ce9bffb22081712867e50ee138cfd3b73f00878cd"
Error: error starting container 13e8aeb3a36c5f6a0f89765ce9bffb22081712867e50ee138cfd3b73f00878cd: open `/proc/549277/ns/net`: No such file or directory: OCI runtime attempted to invoke a command that was not found
In that config.json file, the Linux namespaces point to /proc/549277/ns/xx, which does not exist on the system.
It looks like the user namespace has been “garbage collected” (I have no idea what that is) by the system after the pod has been exited for a while? If so, how can I configure podman or the system not to GC my exited pod? Should I use `podman pod pause` in this situation?
BTW, I am using rootless podman 3.4.0 with kernel 5.10.27. Network option: --network=slirp4netns:allow_host_loopback=true
Thank you for your time, any advice will be appreciated.</div>
About this issue
- State: closed
- Created 3 years ago
- Comments: 15 (5 by maintainers)
Can you get a `podman pod inspect` on a pod that reproduces (and, ideally, a `podman ps -a` while there’s a pod that won’t restart)? I have a suspicion something is killing the pause container and its associated conmon, and that’s causing us to believe it’s still running and try and reuse its namespaces.

I mostly wanted the issue so the reporter would have to fill out the issue template (`podman info`, `podman version`, environment details) 😄
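
The failure mode suspected here can be checked against the container's OCI spec. The following Python sketch is not part of podman. It assumes the config.json path reported in the debug log and the OCI runtime spec layout, where `linux.namespaces` is a list of entries with a `type` and an optional `path`. The function name `stale_namespaces` is hypothetical. It reports every namespace whose path, such as /proc/549277/ns/net, no longer exists:

```python
import json
import os


def stale_namespaces(config_path):
    """Return (type, path) pairs for namespaces whose path no longer exists.

    config_path is assumed to be the container's OCI config.json, e.g. the
    .../overlay-containers/<container id>/userdata/config.json file named
    in the debug log.
    """
    with open(config_path) as f:
        spec = json.load(f)

    stale = []
    for ns in spec.get("linux", {}).get("namespaces", []):
        path = ns.get("path")
        # Entries without a path are created fresh by the runtime, so only
        # entries that join an existing namespace can go stale.
        if path and not os.path.exists(path):
            stale.append((ns.get("type"), path))
    return stale
```

If calling `stale_namespaces` on the config.json from the log returns a non-empty list, the process that held those namespaces (PID 549277 in this report) is gone. That would be consistent with the error message, although it does not show what killed the process.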