podman: pod stats: unknown FS magic on "/run/user/4902/netns/netns-etc-etc"
Seen in CI, f37 rootless sqlite:
podman pod stats on a specific running pod
...
$ podman [options] run --http-proxy=false --pod cec7af2e2f8e0479a91d94bff88cf8482c2907f7644eb7c90a9ea01b2f13ff22 -d quay.io/libpod/alpine:latest top
eed6053f6f0e8a2f883464f994bb30d09304274e1b1e1cdb2eb52a0f49ae3985
$ podman [options] pod stats --no-stream cec7af2e2f8e0479a91d94bff88cf8482c2907f7644eb7c90a9ea01b2f13ff22
Error: unknown FS magic on "/run/user/4902/netns/netns-a2804d97-802b-9e57-2e40-11c1206c102c": 1021994
[AfterEach] Podman pod stats
/var/tmp/go/src/github.com[/containers/podman/test/e2e/pod_stats_test.go:33](https://github.com/containers/podman/blob/956677a741cdcce627dda4336f85c8fc0be83a5c/test/e2e/pod_stats_test.go#L33)
$ podman [options] pod rm -fa -t 0
time="2023-03-23T15:11:45-05:00" level=error msg="Unable to clean up network for container 652344ca5642edd24b93f810fcb9ebbb1c0969195a67454bc05b8a67e7ed3185: \"unmounting network namespace for container 652344ca5642edd24b93f810fcb9ebbb1c0969195a67454bc05b8a67e7ed3185: failed to unmount NS: at /run/user/4902/netns/netns-a2804d97-802b-9e57-2e40-11c1206c102c: invalid argument\""
cec7af2e2f8e0479a91d94bff88cf8482c2907f7644eb7c90a9ea01b2f13ff22
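A note on decoding the error above: the trailing number looks like a filesystem magic value printed in hexadecimal (that is an assumption about the error format). Read that way, 0x1021994 is the kernel's `TMPFS_MAGIC`, which would mean the path was a plain file on tmpfs rather than a bind-mounted namespace handle. A minimal Go sketch of that lookup follows; `describeMagic` is a hypothetical helper, not podman code, and the constants are copied from the kernel's `include/uapi/linux/magic.h`.

```go
package main

import "fmt"

// Filesystem magic numbers as defined in the Linux kernel's
// include/uapi/linux/magic.h.
const (
	tmpfsMagic = 0x01021994 // TMPFS_MAGIC
	nsfsMagic  = 0x6e736673 // NSFS_MAGIC
	procMagic  = 0x00009fa0 // PROC_SUPER_MAGIC
)

// describeMagic is a hypothetical helper that names the filesystem
// behind a statfs f_type value.
func describeMagic(magic int64) string {
	switch magic {
	case nsfsMagic:
		return "nsfs"
	case procMagic:
		return "procfs"
	case tmpfsMagic:
		return "tmpfs"
	default:
		return fmt.Sprintf("unknown (%#x)", magic)
	}
}

func main() {
	// The value from the error message, interpreted as hexadecimal.
	fmt.Println(describeMagic(0x1021994))
}
```

A netns path would normally report nsfs (or procfs when it points into `/proc`), so a tmpfs result is consistent with a leftover regular file where a mounted namespace was expected.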
About this issue
- State: closed
- Created a year ago
- Comments: 25 (16 by maintainers)
Commits related to this issue
- pkg/netns: NewNS() check if file name already exists The code picks a random name but never checked if this name was already in use. If the RNG generates the same name again we will end up with conta... — committed to Luap99/common by Luap99 a year ago
- rootless netns: recover from invalid netns I made a change in c/common[1] to prevent duplicates in netns names. This now causes problem in podman[2] where the rootless netns will no longer work after... — committed to Luap99/libpod by Luap99 a year ago
- pkg/rootless: do not use shortcut with --tmpdir When using --tmpdir for the podman cli we use this as location for the pause.pid file. However the c shortcut code has no idea about this option and al... — committed to Luap99/libpod by Luap99 a year ago
- test/e2e: run system reset test serial Use the new ginkgo `Serial` decorator to make sure system reset is never executed in parallel. system reset stops the rootless pause process which causes major ... — committed to Luap99/libpod by Luap99 a year ago
Ok, `--this --that` was a good hint: I now understand where it goes wrong. Here is a reproducer:

First make sure you have no podman processes or containers running, then kill the pause process to start from a clean system:
The bug here is that there is a shortcut in pkg/rootless/rootless_linux.c which is always run before any Go code is run (including the option parsing). The C code just sees `$XDG_RUNTIME_DIR/libpod/tmp/pause.pid` and immediately joins this namespace from this process. This shortcut is only there to join, so if the process does not exist it will do nothing and let podman handle the namespace and pause process creation.
So that is why you have to run the first time with `--tmpdir` before the pause process pid existed at `$XDG_RUNTIME_DIR/libpod/tmp/pause.pid`. If you do it the other way around, even the process with `--tmpdir` would have joined the namespace via the shortcut, so it would use the same one and not cause issues. See the possibility for flakes here?

The fix for this is of course to not do the shortcut when we see `--tmpdir`, so the podman Go code can handle it. Note that we already special case other commands: https://github.com/containers/podman/blob/ac1d297fc76f4423d6f44b98c864476cbeffce86/pkg/rootless/rootless_linux.c#L378-L384
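To make the proposed fix concrete, here is a rough sketch of the decision it describes: scan the command line and skip the join shortcut whenever `--tmpdir` appears. The real check lives in C in rootless_linux.c; this Go version only illustrates the logic, and `canUseShortcut` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// canUseShortcut reports whether the early "join the existing pause
// process namespace" shortcut may run for the given command line.
// It returns false as soon as it sees --tmpdir in either form, so the
// regular option parsing can pick the right pause.pid location.
func canUseShortcut(args []string) bool {
	for _, arg := range args {
		if arg == "--tmpdir" || strings.HasPrefix(arg, "--tmpdir=") {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(canUseShortcut([]string{"podman", "ps"}))
	fmt.Println(canUseShortcut([]string{"podman", "--tmpdir", "/tmp/x", "ps"}))
	fmt.Println(canUseShortcut([]string{"podman", "--tmpdir=/tmp/x", "ps"}))
}
```

Scanning every argument is deliberately conservative: a `--tmpdir` that actually belongs to a container command would also disable the shortcut, which only costs the slower path through the Go code.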