podman: [Bug]: podman pod create command hangs indefinitely
Issue Description
The podman pod create command hangs indefinitely and causes all other podman commands to hang.
Steps to reproduce the issue
- Create a podman network.
podman network create --ipv6
- Create a pod.
podman pod create \
--name miniflux \
--network podman1 \
--replace \
--userns keep-id
Describe the results you received
The podman pod create command hangs indefinitely, and any other podman commands, such as podman ps, hang for as long as the podman pod create command is hanging.
Describe the results you expected
I expected podman pod create to finish almost immediately, or within a few minutes at most.
podman info output
host:
  arch: arm64
  buildahVersion: 1.28.0
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.5-1.fc37.aarch64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.5, commit: '
  cpuUtilization:
    idlePercent: 86.4
    systemPercent: 6.25
    userPercent: 7.36
  cpus: 6
  distribution:
    distribution: fedora
    variant: iot
    version: "37"
  eventLogger: journald
  hostname: rockpro64.jwillikers.io
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.1.6-200.fc37.aarch64
  linkmode: dynamic
  logDriver: journald
  memFree: 852598784
  memTotal: 3994365952
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.7.2-3.fc37.aarch64
    path: /usr/bin/crun
    version: |-
      crun version 1.7.2
      commit: 0356bf4aff9a133d655dc13b1d9ac9424706cac4
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-8.fc37.aarch64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 2858676224
  swapTotal: 3994021888
  uptime: 53h 18m 24.00s (Approximately 2.21 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /var/home/jordan/.config/containers/storage.conf
  containerStore:
    number: 12
    paused: 0
    running: 11
    stopped: 1
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/jordan/.local/share/containers/storage
  graphRootAllocated: 123364966400
  graphRootUsed: 80927322112
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 15
  runRoot: /run/user/1000/containers
  volumePath: /home/jordan/.local/share/containers/storage/volumes
version:
  APIVersion: 4.3.1
  Built: 1668178831
  BuiltTime: Fri Nov 11 09:00:31 2022
  GitCommit: ""
  GoVersion: go1.19.2
  Os: linux
  OsArch: linux/arm64
  Version: 4.3.1
Podman in a container
No
Privileged Or Rootless
Rootless
Upstream Latest Release
No
Additional environment details
RockPro64 / aarch64
Fedora IoT 35 - 37
Additional information
Backing storage for containers runs off of an NFS-mounted volume and an S3-mounted volume, both of which are mounted directly by Linux. Several containers and one pod are running on the system, managed via systemd, without problems.
I think this may be related to #10269.
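Since the suspicion above involves NFS-backed containers storage, here is a minimal sketch (not part of Podman; the path is simply the volumePath from the podman info output above, and the check itself is a generic statfs lookup) for confirming whether a given storage directory actually sits on NFS, where POSIX file locking can behave differently:

```go
package main

import (
	"fmt"
	"os"

	"golang.org/x/sys/unix"
)

// NFS_SUPER_MAGIC from <linux/magic.h>; statfs reports it in Statfs_t.Type
// for NFS mounts.
const nfsSuperMagic = 0x6969

func main() {
	// Path taken from the podman info output above; adjust for your setup.
	path := "/home/jordan/.local/share/containers/storage/volumes"

	var st unix.Statfs_t
	if err := unix.Statfs(path, &st); err != nil {
		fmt.Fprintf(os.Stderr, "statfs %s: %v\n", path, err)
		os.Exit(1)
	}

	if st.Type == nfsSuperMagic {
		fmt.Printf("%s is on NFS (file locks may behave differently here)\n", path)
	} else {
		fmt.Printf("%s is not on NFS (filesystem magic 0x%x)\n", path, st.Type)
	}
}
```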
About this issue
- State: open
- Created a year ago
- Comments: 31 (23 by maintainers)
It seems this still needs some more work to prevent this issue from arising.
But it's for @giuseppe, @Luap99, and @mheon to say.
Thanks. So the NFS mount is used only for volumes.
OK. Probably a lock conflict, then. We added some detection around that (ErrWillDeadlock gets thrown in some places), but it seems like pod creation and pod removal don't have that.
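For readers unfamiliar with that kind of detection, here is an illustrative sketch of the general idea, not libpod's actual implementation (the error value, types, and ranks below are hypothetical): if every code path takes locks in one fixed global order, a request that would take them out of order can be refused with a sentinel error instead of blocking forever.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errWillDeadlock is a sentinel error in the spirit of the ErrWillDeadlock
// mentioned above; the name and surrounding types here are hypothetical.
var errWillDeadlock = errors.New("lock acquisition would violate ordering and may deadlock")

// orderedLock is a mutex with a fixed rank. Taking locks strictly in
// ascending rank order makes circular waits (deadlocks) impossible.
type orderedLock struct {
	rank int
	mu   sync.Mutex
}

// lockAll acquires the given locks in the order supplied, but refuses
// (instead of blocking) as soon as the ascending-rank rule would be broken.
func lockAll(locks ...*orderedLock) error {
	prev := -1
	for i, l := range locks {
		if l.rank <= prev {
			// Roll back what we already hold, then report the would-be deadlock.
			for _, held := range locks[:i] {
				held.mu.Unlock()
			}
			return errWillDeadlock
		}
		l.mu.Lock()
		prev = l.rank
	}
	return nil
}

func unlockAll(locks ...*orderedLock) {
	for _, l := range locks {
		l.mu.Unlock()
	}
}

func main() {
	podLock := &orderedLock{rank: 1}
	ctrLock := &orderedLock{rank: 2}

	// Pod-level lock first, then container lock: allowed.
	if err := lockAll(podLock, ctrLock); err == nil {
		fmt.Println("acquired pod then container lock")
		unlockAll(podLock, ctrLock)
	}

	// The reverse order could deadlock against the path above,
	// so it is refused up front instead of hanging.
	if err := lockAll(ctrLock, podLock); err != nil {
		fmt.Println("refused:", err)
	}
}
```

The point of returning an error rather than waiting is exactly the behavior the comment asks for: a pod create or pod remove that detects the conflict would fail fast instead of hanging and stalling every other podman command behind it.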