podman: Podman hangs, possible deadlock?

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description: It seems that I am suffering from the very same problem as described in https://github.com/containers/podman/issues/1574 (4 years old), but with the latest Podman version (4.2.0).

My scenario is a little bit different: I am running a "custom scheduler" in a container with podman.sock mounted into it, started as a systemd service. On behalf of its workloads, this scheduler creates, manages, deletes, and waits for containers (other operations such as attaching, streaming logs, and collecting stats are also involved). The containers it creates use --userns=auto. It seems that when I restart this scheduler, Podman becomes unavailable.
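For context, the scheduler drives Podman purely through HTTP requests against the mounted socket. The sketch below is only an illustration of that request pattern, not the scheduler's actual code: it dials /run/podman/podman.sock with a plain Go net/http client and issues the same POST .../start and POST .../wait calls that appear in the journal excerpt further down; the container ID is a placeholder.

package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
)

func main() {
	// HTTP client that dials the mounted Podman socket instead of TCP;
	// the host part of the URL is ignored when a unix socket is used.
	client := &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				return (&net.Dialer{}).DialContext(ctx, "unix", "/run/podman/podman.sock")
			},
		},
	}

	id := "CONTAINER_ID" // placeholder container ID

	// Start the container, then block on /wait until it exits -- the same
	// request sequence visible in the journal log further down.
	for _, ep := range []string{"/start", "/wait"} {
		resp, err := client.Post("http://d/containers/"+id+ep, "application/json", nil)
		if err != nil {
			panic(err)
		}
		body, _ := io.ReadAll(resp.Body)
		resp.Body.Close()
		fmt.Println(ep, resp.StatusCode, string(body))
	}
}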

Steps to reproduce the issue (not directly reproducible as written, since the scheduler implements our own business logic):

  1. systemctl start scheduler.service (podman run -v "/run/podman/podman.sock:/run/podman/podman.sock" myscheduler)

  2. systemctl restart scheduler.service

Describe the results you received: Podman is not accessible via either the CLI or the API.

Describe the results you expected: Podman remains accessible via both the CLI and the API.

Additional information you deem important (e.g. issue happens only occasionally):

Related logs: journalctl -u podman --no-pager | grep 04b5091d21ad06885e48915d9539997c1b58745a52e3178ce12995cf6b82f944

...
Oct 05 16:50:32 ip-10-2-22-82 podman[5526]: @ - - [05/Oct/2022:16:50:27 +0000] "POST /containers/04b5091d21ad06885e48915d9539997c1b58745a52e3178ce12995cf6b82f944/start HTTP/1.1" 204 0 "" ""
Oct 05 16:50:37 ip-10-2-22-82 podman[5526]: @ - - [05/Oct/2022:16:50:32 +0000] "GET /containers/04b5091d21ad06885e48915d9539997c1b58745a52e3178ce12995cf6b82f944/json HTTP/1.1" 200 9033 "" ""
Oct 05 17:00:32 ip-10-2-22-82 podman[5526]: @ - - [05/Oct/2022:16:50:22 +0000] "POST /containers/04b5091d21ad06885e48915d9539997c1b58745a52e3178ce12995cf6b82f944/attach?stream=true&stdout=true&stderr=true HTTP/1.1" 200 0 "" ""
Oct 05 17:00:32 ip-10-2-22-82 podman[5526]: 2022-10-05 17:00:32.984086144 +0000 UTC m=+1448.094378702 container died 04b5091d21ad06885e48915d9539997c1b58745a52e3178ce12995cf6b82f944 (image=XXXX.dkr.ecr.us-east-1.amazonaws.com/XXXX:0NkyJzxqJQWVq2aet, name=lucid_mclaren, health_status=)
Oct 05 17:00:52 ip-10-2-22-82 podman[5526]: @ - - [05/Oct/2022:16:50:32 +0000] "POST /containers/04b5091d21ad06885e48915d9539997c1b58745a52e3178ce12995cf6b82f944/wait HTTP/1.1" 200 30 "" ""
...

ps -ef | grep 04b5091d21ad06885e48915d9539997c1b58745a52e3178ce12995cf6b82f944

...
root       13829       1  0 16:50 ?        00:00:00 /usr/bin/conmon --api-version 1 -c 04b5091d21ad06885e48915d9539997c1b58745a52e3178ce12995cf6b82f944 -u 04b5091d21ad06885e48915d9539997c1b58745a52e3178ce12995cf6b82f944 -r /usr/bin/crun -b /var/lib/containers/storage/overlay-containers/04b5091d21ad06885e48915d9539997c1b58745a52e3178ce12995cf6b82f944/userdata -p /run/containers/storage/overlay-containers/04b5091d21ad06885e48915d9539997c1b58745a52e3178ce12995cf6b82f944/userdata/pidfile -n lucid_mclaren --exit-dir /run/libpod/exits --full-attach -s -l k8s-file:/var/lib/containers/storage/overlay-containers/04b5091d21ad06885e48915d9539997c1b58745a52e3178ce12995cf6b82f944/userdata/ctr.log --log-level info --log-size-max 10485760 --runtime-arg --log-format=json --runtime-arg --log --runtime-arg=/run/containers/storage/overlay-containers/04b5091d21ad06885e48915d9539997c1b58745a52e3178ce12995cf6b82f944/userdata/oci-log --conmon-pidfile /run/containers/storage/overlay-containers/04b5091d21ad06885e48915d9539997c1b58745a52e3178ce12995cf6b82f944/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/lib/containers/storage --exit-command-arg --runroot --exit-command-arg /run/containers/storage --exit-command-arg --log-level --exit-command-arg info --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/libpod --exit-command-arg --network-config-dir --exit-command-arg  --exit-command-arg --network-backend --exit-command-arg netavark --exit-command-arg --volumepath --exit-command-arg /var/lib/containers/storage/volumes --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --storage-opt --exit-command-arg overlay.mountopt=nodev,metacopy=on --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 04b5091d21ad06885e48915d9539997c1b58745a52e3178ce12995cf6b82f944
root       44396   13829  0 17:00 ?        00:00:00 /usr/bin/podman --root /var/lib/containers/storage --runroot /run/containers/storage --log-level info --cgroup-manager systemd --tmpdir /run/libpod --network-config-dir  --network-backend netavark --volumepath /var/lib/containers/storage/volumes --runtime crun --storage-driver overlay --storage-opt overlay.mountopt=nodev,metacopy=on --events-backend journald container cleanup 04b5091d21ad06885e48915d9539997c1b58745a52e3178ce12995cf6b82f944
...

lslocks | grep 44396

...
podman   5482 POSIX      WRITE  0     0   0 /var/lib/containers/storage/userns.lock
podman  44396 POSIX  64B WRITE* 0     0   0 /var/lib/containers/storage/storage.lock
podman   5482 POSIX  64B WRITE  0     0   0 /var/lib/containers/storage/storage.lock
podman   5482 POSIX  64B WRITE  0     0   0 /var/lib/containers/storage/overlay-layers/layers.lock
podman   5482 POSIX  64B WRITE  0     0   0 /var/lib/containers/storage/overlay-images/images.lock
...

ps -ef | grep 5482

root        5482       1 81 19:09 ?        00:17:37 /usr/bin/podman --log-level=info system service

strace podman ps

...
futex(0x56081b801868, FUTEX_WAIT_PRIVATE, 0, NULL

Output of podman version:

Client:       Podman Engine
Version:      4.2.0
API Version:  4.2.0
Go Version:   go1.18.4
Built:        Thu Aug 11 14:42:17 2022
OS/Arch:      linux/amd64

Output of podman info:

host:
  arch: amd64
  buildahVersion: 1.27.0
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - misc
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.0-2.fc36.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.0, commit: '
  cpuUtilization:
    idlePercent: 92.99
    systemPercent: 1.91
    userPercent: 5.09
  cpus: 16
  distribution:
    distribution: fedora
    variant: coreos
    version: "36"
  eventLogger: journald
  hostname: ip-10-2-22-82
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.18.18-200.fc36.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 123904901120
  memTotal: 133529747456
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.5-1.fc36.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.5
      commit: 54ebb8ca8bf7e6ddae2eb919f5b82d1d96863dea
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-0.2.beta.0.fc36.x86_64
    version: |-
      slirp4netns version 1.2.0-beta.0
      commit: 477db14a24ff1a3de3a705e51ca2c4c1fe3dda64
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 1h 25m 45.00s (Approximately 0.04 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /usr/share/containers/storage.conf
  containerStore:
    number: 34
    paused: 0
    running: 0
    stopped: 34
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 1098974756864
  graphRootUsed: 12981538816
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 11
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.2.0
  Built: 1660228937
  BuiltTime: Thu Aug 11 14:42:17 2022
  GitCommit: ""
  GoVersion: go1.18.4
  Os: linux
  OsArch: linux/amd64
  Version: 4.2.0

Package info (e.g. output of rpm -q podman or apt list podman or brew info podman):

podman-4.2.0-2.fc36.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.): AWS EC2 instance running Fedora CoreOS 36.20220820.3.0, with /var/lib/containers/storage/ mounted from an EBS volume.


Most upvoted comments

… and I’m pretty sure @mtrmac fixed it in https://github.com/containers/storage/pull/1387.

Confirmed: it’s a locking bug. In goroutine 57243, the getMaxSizeFromImage() function calls LayerStore() before iterating through the list of known layers, and LayerStore() attempts to lock the layers.json file while reading it. The problem is that getMaxSizeFromImage() is being called from getAutoUserNS(), which is being called from CreateContainer(), which is already holding a write lock on that very file.
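In other words, that is a plain self-deadlock on a non-reentrant lock. Below is a minimal, purely illustrative Go sketch of the shape of the problem, with a sync.Mutex standing in for the file lock; none of the names are the actual containers/storage code.

package main

import "sync"

// layersLock stands in for the lock protecting the layers file.
var layersLock sync.Mutex

// layerStore stands in for LayerStore(), which locks the layers file to read it.
func layerStore() {
	layersLock.Lock() // second acquisition on the same call path: blocks forever
	defer layersLock.Unlock()
}

// createContainer stands in for CreateContainer(), which already holds the lock
// by the time the getAutoUserNS() -> getMaxSizeFromImage() -> LayerStore() path runs.
func createContainer() {
	layersLock.Lock()
	defer layersLock.Unlock()
	layerStore()
}

func main() {
	createContainer() // hangs; the runtime reports "all goroutines are asleep - deadlock!"
}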

getMaxSizeFromImage trying to acquire an already-held layers.lock could happen if storage.lock triggered a reload.

In this log, in goroutine 57243, isn't the blocking lock 0xc00011ecd0 store.graphLock (= storage.lock), not layers.lock?

Either way, that getMaxSizeFromImage → LayerStore call is a problem, because it nests storage.lock inside layers.lock (held by CreateContainer), while the rest of store nests layers.lock inside storage.lock. In this case the other party could be goroutine 62699, which holds store.lock but blocks on layers.lock.
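The inverted ordering described here is the classic two-goroutine lock-order (ABBA) deadlock. Again a minimal, purely illustrative Go sketch, with two sync.Mutex values standing in for storage.lock and layers.lock; this is not the real containers/storage code.

package main

import "sync"

var (
	storageLock sync.Mutex // stands in for storage.lock / store.graphLock
	layersLock  sync.Mutex // stands in for overlay-layers/layers.lock
)

func main() {
	var wg sync.WaitGroup
	wg.Add(2)

	// Goroutine A: the ordering used by the rest of store --
	// storage.lock first, then layers.lock.
	go func() {
		defer wg.Done()
		storageLock.Lock()
		defer storageLock.Unlock()
		layersLock.Lock() // blocks if B already holds layersLock
		defer layersLock.Unlock()
	}()

	// Goroutine B: the CreateContainer path -- layers.lock is already held
	// when the getMaxSizeFromImage -> LayerStore call goes after storage.lock.
	go func() {
		defer wg.Done()
		layersLock.Lock()
		defer layersLock.Unlock()
		storageLock.Lock() // blocks if A already holds storageLock
		defer storageLock.Unlock()
	}()

	wg.Wait() // with unlucky timing, each goroutine waits forever for the lock the other holds
}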

It does seem that https://github.com/containers/storage/pull/1387 fixes this … without even noticing that it was broken. Oops. Does anyone have suggestions for https://github.com/containers/storage/issues/1389?

I’ll at least update https://github.com/containers/storage/pull/1438 to document the graphLock vs. C/I/L lock-ordering rules.