podman: podman-in-podman: Error: timed out waiting for file: internal libpod error

Fedora CI is failing with:

Error: timed out waiting for file /var/lib/containers/storage/overlay-containers/SHA1/userdata/SHA2/exit/SHA1: internal libpod error

Miro Vadkerti pinged me about it last week but has not yet filed an issue because it’s hard to reproduce. All I know is, this is podman-in-podman.

  2022-02-08 18:06:06 mvadkert    esm: it happens in a quite a lot of runs in CI, I will try to get a reproducer tmrw
  2022-02-08 18:06:13 mvadkert    esm: our host runs
  2022-02-08 18:07:39 mvadkert    (me looking it up)
  2022-02-08 18:09:02 mvadkert    podman-3.4.4-1.fc35.x86_64
  2022-02-08 18:09:22 esm mvadkert, which version of crun?
  2022-02-08 18:09:25 esm https://github.com/containers/podman/issues/12262
  2022-02-08 18:09:27 mvadkert    and in container we we have podman-3.4.1-3.module_el8.6.0+954+963caf36.x86_64
  2022-02-08 18:09:38 mvadkert    crun-1.4-1.fc35.x86_64
  2022-02-08 18:09:41 esm oh no, this is podman-in-podman??
  2022-02-08 18:09:47 mvadkert    esm: yeah
  2022-02-08 18:09:51 mvadkert    esm: crun-1.2-1.module_el8.6.0+954+963caf36.x86_64
  2022-02-08 18:09:57 mvadkert    fc35 is the main host
  2022-02-08 18:10:06 mvadkert    in it we rune centos8 image
  2022-02-08 18:10:09 mvadkert    with podman
  2022-02-08 18:10:24 mvadkert    image is based on podman stable, but it is build on centos 8 stream 
  2022-02-08 18:10:38 mvadkert    esm: it is rootless podman in privileged podman
  2022-02-08 18:10:44 mvadkert    esm: but the problem seems random
  2022-02-08 18:11:10 mvadkert    esm: or test tools runs podman start and then podman exec to run inside the container commands
  2022-02-08 18:11:17 mvadkert    esm: anwyay, I will try to find some reproducer
  2022-02-08 18:11:24 mvadkert    it is quite common lately :(
  2022-02-08 18:12:09 esm check conmon maybe?
  2022-02-08 18:12:28 mvadkert    aconmon-2.0.30-2.fc35.x86_64
  2022-02-08 18:12:36 ✭ esm doesn't even want to think about what time it is in cz
  2022-02-08 18:12:37 mvadkert    conmon-2.0.30-1.module_el8.6.0+944+d413f95e.x86_64
  2022-02-08 18:12:46 mvadkert    esm: yeah, I am not really here :)
  2022-02-08 18:12:48 esm oh, okay, that looks good
  2022-02-08 18:12:57 mvadkert    esm: will try to find some reproducer and I guess file it :)
  2022-02-08 18:13:10 mvadkert    esm: the erorrs are hidden a bit inside test failures
  2022-02-08 18:13:18 mvadkert    esm: so we did not spot it early enough

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 16 (5 by maintainers)

Most upvoted comments

I’ve managed to reproduce the issue consistently by writing a lengthy stream of data (think gigabytes) to stdout via dd if=/dev/random count=4294967296. Note that I’m not running a podman-in-podman setup either, but rather an Ubuntu Server VM on a Windows host via VMware.

dragas@dvm:~$ podman info
host:
  arch: amd64
  buildahVersion: 1.23.1
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: 'conmon: /usr/bin/conmon'
    path: /usr/bin/conmon
    version: 'conmon version 2.0.25, commit: unknown'
  cpus: 2
  distribution:
    codename: jammy
    distribution: ubuntu
    version: "22.04"
  eventLogger: journald
  hostname: dvm
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.15.0-53-generic
  linkmode: dynamic
  logDriver: journald
  memFree: 1426100224
  memTotal: 4078845952
  ociRuntime:
    name: crun
    package: 'crun: /usr/bin/crun'
    path: /usr/bin/crun
    version: |-
      crun version 0.17
      commit: 0e9229ae34caaebcb86f1fde18de3acaf18c6d9a
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: 'slirp4netns: /usr/bin/slirp4netns'
    version: |-
      slirp4netns version 1.0.1
      commit: 6a7b16babc95b6a3056b33fb45b74a6f62262dd4
      libslirp: 4.6.1
  swapFree: 3516313600
  swapTotal: 4080005120
  uptime: 24h 43m 52.59s (Approximately 1.00 days)
plugins:
  log:
  - k8s-file
  - none
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries: {}
store:
  configFile: /home/dragas/.config/containers/storage.conf
  containerStore:
    number: 5
    paused: 0
    running: 3
    stopped: 2
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/dragas/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 44
  runRoot: /run/user/1000/containers
  volumePath: /home/dragas/.local/share/containers/storage/volumes
version:
  APIVersion: 3.4.4
  Built: 0
  BuiltTime: Thu Jan  1 00:00:00 1970
  GitCommit: ""
  GoVersion: go1.17.3
  OsArch: linux/amd64
  Version: 3.4.4

I’ve noticed that if you don’t overwhelm the output, podman seems to deal with lengthy streams just fine.
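
For anyone trying to repeat this, it may help to see how much data the dd invocation in this comment actually asks for. dd writes 512-byte blocks unless bs= is given, so count=4294967296 requests 2 TiB rather than a few gigabytes. The sketch below only does that arithmetic. The helper name dd_total_bytes, the alpine image, the use of podman run, and the redirect to /dev/null are all assumptions for illustration and are not taken from the report.

```shell
# Sketch only: the image name and the use of "podman run" are assumptions,
# not details taken from the comment above.

# dd uses 512-byte blocks unless bs= is given, so the total size requested
# is count * block size.
dd_total_bytes() {
    # $1 = count, $2 = block size in bytes (optional, defaults to 512)
    echo $(( $1 * ${2:-512} ))
}

total=$(dd_total_bytes 4294967296)
echo "requested bytes: $total"

# Hypothetical reproducer, left commented out because it floods stdout.
# The redirect is an assumption; the comment above only says "to stdout".
# podman run --rm docker.io/library/alpine dd if=/dev/random count=4294967296 > /dev/null
```

If the failure really depends on overwhelming the output, a smaller count or an explicit bs= would be the obvious knobs to try when narrowing down how much data is needed to trigger it.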