podman: [rootless] `podman ps` does not show running containers even though container processes are still running


Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

Steps to reproduce the issue: systemd-based containers in rootless mode stop being displayed by podman ps after some hours/days.

Describe the results you received: podman ps does not display the running containers.

Describe the results you expected: podman ps should display the running containers.

Additional information you deem important (e.g. issue happens only occasionally): unable to reproduce further, as podman start on the same container fails with the error below: Error: unable to start container "containerName": unable to find user postgres: no matching entries in passwd file

Output of podman version:

Version:            1.6.4
RemoteAPI Version:  1
Go Version:         go1.12.12
OS/Arch:            linux/amd64

Output of podman info --debug:

debug:
  compiler: gc
  git commit: ""
  go version: go1.12.12
  podman version: 1.6.4
host:
  BuildahVersion: 1.12.0-dev
  CgroupVersion: v1
  Conmon:
    package: conmon-2.0.6-1.module+el8.1.1+5259+bcdd613a.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.6, commit: 6ffbb2ec70dbe5ba56e4bfde946fb04f19dd8bbf'
  Distribution:
    distribution: '"rhel"'
    version: "8.1"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 600000
      size: 1
    - container_id: 1
      host_id: 666666
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 600000
      size: 1
    - container_id: 1
      host_id: 666666
      size: 65536
  MemFree: 2170466304
  MemTotal: 8192004096
  OCIRuntime:
    name: runc
    package: runc-1.0.0-64.rc9.module+el8.1.1+5259+bcdd613a.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 2097147904
  SwapTotal: 2097147904
  arch: amd64
  cpus: 4
  eventlogger: file
  hostname: hostname
  kernel: 4.18.0-147.5.1.el8_1.x86_64
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: slirp4netns-0.4.2-2.git21fdece.module+el8.1.1+5460+3ac089c3.x86_64
    Version: |-
      slirp4netns version 0.4.2+dev
      commit: 21fdece2737dc24ffa3f01a341b8a6854f8b13b4
  uptime: 308h 46m 53.3s (Approximately 12.83 days)
registries:
  blocked:
  - all
  insecure: null
  search:
  - path1
  - path2
store:
  ConfigFile: /home/podman/.config/containers/storage.conf
  ContainerStore:
    number: 3
  GraphDriverName: overlay
  GraphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-0.7.2-1.module+el8.1.1+5259+bcdd613a.x86_64
      Version: |-
        fuse-overlayfs: version 0.7.2
        FUSE library version 3.2.1
        using FUSE kernel interface version 7.26
  GraphRoot: /home/podman/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 5
  RunRoot: /tmp/run-600000
  VolumePath: /home/podman/.local/share/containers/storage/volumes


Package info (e.g. output of rpm -q podman or apt list podman):

podman-1.6.4-2.module+el8.1.1+5363+bf8ff1af.x86_64

Additional environment details (AWS, VirtualBox, physical, etc.): VMware-based VM

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 32 (17 by maintainers)

Most upvoted comments

By default, files older than 10 days are periodically deleted from /tmp (as configured in /usr/lib/tmpfiles.d/tmp.conf).

/tmp should really be used only for short-lived sessions.
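If that cleanup is the cause, one workaround (a sketch, assuming the RunRoot stays at /tmp/run-600000 as shown in the podman info output above) is a tmpfiles.d drop-in that excludes that path from age-based cleaning; the drop-in file name below is hypothetical:

# /etc/tmpfiles.d/podman-runroot.conf (hypothetical name)
# An 'x' line tells systemd-tmpfiles to skip this path and its contents
# during age-based cleanup
x /tmp/run-600000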

/tmp/run-600000/libpod/tmp/alive being suddenly removed (by systemd or otherwise) sounds like a very good candidate here. That would make Libpod assume the system restarted and perform a full state refresh.

Another possibility is that systemd automatically cleans up files under /tmp and we lose track of the containers.

https://www.freedesktop.org/software/systemd/man/systemd-tmpfiles.html
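A quick way to check this hypothesis (a sketch; the alive path comes from the comment above, and the stock age rule lives in /usr/lib/tmpfiles.d/tmp.conf):

# Does libpod's "alive" sentinel still exist, and when was it last touched?
stat /tmp/run-600000/libpod/tmp/alive

# Which cleanup rules apply to /tmp on this host?
grep -h /tmp /usr/lib/tmpfiles.d/tmp.conf /etc/tmpfiles.d/*.conf 2>/dev/null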

Can you show the content of /tmp/run-600000 when the container disappears?
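For example (a sketch), a recursive listing captured before and after the container disappears would show what was removed:

find /tmp/run-600000 -ls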

Providing the systemd unit used to launch the container would help greatly.
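For reference, units in the style that podman generate systemd produced around the 1.6.x releases look roughly like this (a sketch; containerName and the PIDFile path are placeholders, not taken from this report):

[Unit]
Description=containerName Podman Container

[Service]
Type=forking
Restart=on-failure
# Start/stop an existing container by name
ExecStart=/usr/bin/podman start containerName
ExecStop=/usr/bin/podman stop -t 10 containerName
# Track the conmon monitor process rather than podman itself
PIDFile=/path/to/conmon.pid
KillMode=none

[Install]
WantedBy=default.target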

Also, can you try a podman ps --all --sync when this happens and see if the container registers as running?
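The --sync flag forces podman to re-check each container's state with the OCI runtime instead of trusting its own database:

podman ps --all --sync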