podman: User Podman Services (podman.service/podman.socket) fail within 24 hrs
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
User podman services (`podman.socket` and `podman.service`) fail within 24 hours of a system reboot. While the user's podman containers continue to run, `systemctl` reports both units as failed.
Output from the `podman.service` journal:

```
Jun 07 22:50:27 local.lan systemd[1234]: Failed to start Podman API Service.
Jun 07 22:50:27 local.lan systemd[1234]: podman.service: Failed to allocate exec_fd pipe: Too many open files
Jun 07 22:50:27 local.lan systemd[1234]: podman.service: Failed to run 'start' task: Too many open files
Jun 07 22:50:27 local.lan systemd[1234]: podman.service: Failed with result 'resources'.
```
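The "Too many open files" lines above are `EMFILE` errors: the user's systemd instance has run out of file descriptors, so it cannot even allocate the pipe it needs to start the service. As an illustrative sketch (not podman-specific), any process that exhausts its `RLIMIT_NOFILE` soft limit hits exactly this error string:

```python
import os
import resource

# Lower the soft limit on open files so the failure is easy to trigger.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (64, hard))

fds, err = [], None
try:
    while True:
        fds.append(os.open("/dev/null", os.O_RDONLY))  # each open consumes one fd
except OSError as e:
    err = e.strerror  # EMFILE -> "Too many open files"
finally:
    for fd in fds:
        os.close(fd)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))  # restore the limit

print(err)  # prints: Too many open files
```

This is why systemd reports the unit as failed with result `'resources'` rather than a normal exit code: the descriptors are being leaked somewhere in the manager's scope, and after ~24 hours of socket activity the limit is exhausted.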
Output from the `podman.socket` journal:

```
Jun 07 22:50:35 local.lan systemd[1234]: Listening on Podman API Socket.
Jun 07 22:50:36 local.lan systemd[1234]: podman.socket: Trigger limit hit, refusing further activation.
Jun 07 22:50:36 local.lan systemd[1234]: podman.socket: Failed with result 'trigger-limit-hit'.
```
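The `trigger-limit-hit` failure comes from systemd's socket-activation rate limit (by default, roughly 200 activations within 2 seconds), which trips when the service behind the socket keeps failing and the socket is re-triggered in a tight loop. As a stopgap only (it does not address the underlying descriptor exhaustion), the limit can be raised with a user drop-in, sketched below; the path and values are illustrative:

```ini
# ~/.config/systemd/user/podman.socket.d/override.conf (illustrative path/values)
[Socket]
# Raise the activation rate limit; defaults are about 200 triggers per 2 s.
TriggerLimitIntervalSec=10s
TriggerLimitBurst=1000
```

Run `systemctl --user daemon-reload` after adding a drop-in for it to take effect.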
Both these issues look similar to previously closed issues (https://github.com/containers/podman/issues/6093 and https://github.com/containers/podman/issues/5150) but (unless I’m reading them wrong) fixes for those issues should have been merged a while ago.
Steps to reproduce the issue:

1. Generate a rootless container (I started `docker.io/thelounge/thelounge:latest`) and create a corresponding user systemd unit.
2. Allow it to run for 24 hours.
3. Run `systemctl --user status`; the system will show as degraded. If `systemctl list-units --failed` is run, both `podman.socket` and `podman.service` show as failed.
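A user unit of roughly this shape can be produced for step 1 with `podman generate systemd --new --name thelounge`; the fragment below is a simplified, illustrative sketch (unit name and directives abbreviated), not the exact generated output:

```ini
# ~/.config/systemd/user/container-thelounge.service (illustrative sketch)
[Unit]
Description=Podman container-thelounge.service
Wants=network-online.target
After=network-online.target

[Service]
Restart=on-failure
ExecStart=/usr/bin/podman run --rm --replace -d --name thelounge docker.io/thelounge/thelounge:latest
ExecStop=/usr/bin/podman stop --ignore thelounge
Type=notify

[Install]
WantedBy=default.target
```

After placing the unit, `systemctl --user daemon-reload` followed by `systemctl --user enable --now container-thelounge.service` starts it.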
**Describe the results you received:** The podman systemd units failed.

**Describe the results you expected:** The podman services continue working normally.

**Additional information you deem important (e.g. issue happens only occasionally):** Both units appear to be online and working at system start.
**Output of `podman version`:**

```
podman version 3.1.0-dev
```
**Output of `podman info --debug`:**

```yaml
host:
  arch: amd64
  buildahVersion: 1.19.8
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.27-1.module_el8.5.0+733+9bb5dffa.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.27, commit: dc08a6edf03cc2dadfe803eac14b896b44cc4721'
  cpus: 4
  distribution:
    distribution: '"centos"'
    version: "8"
  eventLogger: file
  hostname: local.lan
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 4.18.0-305.3.1.el8.x86_64
  linkmode: dynamic
  memFree: 13275705344
  memTotal: 16480956416
  ociRuntime:
    name: runc
    package: runc-1.0.0-70.rc92.module_el8.5.0+733+9bb5dffa.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.2-dev'
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_NET_RAW,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    selinuxEnabled: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.8-1.module_el8.5.0+733+9bb5dffa.x86_64
    version: |-
      slirp4netns version 1.1.8
      commit: d361001f495417b880f20329121e3aa431a8f90f
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.1
  swapFree: 3670011904
  swapTotal: 3670011904
  uptime: 30h 19m 58.74s (Approximately 1.25 days)
registries:
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: /home/USERNAME/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 1
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.5.0-1.module_el8.5.0+733+9bb5dffa.x86_64
      Version: |-
        fusermount3 version: 3.2.1
        fuse-overlayfs: version 1.5
        FUSE library version 3.2.1
        using FUSE kernel interface version 7.26
  graphRoot: /home/USERNAME/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 1
  runRoot: /run/user/1000/containers
  volumePath: /home/USERNAME/.local/share/containers/storage/volumes
version:
  APIVersion: 3.1.0-dev
  Built: 1616783523
  BuiltTime: Fri Mar 26 11:32:03 2021
  GitCommit: ""
  GoVersion: go1.16.1
  OsArch: linux/amd64
  Version: 3.1.0-dev
```
**Package info (e.g. output of `rpm -q podman` or `apt list podman`):**

```
podman-3.1.0-0.13.module_el8.5.0+733+9bb5dffa.x86_64
```
**Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?** I checked the troubleshooting guide. While this is not the latest version, it looks like these issues were fixed in Podman 1.9.
**Additional environment details (AWS, VirtualBox, physical, etc.):** Physical system running CentOS Stream 8.
About this issue
- State: closed
- Created 3 years ago
- Comments: 33 (15 by maintainers)
Ah - I believe @jwhonce is working on FD leaks right now
Since podman 3.4 is released, we believe this is now fixed.
@jwhonce Yep - that suppresses the errors - thank you. Not sure if the errors (or the number of files that are open in the container) are relevant for the service failures, but figured it was worth including.