podman: rootless podman with systemd doesn't work in jenkins

/kind bug

Description

We’re trying to get a container running under Jenkins in rootless mode and are failing miserably. The container needs to have systemd running, but systemd fails to start properly for unknown reasons. This is a simple test build that fails (based on #8965):

podman run --detach --rm --privileged --name test --systemd=true ubi8/ubi-init /sbin/init
podman exec test systemctl status || true
podman stop test

Describe the results you received:

+ podman run --detach --rm --privileged --name test --systemd=true ubi8/ubi-init /sbin/init
e81859143131d2a96121ed521b42dcb8363b85a5885226f7cb58d80f356bb7e2
+ podman exec test systemctl status
Failed to connect to bus: No such file or directory
+ true
+ podman stop test
e81859143131d2a96121ed521b42dcb8363b85a5885226f7cb58d80f356bb7e2

Describe the results you expected:

+ podman run --detach --rm --privileged --name test --systemd=true ubi8/ubi-init /sbin/init
e81859143131d2a96121ed521b42dcb8363b85a5885226f7cb58d80f356bb7e2
+ podman exec test systemctl status
● 180db4886d56
    State: degraded
     Jobs: 0 queued
   Failed: 1 units
    Since: Wed 2021-02-17 12:44:51 UTC; 9s ago
   CGroup: /
           ├─init.scope
           │ ├─ 1 /sbin/init
           │ └─23 systemctl status
           └─system.slice
             ├─systemd-journald.service
             │ └─10 /usr/lib/systemd/systemd-journald
             └─dbus.service
               └─21 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
+ true
+ podman stop test
e81859143131d2a96121ed521b42dcb8363b85a5885226f7cb58d80f356bb7e2

Additional information you deem important (e.g. issue happens only occasionally):

This is on a RHEL 8 system. It’s been reconfigured to use cgroups v2 and we’ve made sure that subuid/subgid are configured for the jenkins user. Jenkins is also configured to run as unconfined_t to avoid SELinux issues.

Running those same commands as root works fine, as does running them as a different, non-root user. In both cases we were logged in over ssh.
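Since the same commands work over ssh but not from a Jenkins job, one thing that may be worth comparing is the login-session environment in both places. The sketch below is only a guess at what to look at, not a confirmed diagnosis; `session_report` is a made-up helper name, and the variables it prints are the ones a systemd user session normally sets up.

```shell
#!/bin/sh
# session_report: print session-related settings that commonly differ between
# an interactive ssh login and a job started by a service such as Jenkins.
# Hypothetical helper for illustration; it is not part of podman.
session_report() {
    uid=$(id -u)
    echo "uid=$uid"
    # Both variables are usually exported by a login session; a service
    # started outside such a session may not have them.
    echo "XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR:-<unset>}"
    echo "DBUS_SESSION_BUS_ADDRESS=${DBUS_SESSION_BUS_ADDRESS:-<unset>}"
    # /run/user/<uid> normally exists only while the user has a session
    # (or lingering is enabled for that user).
    if [ -d "/run/user/$uid" ]; then
        echo "runtime_dir_exists=yes"
    else
        echo "runtime_dir_exists=no"
    fi
}
```

Running `session_report` once in an ssh shell and once as a Jenkins build step, then diffing the two outputs, should show whether the Jenkins job is missing a user session.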

Using sudo podman in Jenkins solves the issue, but we’d rather avoid that if we don’t need root privileges.

#7417 might be related, but I don’t understand the details enough to say for sure.

Output of podman version:

Version:      2.2.1
API Version:  2
Go Version:   go1.14.7
Built:        Mon Feb  8 22:19:06 2021
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.18.0
  cgroupManager: cgroupfs
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.22-3.module+el8.3.1+9857+68fb1526.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.22, commit: a40e3092dbe499ea1d85ab339caea023b74829b9'
  cpus: 8
  distribution:
    distribution: '"rhel"'
    version: "8.3"
  eventLogger: file
  hostname: build.lkpg.cendio.se
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 980
      size: 1
    - container_id: 1
      host_id: 10000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 984
      size: 1
    - container_id: 1
      host_id: 10000
      size: 65536
  kernel: 4.18.0-240.15.1.el8_3.x86_64
  linkmode: dynamic
  memFree: 3942256640
  memTotal: 8145571840
  ociRuntime:
    name: crun
    package: crun-0.16-2.module+el8.3.1+9857+68fb1526.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.16
      commit: eb0145e5ad4d8207e84a327248af76663d4e50dd
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    path: /tmp/podman-run-984/podman/podman.sock
  rootless: true
  slirp4netns:
    executable: /bin/slirp4netns
    package: slirp4netns-1.1.8-1.module+el8.3.1+9857+68fb1526.x86_64
    version: |-
      slirp4netns version 1.1.8
      commit: d361001f495417b880f20329121e3aa431a8f90f
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.4.3
  swapFree: 0
  swapTotal: 0
  uptime: 1h 10m 15.64s (Approximately 0.04 days)
registries:
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: /var/lib/jenkins/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /bin/fuse-overlayfs
      Package: fuse-overlayfs-1.3.0-2.module+el8.3.1+9857+68fb1526.x86_64
      Version: |-
        fusermount3 version: 3.2.1
        fuse-overlayfs: version 1.3
        FUSE library version 3.2.1
        using FUSE kernel interface version 7.26
  graphRoot: /var/lib/jenkins/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 8
  runRoot: /tmp/run-984/containers
  volumePath: /var/lib/jenkins/.local/share/containers/storage/volumes
version:
  APIVersion: "2"
  Built: 1612819146
  BuiltTime: Mon Feb  8 22:19:06 2021
  GitCommit: ""
  GoVersion: go1.14.7
  OsArch: linux/amd64
  Version: 2.2.1
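For comparing the working (ssh) and failing (Jenkins) environments, it can help to pull a few fields out of the `podman info` output instead of eyeballing the whole thing. This is a minimal sketch: `info_field` is a hypothetical helper that does plain text matching on `key: value` lines and is not a real YAML parser, and which fields matter (here `cgroupManager`, `cgroupVersion`, `runRoot`) is an assumption on my part.

```shell
#!/bin/sh
# info_field KEY: read `podman info` text on stdin and print the value of the
# first line of the form "<indent>KEY: value". Hypothetical helper; simple
# text matching only, nested or multi-line values are not handled.
info_field() {
    awk -v key="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)            # drop indentation
            prefix = key ":"
            if (index(line, prefix) == 1) {     # line starts with "KEY:"
                value = substr(line, length(prefix) + 1)
                sub(/^[ \t]+/, "", value)       # drop space after the colon
                print value
                exit
            }
        }
    '
}
```

Usage would be something like `podman info --debug | info_field cgroupManager` in each environment. For the output pasted above, that field is `cgroupfs`.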

Package info (e.g. output of rpm -q podman or apt list podman):

podman-2.2.1-7.module+el8.3.1+9857+68fb1526.x86_64
crun-0.16-2.module+el8.3.1+9857+68fb1526.x86_64
runc-1.0.0-70.rc92.module+el8.3.1+9857+68fb1526.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

We have not tested with a newer version, no. The troubleshooting guide didn’t seem to have anything relevant to our case.

Additional environment details (AWS, VirtualBox, physical, etc.):

Virtual machine on a vSphere environment.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 29 (12 by maintainers)

Most upvoted comments

Once you go cgroups V2, you never go back.

Great, thanks!

So with that I seem to have the workarounds needed. To summarize, for those finding this issue later, what you need to do is:

  1. sudo grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=1"
  2. reboot
  3. sudo systemctl enable-linger jenkins
  4. configure pids_limit=0 under [containers] in /etc/containers/containers.conf (until https://bugzilla.redhat.com/show_bug.cgi?id=1897579 is fixed)
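To check after the reboot that the steps above actually took effect, something like the following could be run on the build host. It is a rough sketch, not an official check: the function names are made up, `stat -fc %T` assumes GNU coreutils, and the `containers.conf` lookup is a naive text scan that ignores trailing comments and quoting.

```shell
#!/bin/sh
# cgroup_mode FSTYPE: map the filesystem type of /sys/fs/cgroup to a label.
# cgroup2fs means the unified (v2) hierarchy; tmpfs is what v1/hybrid shows.
cgroup_mode() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1" ;;
        *)         echo "unknown" ;;
    esac
}

# pids_limit_from FILE: print the pids_limit value set under the
# [containers] table in FILE, or nothing if it is not set there.
pids_limit_from() {
    awk '
        /^[ \t]*\[/ { in_containers = ($0 ~ /^[ \t]*\[containers\]/); next }
        in_containers && /^[ \t]*pids_limit[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")    # keep only the text after "="
            print
            exit
        }
    ' "$1"
}

# check_workarounds [USER]: report on each workaround for USER
# (defaults to jenkins). Empty values mean the lookup failed.
check_workarounds() {
    user=${1:-jenkins}
    echo "cgroups:    $(cgroup_mode "$(stat -fc %T /sys/fs/cgroup)")"
    echo "linger:     $(loginctl show-user "$user" --property=Linger 2>/dev/null)"
    echo "pids_limit: $(pids_limit_from /etc/containers/containers.conf)"
}
```

With all of the steps applied, `check_workarounds jenkins` should report cgroups `v2`, `Linger=yes` and a `pids_limit` of `0`.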