podman: [Bug]: OCI permission denied with rootful podman on OpenShift

Issue Description

When following the guide for running Podman on Kubernetes, I run into the following error when trying to run containers:

Error: crun: mount proc to /proc: Permission denied: OCI permission denied

It works if I run this in a kata container via OpenShift Sandboxed Containers.

Steps to reproduce the issue

  1. Set up a new OpenShift dev cluster with CodeReady Containers
  2. Deploy the "Rootful Podman without the privileged flag" example from https://www.redhat.com/sysadmin/podman-inside-kubernetes
  3. Try to run any image with Podman in the no-priv-rootful container

Describe the results you received

Podman pulls the image but fails to run it with the following error:

Error: crun: mount proc to /proc: Permission denied: OCI permission denied

Describe the results you expected

The image should run.

podman info output

host:
  arch: amd64
  buildahVersion: 1.28.0
  cgroupControllers:
  - cpuset
  - cpu
  - cpuacct
  - blkio
  - memory
  - devices
  - freezer
  - net_cls
  - perf_event
  - net_prio
  - hugetlb
  - pids
  - rdma
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.1.5-1.fc37.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.5, commit: '
  cpuUtilization:
    idlePercent: 84.46
    systemPercent: 4.93
    userPercent: 10.62
  cpus: 6
  distribution:
    distribution: fedora
    variant: container
    version: "37"
  eventLogger: file
  hostname: podman-no-priv-rootful
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 4.18.0-372.32.1.el8_6.x86_64
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 265007104
  memTotal: 16797020160
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.7.2-3.fc37.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.7.2
      commit: 0356bf4aff9a133d655dc13b1d9ac9424706cac4
      rundir: /run/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_AUDIT_WRITE,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_MKNOD,CAP_NET_BIND_SERVICE,CAP_NET_RAW,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-8.fc37.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 0h 42m 55.00s
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.imagestore: /var/lib/shared
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.9-6.fc37.x86_64
      Version: |-
        fusermount3 version: 3.10.5
        fuse-overlayfs: version 1.9
        FUSE library version 3.10.5
        using FUSE kernel interface version 7.31
    overlay.mountopt: nodev,fsync=0
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 42401247232
  graphRootUsed: 26759946240
  graphStatus:
    Backing Filesystem: overlayfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 2
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.3.1
  Built: 1668178887
  BuiltTime: Fri Nov 11 15:01:27 2022
  GitCommit: ""
  GoVersion: go1.19.2
  Os: linux
  OsArch: linux/amd64
  Version: 4.3.1

Podman in a container

Yes

Privileged Or Rootless

None

Upstream Latest Release

Yes

Additional environment details

No response

Additional information

No response

About this issue

  • State: closed
  • Created a year ago
  • Comments: 26 (11 by maintainers)

Most upvoted comments

Sorry for the late answer, but you can set the type in the seLinuxOptions field in the container’s security context:

---
apiVersion: v1
kind: Pod
metadata:
  name: podman
spec:
  containers:
    - name: podman
      image: quay.io/podman/stable
      imagePullPolicy: Always
      command: ["sleep", "infinity"]
      securityContext:
        capabilities:
          drop:
            - ALL
          add:
            - SYS_CHROOT
            - SETFCAP
            - SETUID
            - SETGID
        seLinuxOptions:
          type: spc_t

You will also need an SCC that allows that type.
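As a rough sketch of what such an SCC might look like (this is not taken from the thread): the name `podman-spc`, the namespace, and the service account are hypothetical placeholders, and the volume list is an assumption. `runAsUser: RunAsAny` assumes the pod keeps running as root, as in the rootful example, and the allowed capabilities mirror the ones the pod above adds.

```yaml
---
# Hypothetical SCC sketch; adjust names and permissions to your cluster.
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: podman-spc              # hypothetical name
allowPrivilegedContainer: false
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
readOnlyRootFilesystem: false
priority: null
defaultAddCapabilities: null
requiredDropCapabilities: null
allowedCapabilities:            # the capabilities the pod above adds
  - SYS_CHROOT
  - SETFCAP
  - SETUID
  - SETGID
runAsUser:
  type: RunAsAny                # assumption: the pod runs as root
seLinuxContext:
  type: MustRunAs
  seLinuxOptions:
    type: spc_t                 # the type requested in the pod spec
fsGroup:
  type: RunAsAny
supplementalGroups:
  type: RunAsAny
volumes:                        # assumption: a minimal set of volume types
  - configMap
  - emptyDir
  - projected
  - secret
users:
  # hypothetical service account that the pod runs under
  - system:serviceaccount:my-namespace:my-serviceaccount
```

Note that spc_t is the unconfined "super privileged container" type, so this effectively switches off SELinux confinement for the pod; a custom type is the tighter option.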

We’ve reached about the same conclusion, though we plan on using the Security Profile Operator to create custom SELinux policies for our pods.

Oh, didn’t know about that one, thanks.

Until we get that set up we’re running our pods with the spc_t label.

How do you specify that the container should use the spc_t label? I can’t really find it anywhere. I noticed that there is this option in the SCC spec:

seLinuxContext:
  seLinuxOptions:
    type: xx

which would allow us to set a specific label on any container that runs under that SCC. That’s great, because it means we can create SELinux rules specific to our own container label, and that should be fine, I think.

Still not sure why the anyuid SCC is allowed to execute “podman run” inside the container, but without it I get the following error:

Error: copying system image from manifest list: writing blob: adding layer with blob "sha256:245acfe18b55af92043342552833df42827c0449360e14733fde702121a03583": processing tar file(creating mount namespace before pivot: function not implemented): exit status 1

Nonetheless, this is great progress, I think.

@patchon We seem to be in about the same boat.

We’ve reached about the same conclusion, though we plan on using the Security Profile Operator to create custom SELinux policies for our pods.

Until we get that set up we’re running our pods with the spc_t label.