operator: Tasks that Run Podman (and possibly other container tools) Fail

Expected Behavior

I’d expect Tasks that run podman pods (and podman commands in said pods) to run without issue. This used to work prior to the below PR:

Actual Behavior

Any podman command that is executed fails, causing the task to fail:

+ podman pull image-registry.openshift-image-registry.svc:5000/oco/certified-operator-index:v4.12
time="2023-01-20T17:35:09Z" level=warning msg="\"/\" is not a shared mount, this could cause issues or missing mounts with rootless containers"
time="2023-01-20T17:35:09Z" level=error msg="running `/usr/bin/newuidmap 35 0 1000 1 1 10000 5000`: newuidmap: write to uid_map failed: Operation not permitted\n"
Error: cannot set up namespace using "/usr/bin/newuidmap": exit status 1

Steps to Reproduce the Problem

  1. Install the latest version of the Pipelines Operator on OpenShift (I see this with version 1.9.0 on 4.12, but it could occur in other versions).
  2. Create a Tekton Task that uses podman pods.
  3. Try to run podman info (or any other podman command) and see it fail.
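
A minimal Task along these lines can reproduce the failure. This is a hedged sketch: the Task name, image, and step details are illustrative assumptions, not taken from the original report.

```yaml
# Hypothetical minimal reproducer; name and image are assumptions.
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: podman-info
spec:
  steps:
    - name: podman-info
      # Any image that ships podman should do; quay.io/podman/stable is one option.
      image: quay.io/podman/stable
      script: |
        #!/usr/bin/env bash
        # This is where the newuidmap error surfaces under the default SCC.
        podman info
```

Running this as a TaskRun with the default pipeline service account should surface the newuidmap error shown above.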

Additional Info

  • Kubernetes version:

    Client Version: 4.12.0
    Kustomize Version: v4.5.7
    Server Version: 4.12.0
    Kubernetes Version: v1.25.4+77bec7a
    
    
  • Tekton Pipeline version:

    {"packageName":"openshift-pipelines-operator-rh","version":"1.9.0"}
    

This is breaking pipelines/tasks that are part of the Red Hat certification process when run on a partner’s own cluster, so we’d like this to be merged/released with priority.

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 15 (7 by maintainers)

Most upvoted comments

@pbaity That is in fact the same issue/error.

The custom scc would look like this:

oc apply -f - <<'EOF'
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    kubernetes.io/description: my-custom-scc is a close replica of pipeline scc with privilege escalation.
    release.openshift.io/create-only: "true"
  name: my-custom-scc
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegeEscalation: true
allowPrivilegedContainer: true
allowedCapabilities:
  - SETFCAP
defaultAddCapabilities: null
fsGroup:
  type: MustRunAs
groups:
  - system:cluster-admins
priority: 10
readOnlyRootFilesystem: false
requiredDropCapabilities:
  - MKNOD
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: MustRunAs
supplementalGroups:
  type: RunAsAny
volumes:
  - configMap
  - downwardAPI
  - emptyDir
  - persistentVolumeClaim
  - projected
  - secret
EOF

You could then attach it to your SA that runs the pipeline by doing this:

oc adm policy add-scc-to-user my-custom-scc -z the-service-account-that-runs-the-pipeline

This would be the quickest way to accomplish this, there are other ways as well. Hopefully this works for you.
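
To confirm the custom SCC is actually being used, you can inspect the admitted TaskRun pod; OpenShift records the applied SCC in the openshift.io/scc annotation (the pod name below is a placeholder):

```shell
# Check which SCC was applied to the TaskRun pod
# (replace <taskrun-pod-name> with your actual pod name).
oc get pod <taskrun-pod-name> \
  -o jsonpath='{.metadata.annotations.openshift\.io/scc}'
```

If this prints my-custom-scc, the service account binding took effect; if it still prints pipelines-scc or restricted-v2, re-check the add-scc-to-user step and the SCC priority.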

In 4.12, workloads get the restricted-v2 SCC by default, which has allowPrivilegeEscalation: false; this causes the same issue as the pipeline SCC that this Tekton operator provides and uses to run tasks. To me, this would still not work unless the opened PR is accepted (which the maintainers said they won’t, which is fine), and the only option is that the user of Tekton creates a custom SA and SCC to run their workloads.

Yes, but 4.12 also ships with something called userns, where you can actually run something with restricted-v2 but still be root in the container (and a random uid on the node where that container runs). This works today, but it had a bug where we would appear as root yet couldn’t do “root” things inside (like dnf install …), and thus you wouldn’t be able to run podman either. The most recent nightlies have a fix for this.
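
For reference, on clusters where CRI-O user namespaces are enabled, a pod can request an automatic user namespace via an annotation. This is a hedged sketch: whether the annotation is honored depends on the cluster’s CRI-O/runtime configuration, and the pod name, image, and size value here are assumptions.

```yaml
# Sketch only: requires CRI-O user-namespace support to be enabled
# on the cluster; annotation value and image are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: userns-demo
  annotations:
    io.kubernetes.cri-o.userns-mode: "auto:size=65536"
spec:
  containers:
    - name: podman
      image: quay.io/podman/stable
      command: ["sleep", "infinity"]
```

Inside such a pod the process appears as root while mapping to an unprivileged uid on the node, which is what makes rootless podman viable under restricted-v2.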

But yes, in general, the only option is that the user of Tekton creates a custom SA and SCC to run their workloads – as we do not want to enable “privilege escalation” when the operator is installed, it’s the user’s (or cluster admin’s) responsibility to do that explicitly. TEP-0085 would help make this “bearable” (aka less verbose) for users 👼🏼

Look into the latest 4.12 and see if the recent fixes help us cover ~90% of use cases with the default SA and restricted SCC.

I am preparing something on this (a doc/blog/…)

@acornett21 Thanks! That worked for me, and saved a lot of time.

I said ‘shouldn’t’, not ‘can’t’; there are cases where root is the only option, but this isn’t something Tekton should dictate IMO. Also, all of the documentation for pipelines-scc points to it being a copy of the anyuid SCC, which also has RunAsAny.

@ArthurVardevanyan Your first example is more secure; you shouldn’t run containers as the root user, especially for applications running on production clusters.