sysbox: `procfd: operation not permitted` when running a Pod with `sysbox-runc`
I’ve installed Sysbox on an AKS cluster following the instructions for the Sysbox daemonset (here). The error I am seeing is:
create failed: time="2023-01-02T12:28:18Z" level=error msg="container_linux.go:425: starting container process caused: process_linux.go:607: container init caused: rootfs_linux.go:66: setting up rootfs mounts caused: rootfs_linux.go:1156: mounting \"sysfs\" to rootfs \"/var/lib/sysbox/shiftfs/f24948fc-9f27-43bd-8d8f-56947b850b7a\" at \"/sys\" caused: mount through procfd: operation not permitted"
The system info for the node is:
System Info:
  Machine ID: 20a5246312f9429094874ca4e41dbb97
  System UUID: 91c263d5-db43-0946-aa45-e560c34470ac
  Boot ID: 72d05dd9-343d-4743-852c-59476bb8da42
  Kernel Version: 5.4.0-1085-azure
  OS Image: Ubuntu 18.04.6 LTS
  Operating System: linux
  Architecture: amd64
  Container Runtime Version: cri-o://1.22.4
  Kubelet Version: v1.22.11
  Kube-Proxy Version: v1.22.11
Here’s the Pod state:
Pod
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2023-01-02T12:22:13Z"
  name: coder-niklasrosenstein-sysbox-test
  namespace: coder
  resourceVersion: "66569083"
  uid: 45132ce5-4b13-4766-9316-5ea47baf5eb5
spec:
  automountServiceAccountToken: true
  containers:
  - command:
    - sh
    - -c
    - "#!/usr/bin/env sh\nset -eux\n# Sleep for a good long while before exiting.\n#
      This is to allow folks to exec into a failed workspace and poke around to\n#
      troubleshoot.\nwaitonexit() {\n\techo \"=== Agent script exited with non-zero
      code. Sleeping 24h to preserve logs...\"\n\tsleep 86400\n}\ntrap waitonexit
      EXIT\nBINARY_DIR=$(mktemp -d -t coder.XXXXXX)\nBINARY_NAME=coder\nBINARY_URL=https://coder-dev.helsing-dev.ai/bin/coder-linux-amd64\ncd
      \"$BINARY_DIR\"\n# Attempt to download the coder agent.\n# This could fail for
      a number of reasons, many of which are likely transient.\n# So just keep trying!\nwhile
      :; do\n\t# Try a number of different download tools, as we don not know what
      we\n\t# will have available.\n\tstatus=\"\"\n\tif command -v curl >/dev/null
      2>&1; then\n\t\tcurl -fsSL --compressed \"${BINARY_URL}\" -o \"${BINARY_NAME}\"
      && break\n\t\tstatus=$?\n\telif command -v wget >/dev/null 2>&1; then\n\t\twget
      -q \"${BINARY_URL}\" -O \"${BINARY_NAME}\" && break\n\t\tstatus=$?\n\telif command
      -v busybox >/dev/null 2>&1; then\n\t\tbusybox wget -q \"${BINARY_URL}\" -O \"${BINARY_NAME}\"
      && break\n\t\tstatus=$?\n\telse\n\t\techo \"error: no download tool found, please
      install curl, wget or busybox wget\"\n\t\texit 127\n\tfi\n\techo \"error: failed
      to download coder agent\"\n\techo \" command returned: ${status}\"\n\techo
      \"Trying again in 30 seconds...\"\n\tsleep 30\ndone\n\nif ! chmod +x $BINARY_NAME;
      then\n\techo \"Failed to make $BINARY_NAME executable\"\n\texit 1\nfi\n\nexport
      CODER_AGENT_AUTH=\"token\"\nexport CODER_AGENT_URL=\"https://coder-dev.helsing-dev.ai/\"\nexec
      ./$BINARY_NAME agent\n"
    env:
    - name: CODER_AGENT_TOKEN
      value: REDACTED
    image: codercom/enterprise-base:ubuntu
    imagePullPolicy: IfNotPresent
    name: dev
    resources: {}
    securityContext:
      allowPrivilegeEscalation: true
      privileged: false
      readOnlyRootFilesystem: false
      runAsNonRoot: false
      runAsUser: 1000
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /home/coder
      mountPropagation: None
      name: home
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-jkncs
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  imagePullSecrets:
  - name: regcred
  nodeName: aks-default-40604188-vmss000000
  nodeSelector:
    sysbox-runtime: running
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  runtimeClassName: sysbox-runc
  schedulerName: default-scheduler
  securityContext:
    fsGroup: 1000
    runAsNonRoot: false
    runAsUser: 1000
  serviceAccount: default
  serviceAccountName: default
  shareProcessNamespace: false
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: home
    persistentVolumeClaim:
      claimName: coder-niklasrosenstein-sysbox-test-home
  - name: kube-api-access-jkncs
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-01-02T12:22:16Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-01-02T12:22:16Z"
    message: 'containers with unready status: [dev]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-01-02T12:22:16Z"
    message: 'containers with unready status: [dev]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-01-02T12:22:16Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - image: codercom/enterprise-base:ubuntu
    imageID: ""
    lastState: {}
    name: dev
    ready: false
    restartCount: 0
    started: false
    state:
      waiting:
        message: |
          container create failed: time="2023-01-02T12:28:18Z" level=error msg="container_linux.go:425: starting container process caused: process_linux.go:607: container init caused: rootfs_linux.go:66: setting up rootfs mounts caused: rootfs_linux.go:1156: mounting \"sysfs\" to rootfs \"/var/lib/sysbox/shiftfs/f24948fc-9f27-43bd-8d8f-56947b850b7a\" at \"/sys\" caused: mount through procfd: operation not permitted"
        reason: CreateContainerError
  hostIP: 10.79.129.249
  phase: Pending
  podIP: 10.79.130.33
  podIPs:
  - ip: 10.79.130.33
  qosClass: BestEffort
  startTime: "2023-01-02T12:22:16Z"
About this issue
- State: closed
- Created a year ago
- Comments: 19 (9 by maintainers)
Thanks @jamonation … yes it’s easy to miss. That annotation will become unnecessary once K8s and containerd formalize support for pods with user-namespaces (soon I believe).
Hi @NiklasRosenstein, thanks for trying Sysbox.
That error typically means the pod spec is missing the `io.kubernetes.cri-o.userns-mode: "auto:size=65536"` annotation.
Could you double check and let me know?
Thanks!
I found this while searching for a similar error when creating a pod on GKE, and indeed, somehow I missed adding the annotation to my pod manifest, @ctalledo. Adding that `io.kubernetes.cri-o.userns-mode: "auto:size=65536"` annotation got it running. Did you get things working, @NiklasRosenstein?
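
For anyone landing here with the same error, here is a minimal sketch of where that annotation would sit in the Pod manifest from the original report. It assumes CRI-O as the container runtime (as in the node info above) and reuses the name, namespace, image, and runtime class from the report. The placement under `metadata.annotations` is the standard location for Pod annotations. Everything else in the spec is omitted for brevity, so treat this as an illustration rather than a complete manifest.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: coder-niklasrosenstein-sysbox-test
  namespace: coder
  annotations:
    # Annotation named in the maintainer's reply; the reported Pod lacked it.
    io.kubernetes.cri-o.userns-mode: "auto:size=65536"
spec:
  # Run the Pod with Sysbox instead of the default runtime.
  runtimeClassName: sysbox-runc
  containers:
  - name: dev
    image: codercom/enterprise-base:ubuntu
    # command, env, securityContext, and volumeMounts omitted; keep them as in the original spec.
```

Annotations on an existing Pod are generally not picked up by the runtime after creation, so the Pod would likely need to be deleted and recreated with the updated manifest.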