cri-o: su throws system error in crio started containers

Description: I have the same image being started in the same cluster by k8s. One node runs docker and the other runs cri-o. The container started by docker has no issues running su, but the one started by cri-o just says su: system error. This is consistent across the following:

  • runc exec <containerID> su
  • docker exec/crictl exec <containerID> su
  • kubectl exec -it <pod> sh and then su

Steps to reproduce the issue:

  1. Deploy this YAML using kubectl on a docker node and on a cri-o node:
apiVersion: v1
kind: Pod
metadata:
  name: tomcat-pod
  labels:
    name: tomcat
spec:
  containers:
    - image: tomcat:jre8
      name: tomcat-pod
      ports:
        - containerPort: 8080
  2. Try to execute su in the respective containers

Describe the results you received: The container deployed on docker ran su without any error. The one deployed on cri-o failed with su: system error

Describe the results you expected: The container deployed on cri-o should behave the same way, since both runtimes use runc in the end

Additional information you deem important (e.g. issue happens only occasionally):

Output of crio --version:

crio version 1.16.6

Output of kubectl version:

Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:18:23Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.4", GitCommit:"224be7bdce5a9dd0c2fd0d46b83865648e2fe0ba", GitTreeState:"clean", BuildDate:"2019-12-11T12:37:43Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}

Output of docker version:

Client: Docker Engine - Community
 Version:           19.03.4
 API version:       1.40
 Go version:        go1.12.10
 Git commit:        9013bf583a
 Built:             Fri Oct 18 15:52:22 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.4
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.10
  Git commit:       9013bf583a
  Built:            Fri Oct 18 15:50:54 2019
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.10
  GitCommit:        b34a5c8af56e510852c35414db4c1f4fa6172339
 runc:
  Version:          1.0.0-rc8+dev
  GitCommit:        3e425f80a8c931f88e6d94a8c831b9d5aa481657
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

Additional environment details (AWS, VirtualBox, physical, etc.): All nodes in my cluster are identical (they are virtual machines)

NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 21 (13 by maintainers)

Most upvoted comments

Found it! It had to do with capabilities. CRI-O ships with fewer capabilities by default than podman does. One of the capabilities it leaves out is CAP_AUDIT_WRITE, which su needs here. I bet docker also gives containers CAP_AUDIT_WRITE by default. More capabilities==less secure 😄

if you change the pod spec to

apiVersion: v1
kind: Pod
metadata:
  name: tomcat-pod
  labels:
    name: tomcat
spec:
  containers:
    - image: tomcat:jre8
      name: tomcat-pod
      ports:
        - containerPort: 8080
      securityContext:
        capabilities:
          add:
            - AUDIT_WRITE

it works as expected
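
One way to check from inside a container whether the capability is actually present is to read the effective capability mask from /proc and test the bit for CAP_AUDIT_WRITE (capability number 29 in linux/capability.h). The following is only a rough sketch: the function name has_audit_write is made up for illustration, and it assumes the image provides awk and a mounted /proc.

```shell
# Hypothetical helper: succeeds if the given hex capability mask
# (as printed in /proc/<pid>/status) has bit 29, CAP_AUDIT_WRITE, set.
has_audit_write() {
    # An empty argument is treated as an all-zero mask.
    mask=$(( 0x${1:-0} ))
    [ $(( (mask >> 29) & 1 )) -eq 1 ]
}

# Effective capability mask of the current shell process.
cap_eff=$(awk '/^CapEff:/ {print $2}' /proc/self/status 2>/dev/null)

if has_audit_write "$cap_eff"; then
    echo "CAP_AUDIT_WRITE present"
else
    echo "CAP_AUDIT_WRITE missing"
fi
```

Running this in a shell inside the pod (for example via kubectl exec) before and after adding AUDIT_WRITE to the securityContext should show whether the change took effect.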

Note: once https://github.com/seccomp/containers-golang/pull/27 gets merged, we may no longer need to give this container audit_write

try the container used here (or debian:stretch, as that seems to be the base):

# podman run --cap-drop audit_write -ti debian:stretch su
su: System error