rancher: Log "error running the jail command: exit status 2" seen when using CRI-O with default permissions

What kind of request is this (question/bug/enhancement/feature request): bug

Steps to reproduce (least amount of steps as possible):

Fresh k8s installation. Follow the steps in the setup documentation:

  • Install cert-manager via helm chart
  • Install rancher via helm chart
  • Use ‘rancher’ cert creation with cert-manager

Result:

tls-rancher secret not created.

Events in the cattle-system namespace:

99s Warning ErrGetKeyPair issuer/rancher Error getting keypair for CA issuer: secret "tls-rancher" not found
99s Warning ErrInitIssuer issuer/rancher Error initializing issuer: secret "tls-rancher" not found

Logs in rancher pods:

2020/05/20 08:38:26 [INFO] Rancher version v2.4.3 (684884c00) is starting
2020/05/20 08:38:26 [INFO] Rancher arguments {ACMEDomains:[] AddLocal:auto Embedded:false HTTPListenPort:80 HTTPSListenPort:443 K8sMode:auto Debug:false Trace:false NoCACerts:false AuditLogPath:/var/log/auditlog/rancher-api-audit.log AuditLogMaxage:10 AuditLogMaxsize:100 AuditLogMaxbackup:10 AuditLevel:0 Features:}
2020/05/20 08:38:26 [INFO] Listening on /tmp/log.sock
2020/05/20 08:38:26 [INFO] Starting API controllers
2020/05/20 08:38:26 [INFO] Running in clustered mode with ID 10.85.0.52, monitoring endpoint cattle-system/rancher
2020/05/20 08:38:27 [INFO] Starting API controllers
2020/05/20 08:38:27 [INFO] Starting rbac.authorization.k8s.io/v1, Kind=Role controller
2020/05/20 08:38:27 [INFO] Starting rbac.authorization.k8s.io/v1, Kind=ClusterRole controller
2020/05/20 08:38:28 [FATAL] error running the jail command: exit status 2

Other details that may be helpful:

Environment information

  • Rancher version (rancher/rancher or rancher/server image tag, or shown bottom left in the UI): stable
  • Installation option (single install/HA): single install

Cluster information

  • Cluster type (Hosted/Infrastructure Provider/Custom/Imported):
  • Machine type (cloud/VM/metal) and specifications (CPU/memory):
  • Kubernetes version (use kubectl version):

  • Docker version (use docker version):

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 16

Most upvoted comments

I faced the same situation with the CRI-O engine, but the fix above from @kubealex did not work properly because of a misconfiguration in the YAML.

The solution needs to look like this:

spec:
      containers:
        - name: rancher
          securityContext:
            capabilities:
              add:
                - MKNOD

There was a securityContext missing just above capabilities.
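
For context, the containers list in the snippet above belongs to the pod spec, which in a Deployment manifest sits under spec.template.spec. A minimal sketch of where the fragment would land, assuming the Helm chart created a Deployment named rancher (the name is an assumption; the cattle-system namespace comes from the events above):

```yaml
# Fragment only, not a complete manifest: shows where the fix sits
# inside the Deployment. The Deployment name "rancher" is assumed.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rancher
  namespace: cattle-system
spec:
  template:
    spec:
      containers:
        - name: rancher
          securityContext:      # the key that was missing
            capabilities:
              add:
                - MKNOD
```

This would be merged into the existing Deployment (for example via kubectl edit) rather than applied on its own, since required fields such as the selector and image are left out here.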

The ultimate solution for me was to add the MKNOD capability to the default capabilities in /etc/crio/crio.conf:

default_capabilities = [
        "MKNOD",
        "CHOWN", 
        "DAC_OVERRIDE", 
        "FSETID", 
        "FOWNER", 
        "NET_RAW", 
        "SETGID", 
        "SETUID", 
        "SETPCAP", 
        "NET_BIND_SERVICE", 
        "SYS_CHROOT", 
        "KILL", 
]