sysbox: Fails when docker root dir is mapped to a volume (e.g. `emptyDir`)

In my Jenkins setup, I use the Kubernetes Plugin to spawn a new Pod for each build. While integrating it with Sysbox, I ran into this issue.

Some notes:

  1. I cannot reproduce the issue with either docker:dind or the registry.nestybox.com/nestybox/ubuntu-bionic-systemd-docker image
  2. Without the emptyDir volume, the error does not happen
  3. Pulling other images, such as ubuntu, does not fail
  4. Without Sysbox, using a privileged container instead, it works fine
  5. In my image, the docker root dir is /home/jenkins/agent/docker rather than /var/lib/docker

I know there must be something wrong with my image and I will probably refactor my image to make use of systemd as the sysbox sample image does, but I would like to report it here in case someone else faces the same issue.

To reproduce:

$ kubectl run dind --rm -i --image ghcr.io/felipecrs/jenkins-agent-dind:latest --pod-running-timeout=3m --overrides='
{
  "metadata": {
    "annotations": {
      "io.kubernetes.cri-o.userns-mode": "auto:size=65536"
    }
  },
  "spec": {
    "containers": [
      {
        "image": "ghcr.io/felipecrs/jenkins-agent-dind:latest",
        "name": "dind",
        "imagePullPolicy": "Always",
        "tty": true,
        "volumeMounts": [
          {
            "mountPath": "/home/jenkins/agent",
            "name": "workspace-volume",
            "readOnly": false
          }
        ],
        "command": ["/entrypoint.sh", "bash", "-xec", "df /home/jenkins/agent/docker; docker version; docker info; docker pull gradle"]
      }
    ],
    "runtimeClassName": "sysbox-runc",
    "volumes": [
      {
        "name": "workspace-volume",
        "emptyDir": {}
      }
    ]
  }
}
'
If you don't see a command prompt, try pressing enter.
INFO[2021-10-06T16:41:37.991336236Z] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address 
INFO[2021-10-06T16:41:38.382940765Z] Loading containers: done.                    
WARN[2021-10-06T16:41:38.438402518Z] Not using native diff for overlay2, this may cause degraded performance for building images: running in a user namespace  storage-driver=overlay2
INFO[2021-10-06T16:41:38.438666453Z] Docker daemon                                 commit=79ea9d3 graphdriver(s)=overlay2 version=20.10.9
INFO[2021-10-06T16:41:38.438805321Z] Daemon has completed initialization          
INFO[2021-10-06T16:41:38.615777928Z] API listen on /var/run/docker.sock           
Client: Docker Engine - Community
 Version:           20.10.9
 API version:       1.41
 Go version:        go1.16.8
 Git commit:        c2ea9bc
 Built:             Mon Oct  4 16:08:29 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.9
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.8
  Git commit:       79ea9d3
  Built:            Mon Oct  4 16:06:37 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.11
  GitCommit:        5b46e404f6b9f661a205e28d59c982d3634148f8
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
[services.d] done.
+ df /home/jenkins/agent/docker
Filesystem                                                   1K-blocks     Used Available Use% Mounted on
/var/lib/sysbox/shiftfs/b3a26b5d-2d6c-40e7-a356-5e29903e7125 130550852 14374468 109501748  12% /home/jenkins/agent/docker
+ docker version
Client: Docker Engine - Community
 Version:           20.10.9
 API version:       1.41
 Go version:        go1.16.8
 Git commit:        c2ea9bc
 Built:             Mon Oct  4 16:08:29 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.9
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.8
  Git commit:       79ea9d3
  Built:            Mon Oct  4 16:06:37 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.11
  GitCommit:        5b46e404f6b9f661a205e28d59c982d3634148f8
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
+ docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.6.3)
  compose: Docker Compose (Docker Inc., v2.0.1)
  scan: Docker Scan (Docker Inc., v0.8.0)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 20.10.9
 Storage Driver: overlay2
  Backing Filesystem: <unknown>
  Supports d_type: true
  Native Overlay Diff: false
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runtime.v1.linux runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 5b46e404f6b9f661a205e28d59c982d3634148f8
 runc version: v1.0.2-0-g52b36a2
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.4.0-70-generic
 Operating System: Ubuntu 20.04.3 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 31.34GiB
 Name: dind
 ID: 6IKK:XRTE:CM2G:NPQK:GAXA:ULRB:GKCA:6V5N:5IOM:WWLP:TJ7I:4TEP
 Docker Root Dir: /home/jenkins/agent/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support
+ docker pull gradle
Using default tag: latest
latest: Pulling from gradle
f3ef4ff62e0d: Pull complete 
706b9b9c1c44: Extracting [==================================================>]  16.03MB/16.03MB
0fffb0c672b9: Download complete 
5a54c3905797: Download complete 
830009aaff35: Download complete 
a28d173c1d5d: Download complete 
INFO[2021-10-06T16:41:41.730316722Z] Attempting next endpoint for pull after error: failed to register layer: ApplyLayer exit status 1 stdout:  stderr: unlinkat /etc/ld.so.cache: operation not permitted 
failed to register layer: ApplyLayer exit status 1 stdout:  stderr: unlinkat /etc/ld.so.cache: operation not permitted
[cmd] setpriv exited 1
INFO[2021-10-06T16:41:41.736788167Z] Processing signal 'terminated'               
[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.
[s6-finish] waiting for services.
INFO[2021-10-06T16:41:41.787222533Z] Layer sha256:da55b45d310bb8096103c29ff01038a6d6af74e14e3b67d1cd488c3ab03f5f0d cleaned up 
INFO[2021-10-06T16:41:41.790357446Z] stopping event stream following graceful shutdown  error="<nil>" module=libcontainerd namespace=moby
INFO[2021-10-06T16:41:41.791369126Z] Daemon shutdown complete                     
INFO[2021-10-06T16:41:41.791390911Z] stopping healthcheck following graceful shutdown  module=libcontainerd
INFO[2021-10-06T16:41:41.791423211Z] stopping event stream following graceful shutdown  error="context canceled" module=libcontainerd namespace=plugins.moby
WARN[2021-10-06T16:41:42.792257234Z] grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial unix:///var/run/docker/containerd/containerd.sock: timeout". Reconnecting...  module=grpc
s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening
[s6-finish] sending all processes the TERM signal.
[s6-finish] sending all processes the KILL signal and exiting.
Session ended, resume using 'kubectl attach dind -c dind -i -t' command when the pod is running
pod "dind" deleted

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 37 (37 by maintainers)

Most upvoted comments

Truly a “graphical” description! Let’s hope we can make it all green in the new issue 😃

Yes, I see that now 😃

Will go ahead and close this issue. Re-open it if you see anything else related to this fix.

That’s really cool. It’s a nice approach and it would work for me. I would just suggest documenting that setting the data root on the command line (dockerd --data-root /other/dir) rather than in the file won’t be supported.

Yep, makes sense … will do.

Hi @felipecrs, thanks for the info.

We’ve found a simpler solution: when the container starts, Sysbox will look at the container’s /etc/docker/daemon.json file to find out where the inner Docker’s data root resides. It will then use this info to set up the mounts for the inner Docker correctly.

@rodnymolina will be working on this soon.
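
To illustrate the daemon.json lookup described above, here is a minimal Python sketch. It is not Sysbox's implementation; the function name inner_docker_data_root and the fallback behaviour are assumptions made for illustration. It relies only on the documented "data-root" key of daemon.json and on Docker's default data root of /var/lib/docker.

```python
import json

# Docker's default data root on Linux when nothing else is configured.
DEFAULT_DATA_ROOT = "/var/lib/docker"


def inner_docker_data_root(daemon_json_text: str) -> str:
    """Return the data root named in a daemon.json document.

    Hypothetical helper: falls back to the default when the document is
    empty, malformed, or has no usable "data-root" entry.
    """
    try:
        config = json.loads(daemon_json_text)
    except json.JSONDecodeError:
        return DEFAULT_DATA_ROOT
    if not isinstance(config, dict):
        return DEFAULT_DATA_ROOT
    data_root = config.get("data-root")
    if isinstance(data_root, str) and data_root:
        return data_root
    return DEFAULT_DATA_ROOT


if __name__ == "__main__":
    # Example using the data root reported in this issue.
    print(inner_docker_data_root('{"data-root": "/home/jenkins/agent/docker"}'))
    # No setting in the file: the default applies.
    print(inner_docker_data_root("{}"))
```

A lookup like this only sees what is written in the file. A data root passed solely on the dockerd command line would not be visible to it, which is the limitation raised earlier in the thread.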

I installed Sysbox today through the daemonset, so it’s the newest version:

$ sysbox-runc --version
sysbox-runc
	edition: 	Community Edition (CE)
	version: 	0.4.1
	commit: 	d540126188a1e8595c8f769aeb91833002c37b3a
	built at: 	Fri Oct  1 19:33:49 UTC 2021
	built by: 	Rodny Molina
	oci-specs: 	1.0.2-dev