moby: Restoring containers from a custom checkpoint-dir is broken
Copied from https://github.com/containerd/containerd/issues/2406
Description
This was first reported in moby/moby#35694 but no issue was ever created for it.
The containerd 1.0 integration broke the --checkpoint-dir option for restoring a container from a particular checkpoint location. As a result, this option is broken in Docker releases 17.12 and later.
See https://github.com/moby/moby/commit/ddae20c032#diff-3cb140026df40998ea29c5bcb6bb292eR118
Steps to reproduce the issue:
- docker run --name crtest -d busybox /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'
- docker checkpoint create --checkpoint-dir /var/lib/docker/checkpoints crtest checkpoint1
- docker stop crtest (see moby/moby#35690)
- docker start --checkpoint-dir /var/lib/docker/checkpoints --checkpoint checkpoint1 crtest
Describe the results you received:
“Error response from daemon: custom checkpointdir is not supported”
Describe the results you expected:
Docker start with custom checkpoint-dir should succeed.
Note that C/R is currently broken in docker even without using a custom checkpoint-dir; see moby/moby#35691
Output of docker version:
Client:
 Version:      18.05.0-ce
 API version:  1.37
 Go version:   go1.9.5
 Git commit:   f150324
 Built:        Wed May 9 22:17:48 2018
 OS/Arch:      linux/amd64
 Experimental: false
 Orchestrator: swarm

Server:
 Engine:
  Version:      18.05.0-ce
  API version:  1.37 (minimum version 1.12)
  Go version:   go1.9.5
  Git commit:   f150324
  Built:        Wed May 9 22:15:57 2018
  OS/Arch:      linux/amd64
  Experimental: true
Output of docker info:
Containers: 1
 Running: 0
 Paused: 0
 Stopped: 1
Images: 4
Server Version: 18.05.0-ce
Storage Driver: overlay
 Backing Filesystem: extfs
 Supports d_type: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: nvidia runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 773c489
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: 949e6fa
Security Options:
 apparmor
Kernel Version: 4.14.43-041443-generic
Operating System: Ubuntu 14.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 14.92GiB
Name: ip-10-97-0-215
ID: BEEB:4M2D:QUZT:CXJW:WZ4H:WPYV:3BNT:ZGR6:ZGQU:S6ZM:EML5:SNC6
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: true
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Additional environment details (AWS, VirtualBox, physical, etc.):
AWS
About this issue
- State: open
- Created 6 years ago
- Reactions: 5
- Comments: 30 (7 by maintainers)
Hi @harishanand95
The workaround would be to manually copy the checkpoint directory into /var/lib/docker/containers/<CONTAINER ID>/checkpoints/
For example:
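A minimal sketch of that copy step, assuming the default Docker root /var/lib/docker (as shown in the docker info output above) and the checkpoint directory and name from the reproduction steps. install_checkpoint and DOCKER_ROOT are hypothetical names used only for this illustration:

```shell
#!/bin/sh
# Sketch of the workaround: copy a checkpoint from a custom directory into the
# container's own checkpoints directory, where the daemon looks by default.
# install_checkpoint and DOCKER_ROOT are hypothetical names for this example.

install_checkpoint() {
    src_dir=$1        # custom checkpoint dir, e.g. /var/lib/docker/checkpoints
    name=$2           # checkpoint name, e.g. checkpoint1
    container_id=$3   # full container ID, not the container name
    docker_root=${DOCKER_ROOT:-/var/lib/docker}

    dest="$docker_root/containers/$container_id/checkpoints"
    mkdir -p "$dest" || return 1
    # -a keeps ownership and modes of the checkpoint image files intact.
    cp -a "$src_dir/$name" "$dest/"
}

# Usage (as root, with the names from the reproduction steps):
#   id=$(docker inspect --format '{{.Id}}' crtest)
#   install_checkpoint /var/lib/docker/checkpoints checkpoint1 "$id"
#   docker start --checkpoint checkpoint1 crtest
```

After the copy, docker start is called without --checkpoint-dir, so the daemon never reaches the "custom checkpointdir is not supported" code path.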
I have a patch to fix this issue. I’m going to create a pull request once https://github.com/containerd/containerd/pull/2425 is merged.
@cnnrznn I was able to get this working in a newly created container by copying the checkpoint into the newly created container before running it. The following code snippet worked for me:
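The approach described here (create a new container, copy the checkpoint into it, then start it from that checkpoint) could look roughly like the sketch below. It is not the snippet originally posted. restore_into_new_container and DOCKER_ROOT are hypothetical names, the default Docker root /var/lib/docker is assumed, and the new container is assumed to be created from the same image and command as the checkpointed one:

```shell
#!/bin/sh
# Sketch: restore a checkpoint into a newly created container by copying the
# checkpoint into that container's checkpoints directory before starting it.
# restore_into_new_container and DOCKER_ROOT are hypothetical names.

restore_into_new_container() {
    image=$1        # image of the checkpointed container, e.g. busybox
    new_name=$2     # name for the new container
    src_dir=$3      # directory that holds the checkpoint
    checkpoint=$4   # checkpoint name, e.g. checkpoint1
    shift 4         # any remaining arguments are the container command
    docker_root=${DOCKER_ROOT:-/var/lib/docker}

    # docker create prints the full ID of the new, not yet started container.
    id=$(docker create --name "$new_name" "$image" "$@") || return 1

    dest="$docker_root/containers/$id/checkpoints"
    mkdir -p "$dest" || return 1
    cp -a "$src_dir/$checkpoint" "$dest/" || return 1

    # Start from the copied checkpoint; no --checkpoint-dir is needed now.
    docker start --checkpoint "$checkpoint" "$new_name"
}

# Usage (as root):
#   restore_into_new_container busybox crtest2 /var/lib/docker/checkpoints checkpoint1
```

The copy has to happen between docker create and docker start, because the per-container directory only exists once the container has been created.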
Anyone still using the workaround? Checkpoint does not seem to work if applied to a newly restored container. I tried the workaround posted by @rst0git and the scripts by @MihaelBercic of manually copying the checkpoint.
Steps to reproduce:
The container fails to pause after it is restored in a new container.
docker info:
Tested with criu 3.17 and 3.18
containerd
Hi @MihaelBercic, you should be able to use docker create to first create the container and then move the checkpoint into it. However, I would recommend using Podman for container migration as it is actively developed, supported and maintained.
@adrianreber is the author of the checkpoint/restore functionality in Podman and also a CRIU maintainer. Adrian has a few very good talks and articles on this topic:
I hope this helps.
@avagin I wonder if that PR is ready to be made now that containerd/containerd#2425 is ready to roll? I can try to tackle it if not - thank you!