moby: cancelling a pull causes partially or fully downloaded layers to be discarded

I’m working from an unreliable network connection, and I sometimes need to pull large images. When I’m pulling a large image and the Wi-Fi disconnects and then reconnects, the download hangs. I then have to Ctrl+C to cancel the pull and try again. When I do this, all layers pulled during the previous attempt are discarded and have to be re-pulled from scratch. This means that if I never have a stable connection for long enough to pull the entire image without interruption, I can’t pull the image at all.

Output of docker version:

Client:
 Version:      1.12.1-rc2
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   236317f-unsupported
 Built:        Thu Aug 18 17:11:17 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.1-rc2
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   236317f-unsupported
 Built:        Thu Aug 18 17:11:17 2016
 OS/Arch:      linux/amd64

This is a slightly hacked-up build, but the changes are unrelated to this issue; it’s essentially 1.12.1.

Output of docker info:

Containers: 10
 Running: 10
 Paused: 0
 Stopped: 0
Images: 110
Server Version: 1.12.1-rc2
Storage Driver: overlay2
 Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge null host overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 4.7.2-1-ARCH
Operating System: Arch Linux
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.708 GiB
Name: compooter
ID: GKSI:YHZC:YPNI:PAGO:EYPD:AYME:K6BY:BAIH:EBKQ:SOAK:WUFP:6MJM
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 62
 Goroutines: 113
 System Time: 2016-08-30T08:12:15.46410461-07:00
 EventsListeners: 1
Username: viktorstanchev
Registry: https://index.docker.io/v1/
Cluster Store: etcd://172.17.0.1:12379
Cluster Advertise: 172.17.0.1:12376
Insecure Registries:
 192.168.0.106
 dtr2-2061409770.us-west-2.elb.amazonaws.com
 new-1428125395.us-west-2.elb.amazonaws.com
 172.17.0.1
 127.0.0.0/8

Additional environment details (AWS, VirtualBox, physical, etc.): physical laptop

Steps to reproduce the issue:

  1. pull a large image
  2. disconnect from the network and re-connect
  3. all layer downloads hang; ctrl+c to cancel
  4. pull again

Describe the results you received: the pull starts over from scratch.

Describe the results you expected: the pull continues where it left off.

Additional information you deem important (e.g. issue happens only occasionally): The current behaviour makes some sense, because keeping partial downloads would require a mechanism for cleaning up the unused ones. But I think keeping partial downloads around for about a day, or some other arbitrary period, is reasonable.

About this issue

  • State: open
  • Created 8 years ago
  • Reactions: 17
  • Comments: 35 (19 by maintainers)

Most upvoted comments

Here’s what I was thinking:

  • By default the current behaviour makes sense.
  • Users on slow or unreliable connections should be able to tell docker that they want their downloads to be kept around after being cancelled and I’m open to suggestions for how to do that… Some ideas:
    • daemon flag
    • extra argument to docker pull
  • There should be some command to clean up old downloads manually
  • Maybe there should be a daemon setting to periodically clean up old downloads and a configurable interval

I can understand deleting partial downloads but complete layers should be kept.

@vikstrous I’d vote to add the download cache clean to docker system prune

Well, here I am, in a place with poor connectivity again, Feb 28, 2019. Almost 3 years since opening this issue on Aug 30, 2016, and even after trying for a year to fix it myself. Still not able to use Docker in third world countries normally, and running around to different restaurants and cafes to try to get some of that sweet American juice and convince Docker to pull some of those holy bits into Africa.

Any update on this issue? @vikstrous did you ever finish your fix?

I’m working on this… I’m going with the approach of always keeping partially downloaded files. We’ll need a command to clean up the cache though… unless we want to just leave that up to the users to do. If a download succeeds, the cache will be cleaned up anyway, so most of the time this is not going to be a problem.

It would be OK to, but we don’t currently use containerd’s content store in moby… integrating it will just take some time.

I would vote for Docker to ask by default when in an interactive session: “Do you want to keep the downloaded layers on disk to resume later?”

Docker is a remote daemon, where several users can be downloading the same layer concurrently. If one user says no and the other user says yes, what should the daemon do? How would one implement this?

To tell you the truth, we should just save the partials and allow them to be cleared out periodically.

> I would vote for Docker to ask by default when in an interactive session: “Do you want to keep the downloaded layers on disk to resume later?”

containerd 1.0 is used in 17.12, but not the content store. We will be introducing those features from containerd to moby over the course of 2018.

In that case, as long as at least one user is willing to keep the partial content, it should be kept. It’s okay to always keep them too, in my opinion. apt-get has a clean option to get rid of downloaded packages; maybe docker could get a clean option as well, in addition to the periodic cleaning, for cases where immediate clean-up is required…