moby: --cache-from may corrupt config.Cmd
Output of docker version:
$ docker version -f '{{ .Server.Version }}'
18.05.0-ce
Also tested with 18.03.1-ce, same issue.
Consider the following script: https://gist.github.com/flx42/97b9f20e2ea04013843955f86e780254 These are the last two lines of output:
+ docker inspect test -f '{{ .Config.Cmd }}'
[/bin/sh -c #(nop) ENV VAR=value]
This is not a valid CMD, and this doesn’t match docker history:
$ docker history test
IMAGE CREATED CREATED BY SIZE COMMENT
1e24e7c47bd6 2 minutes ago /bin/sh -c #(nop) ENV VAR=value 0B
3fd9065eaf02 5 months ago /bin/sh -c #(nop) CMD ["/bin/sh"] 0B
<missing> 5 months ago /bin/sh -c #(nop) ADD file:093f0723fa46f6cdb… 4.15MB
It seems --cache-from, --label and serialization (push/pull or save/load) do not play well together.
If you remove the --label from the first build, it works.
If you remove the save/load (and keep the image locally), it also works.
About this issue
- Original URL
- State: open
- Created 6 years ago
- Reactions: 4
- Comments: 18 (15 by maintainers)
Today I ran into this very issue. I had to override the command in docker-compose to get the service running.
I really, really, REALLY hope this would get fixed. While most people run CI systems and have replicas & blue/green deployments etc, this could still potentially bring someone down. Or like us, it did bring down the CI/staging system down which itself was a big bummer.
And if you barely know how to use docker, you might never understand why suddenly your container isn’t running as it is supposed to.
So a very big +1 for a fix.
@mareksuscak 19.03(or buildx) supports “inline” cache mode (in addition to cache manifests) if you want to use
--cache-fromwith the production images. https://github.com/docker/cli/blob/8b7c86d5b082e72cfdf5b0d0909311eacb2ff88a/docs/reference/commandline/build.md#specifying-external-cache-sourcesBuildKit is not affected by this as it doesn’t do any config pattern matching at all. In BuildKit you would point
--cache-fromto a special cache manifest that only has the cache checksum graph.Now, regarding the code path. The function
restoreCachedImageseems to be called only when using--cache-fromwith an imported cache. This is the config of the “restored” layer: https://github.com/moby/moby/blob/4d6219264655bbb4510642e40d7914fd56962596/image/cache/cache.go#L117-L130However,
cfgis the container config. Therefore it includes the(nop)prefix, I guess for cache comparison reasons, it’s added here: https://github.com/moby/moby/blob/4d6219264655bbb4510642e40d7914fd56962596/builder/dockerfile/internals.go#L307-L313I think we are definitely losing the reference to the right
Cmdhere: https://github.com/moby/moby/blob/4d6219264655bbb4510642e40d7914fd56962596/builder/dockerfile/internals.go#L410 Because it’s passingrunConfiginstead ofdispatchState.runConfig(rightfully, for cache comparison).Since the bug appears to be involving a combination of two different packages for one specific case (“restored” cache). I didn’t see an obvious fix.
Therefore, I think the options are:
ConfigAPI type to add the savedCmdonly for this particular case. Feels extremely kludgy.Cmdto a label before calling the cache. Still feels like a hack.Thoughts @thaJeztah @tonistiigi?
I had a little bit of time to look into the root cause in the code. First of all I’ve come up with a case that’s a more serious bug:
The output:
So, if you do
docker run --rm -ti -v /tmp:/tmp test, it will delete your files in the host/tmp, not what you expected. Granted, you would have to be unlucky to end up in this situation, but that would be very puzzling if that happens.