containerd: CRI: Windows images created with "shell form" CMD in Dockerfile are broken
tl;dr
When Docker creates a Windows image with “shell form” CMD/ENTRYPOINT in the Dockerfile, the args in the image config are formatted differently than CRI expects. This breaks running the image with CRI since the process args in the generated OCI runtime spec are incorrect. Docker works around this by using a non-standard image config field ArgsEscaped to indicate that different handling should be used for the args when the OCI spec is generated.
I’m not sure if this is something we can fix easily in Docker’s builder code. Even if that can be fixed, there are probably still many images out in the wild that are already affected. The easiest approach here is probably just getting ArgsEscaped added to the OCI spec, and updating containerd to use it.
Background
On Windows, the command to execute in a container needs to be given as a single CommandLine string. This is different from Linux, where an array of string args is given (as used by execve). In a Dockerfile, there are two directives to control what command is executed in the container: ENTRYPOINT and CMD. They are both affected by this issue, but we just focus on CMD below for clarity.
CMD can be specified in either “exec” form:
CMD ["cmd.exe", "/S", "/C", "foo.cmd"]
Or in “shell” form:
CMD foo.cmd
In “shell” form, a shell command is prepended to the command (e.g. cmd.exe /S /C).
In a container image, the default command to run is stored as a JSON array:
"Cmd": [
"cmd.exe",
"/S",
"/C",
"foo.cmd"
]
However, there is an issue with Docker-built images: when the “shell” form of the CMD directive is used, the args are all placed in a single array element:
"Cmd": [
"cmd.exe /S /C foo.cmd"
]
Now, when CRI generates an OCI runtime spec for the container, it needs to produce a single command line string. To do this from the args array, it enumerates the args and concatenates them, but first it escapes each item. This is important in case individual args contain spaces:
["cmd.exe", "/S", "/C", "C:\Program Files\MyApp\foo.exe"]
becomes
cmd.exe /S /C "C:\Program Files\MyApp\foo.exe"
However, this behavior is incorrect when working with a container image produced using the “shell” form of CMD:
["cmd.exe /S /C foo.cmd"]
becomes
"cmd.exe /S /C foo.cmd"
(note the quotes)
This has the undesired effect of attempting to locate and run a binary named cmd.exe /S /C foo.cmd, which fails with the error “The system cannot find the file specified.”
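The escape-then-join step described above can be sketched in Go as follows. This is a minimal illustration, not containerd's actual code: escapeArg and buildCommandLine are hypothetical names, and the quoting rules are a simplified version of the usual Windows conventions (quote an argument that contains whitespace, escape embedded quotes).

```go
package main

import (
	"fmt"
	"strings"
)

// escapeArg is a hypothetical, simplified stand-in for Windows argument
// escaping. It wraps an argument in quotes when it is empty or contains
// a space, tab, or double quote, and escapes embedded double quotes.
func escapeArg(arg string) string {
	if arg == "" {
		return `""`
	}
	if !strings.ContainsAny(arg, " \t\"") {
		return arg
	}
	var b strings.Builder
	b.WriteByte('"')
	backslashes := 0
	for i := 0; i < len(arg); i++ {
		switch c := arg[i]; c {
		case '\\':
			// Defer writing backslashes until we know what follows them.
			backslashes++
		case '"':
			// Backslashes before a quote are doubled, then the quote is escaped.
			b.WriteString(strings.Repeat(`\`, backslashes*2+1))
			b.WriteByte('"')
			backslashes = 0
		default:
			b.WriteString(strings.Repeat(`\`, backslashes))
			b.WriteByte(c)
			backslashes = 0
		}
	}
	// Trailing backslashes are doubled so they do not escape the closing quote.
	b.WriteString(strings.Repeat(`\`, backslashes*2))
	b.WriteByte('"')
	return b.String()
}

// buildCommandLine escapes each arg and joins the results with spaces.
func buildCommandLine(args []string) string {
	escaped := make([]string, len(args))
	for i, a := range args {
		escaped[i] = escapeArg(a)
	}
	return strings.Join(escaped, " ")
}

func main() {
	// Exec form: only the arg containing a space is quoted.
	fmt.Println(buildCommandLine([]string{"cmd.exe", "/S", "/C", `C:\Program Files\MyApp\foo.exe`}))
	// Shell form as built by Docker: the whole command becomes one quoted "binary name".
	fmt.Println(buildCommandLine([]string{"cmd.exe /S /C foo.cmd"}))
}
```

With the single-element array, the whole string is treated as one argument and quoted, which is exactly the failure mode described above.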
Docker behavior
When Docker builds a Windows image with a “shell form” CMD or ENTRYPOINT, it puts all of the args into a single element in the Cmd field in the image config, and sets a non-standard field ArgsEscaped to true.
At run time, the value of ArgsEscaped is used to determine how to format the process args in the OCI runtime spec. If ArgsEscaped is false or not present, the standard behavior is used (spec.Process.Args receives an array of args). However, if ArgsEscaped is true, Docker instead populates spec.Process.CommandLine with a string containing the exact command line to run (CommandLine is a Windows-specific OCI runtime spec field).
The Docker change to introduce this behavior was made in this commit: https://github.com/moby/moby/commit/20833b06a0a41602001d595b3e1785248a352991 The main point of interest is here: https://github.com/moby/moby/commit/20833b06a0a41602001d595b3e1785248a352991#diff-6688f4342adf127b206582942bc147a0efab01c2e376b8a1a81e62c4bfee3ce1R242-R249
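As a rough sketch of the branching described above (not Docker's actual implementation, which is in the linked commit), the decision could look like this in Go. The imageConfig and process types and the processFromConfig function are hypothetical, trimmed-down stand-ins for the image config and the OCI runtime spec process, and joining the elements with a space is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// imageConfig is a hypothetical subset of the image config.
type imageConfig struct {
	Cmd         []string
	ArgsEscaped bool
}

// process is a hypothetical subset of the OCI runtime spec process.
type process struct {
	Args        []string // standard form: one element per argument
	CommandLine string   // Windows-specific: the exact command line to run
}

// processFromConfig picks the field to populate based on ArgsEscaped.
func processFromConfig(cfg imageConfig) process {
	if cfg.ArgsEscaped {
		// The image already holds an escaped command line, so pass it
		// through verbatim instead of escaping it a second time.
		return process{CommandLine: strings.Join(cfg.Cmd, " ")}
	}
	return process{Args: cfg.Cmd}
}

func main() {
	shellForm := imageConfig{Cmd: []string{"cmd /S /C foo.cmd"}, ArgsEscaped: true}
	execForm := imageConfig{Cmd: []string{"cmd.exe", "/S", "/C", "foo.cmd"}}

	fmt.Printf("%+v\n", processFromConfig(shellForm))
	fmt.Printf("%+v\n", processFromConfig(execForm))
}
```

The point of the split is that a runtime honoring ArgsEscaped never re-quotes the single-element Cmd, so the command line reaches the container unchanged.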
Repro
This issue is fairly simple to repro.
Build an image in Docker that uses “shell form” CMD
Dockerfile:
FROM mcr.microsoft.com/windows/nanoserver:1809
COPY foo.cmd /foo.cmd
CMD foo.cmd
foo.cmd:
ping -t 127.0.0.1
Run build:
> docker build -t bug .
[...]
Successfully tagged bug:latest
Export the image from Docker and import into containerd
> docker image save -o bug.tar bug
> ctr --namespace k8s.io i import bug.tar
Run a container from the image
> $p=crictl runp --runtime runhcs-wcow-process pod.json
> $c=crictl create --no-pull $p container.json pod.json
> crictl start $c
time="2021-02-22T23:34:08-08:00" level=fatal msg="Starting the container \"355a1e78037b8b38495e7bdf738728c9e23ea338b6a449d707706897eee4bcfd\" failed: rpc error: code = Unknown desc = failed to start containerd task \"355a1e78037b8b38495e7bdf738728c9e23ea338b6a449d707706897eee4bcfd\": hcsshim::System::CreateProcess 355a1e78037b8b38495e7bdf738728c9e23ea338b6a449d707706897eee4bcfd: The system cannot find the file specified.\n(extra info: {\"CommandLine\":\"\\\"cmd /S /C foo.cmd\\\"\",\"User\":\"ContainerUser\",\"WorkingDirectory\":\"C:\\\\\",\"CreateStdOutPipe\":true,\"CreateStdErrPipe\":true}): unknown"
If we look at the image config we can see ArgsEscaped is set:
> ctr --namespace k8s.io content get sha256:<imageid> | convertfrom-json | convertto-json
{
[...]
"config": {
[...]
"Cmd": [
"cmd /S /C foo.cmd"
],
"ArgsEscaped": true,
[...]
},
[...]
}
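To check many images for this condition, the same inspection can be scripted. The Go sketch below parses an image config document and reports whether ArgsEscaped is set. The imageDoc type and usesEscapedArgs function are hypothetical names, and the sample JSON only mirrors the fields shown in the output above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// imageDoc is a hypothetical subset of the image config document,
// covering only the fields relevant to this issue.
type imageDoc struct {
	Config struct {
		Cmd         []string `json:"Cmd"`
		Entrypoint  []string `json:"Entrypoint"`
		ArgsEscaped bool     `json:"ArgsEscaped"`
	} `json:"config"`
}

// usesEscapedArgs reports whether the image config sets ArgsEscaped.
func usesEscapedArgs(raw []byte) (bool, error) {
	var doc imageDoc
	if err := json.Unmarshal(raw, &doc); err != nil {
		return false, err
	}
	return doc.Config.ArgsEscaped, nil
}

func main() {
	// Sample input modeled on the image config shown above.
	raw := []byte(`{"config": {"Cmd": ["cmd /S /C foo.cmd"], "ArgsEscaped": true}}`)
	affected, err := usesEscapedArgs(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("ArgsEscaped set:", affected)
}
```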
Fix
We could potentially fix this for future images with a change to Docker to make it format args differently. However, I see two concerns with this approach:
- The change to Docker that introduced this behavior was already an attempt to fix other issues with Windows process arg escaping. Since clearly this is an area that has introduced bugs in the past, we risk more problems by changing it yet again.
- Even if we can fix this in Docker’s builder, there are presumably many images that have been built since this change was introduced (~2 years ago) that will still not run in containerd with CRI.
The easiest fix here is probably to get ArgsEscaped officially added to the OCI image config, and then have containerd duplicate Docker’s behavior in this regard.
Interested in discussion on how else we might address this.
About this issue
- Original URL
- State: open
- Created 3 years ago
- Reactions: 8
- Comments: 20 (10 by maintainers)
Any progress on this? This appears to be affecting Windows container images deployed to Azure Container Instances.
Fixed by https://github.com/containerd/containerd/pull/8198 and https://github.com/containerd/containerd/pull/9317 (the latter fixes some changes that missed making it to pkg/cri/sbserver; containerd/main has now switched to sbserver, and sbserver has been renamed to pkg/cri/server). Closing this GitHub issue.
This impacts the RUN instruction as well which also has both SHELL and EXEC forms.
https://docs.docker.com/engine/reference/builder/#run
I would expect this form to work. Is it possible you have another CMD or ENTRYPOINT elsewhere in your Dockerfile or base image which still uses shell form?
If you don’t want to rebuild your image, you can also work around this by overriding the image Entrypoint/Cmd with a command set via the k8s pod spec.
Hi Justin 😃
Yes, unfortunately the issue here is a bit more widespread than I first realized.
I’m hoping to get to work on this again soon. The idea was to make a change so containerd respects the existing ArgsEscaped image config field (even though it’s not in the OCI image spec), and then longer-term work on a more standardized approach with OCI.