moby: Parser in docker 1.13 breaks backwards compatibility for comments in middle of RUN statement

Description Previously (Docker 1.6+) one could document long RUN lines with comments interspersed. https://github.com/docker/docker/pull/24725 broke this.

FROM alpine
RUN echo \
# comment
hi

Steps to reproduce the issue:

  1. build the above Dockerfile on 1.13-rc2

Describe the results you received: Error response from daemon: Unknown instruction: HI

Describe the results you expected: Docker build the above Dockerfile on 1.6 through 1.12 (thanks @tianon for testing all of them) Successfully built xxx

Additional information you deem important (e.g. issue happens only occasionally): Quick list of some official-images broken by this change:

Output of docker version: (working version)

Client:
 Version:      1.12.3
 API version:  1.24
 Go version:   go1.7.3
 Git commit:   6b644ec
 Built:        
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.3
 API version:  1.24
 Go version:   go1.7.3
 Git commit:   6b644ec
 Built:        
 OS/Arch:      linux/amd64

Output of docker info: (working version)

Containers: 6
 Running: 4
 Paused: 0
 Stopped: 2
Images: 5715
Server Version: 1.12.3
Storage Driver: overlay
 Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: null overlay bridge host
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 4.8.8-gentoo
Operating System: Gentoo/Linux
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 30.87 GiB
Name: isengard
ID: UOCM:3F65:5FZC:6H5L:W3HY:34G4:A5XZ:SUOV:S2D4:XQTO:4KGA:6XSE
Docker Root Dir: /mnt/spare/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
 127.0.0.0/8

About this issue

  • Original URL
  • State: closed
  • Created 8 years ago
  • Reactions: 4
  • Comments: 20 (17 by maintainers)

Commits related to this issue

Most upvoted comments

Just for reference, https://github.com/docker/docker/issues/28587 was another issue discussing this, but I think at least the comment half of this change is a really serious regression (and is going to be a pretty severe breaking change for existing Dockerfiles in the wild), since there is no good workaround for adding comments in the middle of a really long RUN line (whereas an intentional empty line can easily get a \ added to workaround that change).

So, let’s go for A for 1.13 (see #29064) and work on C afterwars.

There’s a couple of issues at hand here;

I think the original design for RUN was that RUN is “agnostic” to its content, because it doesn’t know what shell is used to execute the RUN statement in (i.e., everything after RUN should be treated as an opaque blob).

While this should probably be “true”, the implementation does not match that; before the content is sent to the shell, the Dockerfile parser / builder (in no particular order here);

  1. Removes empty lines inside the RUN
  2. Skips “comment only” lines (including comments not starting at position 0)
  3. Strips line-continuation markers (\)

Because of these steps, a RUN statement does not have the same behavior as a regular shell (see this gist)

Basically, comment-only lines are handled by Docker, comments preceeded by other characters are passed on to the shell.

Note that skipping comments not starting at position 0 is basically a bug, because comments in Dockerfiles should only be seen as comment if they start at position 0.

PR https://github.com/docker/docker/pull/24725 was created to fix cases where, because of a combination of 1. and 3., a trailing \ resulted in the Dockerfile continuing to see furter instructions as part of the RUN statement;

FROM busybox
RUN echo "foo" \
#&& echo "bar";



# Do something else
RUN echo "something else"

Resulting in;

Step 1 : FROM busybox
 ---> 47bcc53f74dc
Step 2 : RUN echo "foo" RUN echo "something else"
 ---> Running in 991b76486899
foo RUN echo something else
 ---> d488a4c21b22

I think we should fix that case, but we can consider it a “legacy” burden we have to carry. Which leaves us with some options;

A. Revert all changes

Include both empty and comment lines as part of the RUN statement. This would be a valid Dockerfile;

FROM busybox
RUN echo "foo"; \


# This is a comment

    # So is this

    echo "and this is part of the same RUN"


RUN echo "this is the next RUN"

B. Skip comment-only lines, but break on empty lines

This is what’s being implemented in https://github.com/docker/docker/pull/29006. Users can use a \ on an empty line to add whitespace, but an empty line stops the RUN.

With this change, this would still be valid;

FROM busybox
RUN echo "foo"; \
\
# This is a comment
\
    # So is this
    \
    echo "and this is part of the same RUN"


RUN echo "this is the next RUN"

C. Skip comment-only lines, warn / deprecate continuing on empty lines

Like B., but instead of breaking, print a warning (“skipping empty line”?), Or a deprecation warning (“WARN: skipping empty line. Empty lines inside a RUN instruction is deprecated, and will be removed in Docker 1.16”)

@davidfischer-ch yes, it has been overlooked 😓 😞. We are gonna work on it for the 1.13.0 GA so it does not break people Dockerfile… (@yongtang already proposed a changed 👼)

It also affects me. So there is no deprecation policy? Great.

@duglin https://github.com/docker/docker/issues/29005#issuecomment-264177285 the other reason for RUN spanning multiple lines is because it would be annoying to prefix every shell command with RUN keyword. i like the heredoc idea for that https://github.com/docker/docker/issues/29005#issuecomment-264181417

Oh sorry, I misspoke – my preference is for C, but I’d be fine with A. I think B is a bit too aggressive on the breakage scale for 1.13 alone.

High-order question (IMO): If we were starting from scratch would we want it to act like the shell does? Meaning, stop when it hits a blank line AND when it hits a comment?

In a perfect world I’d say “yes”. Which would then lead towards us deprecating the new behavior and then removing it in 3 releases.

However, there’s a catch. Not only do we have a lot of Dockerfiles in the wild with this “odd” syntax (which by itself doesn’t necessarily mean we should allow it to continue), BUT given there’s no good way to prevent multiple layers from being created when you have multiple RUNs, which means people are kind of forced into using RUNs that span a lot of lines via continuation chars, there’s no good way to allow people to be good citizens and add comments/docs to their long RUN cmds.

Now, if we had a “squash just certain layers” type of operation in Dockerfiles then people wouldn’t need to create these huge RUN cmds. They could have multiple RUNs, with comments between them, and then squash them down afterwards. Normally, in the past, we tried to align Dockerfile processing with shell processing, but this point (to me) is why we may not be able to align with the shell in this case.

Net: I think we should add back in support for comments in multi-line cmds but don’t deprecate it until (or ‘if’) we add support for selective squashing.

Any way we could do some querying on the public automated builds to see at least in the public aspect what the impact of this breaking change is?