moby: ARG before FROM in Dockerfile doesn't behave as expected
Description
It’s documented that ARG can appear before FROM, so that arguments may be substituted into image names etc.
Rather than having some ARG before and some ARG after FROM, for consistency I attempted to place all my ARG before FROM. However, to my surprise (after a lot of debugging) I determined that my arguments are always blank after FROM.
I believe the meta-arg functionality/refactoring may somehow be responsible:
https://github.com/moby/moby/commit/239c53bf836174108dbae445a394a290f5fe2898
Steps to reproduce the issue:
- Produce a Dockerfile such as:
ARG environment
FROM alpine:3.5
ENV ENVIRONMENT=${environment:-development}
RUN echo "$ENVIRONMENT" > /value_of_environment
- Build the image and run it, printing the value of the environment ARG (stored in /value_of_environment):
docker run $(docker build -q --build-arg environment=production .) cat /value_of_environment
Describe the results you received:
development
Describe the results you expected:
production
Additional information you deem important (e.g. issue happens only occasionally):
Altering the Dockerfile such that ARG comes after FROM, i.e.
FROM alpine:3.5
ARG environment
ENV ENVIRONMENT=${environment:-development}
RUN echo "$ENVIRONMENT" > /value_of_environment
then running again:
docker run $(docker build -q --build-arg environment=production .) cat /value_of_environment
gives the expected output of production.
Output of docker version:
Client:
Version: 17.06.0-ce
API version: 1.30
Go version: go1.8.3
Git commit: 02c1d87
Built: Fri Jun 23 21:31:53 2017
OS/Arch: darwin/amd64
Server:
Version: 17.06.0-ce
API version: 1.30 (minimum version 1.12)
Go version: go1.8.3
Git commit: 02c1d87
Built: Fri Jun 23 21:51:55 2017
OS/Arch: linux/amd64
Experimental: true
Output of docker info:
Containers: 59
Running: 0
Paused: 0
Stopped: 59
Images: 370
Server Version: 17.06.0-ce
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 457
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: cfb82a876ecc11b5ca0977d1733adbe58599088a
runc version: 2d41c047c83e09a6d61d464906feb2a2f3c52aa4
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 4.9.31-moby
Operating System: Alpine Linux v3.5
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 5.818GiB
Name: moby
ID: BCV5:MEMK:BYKI:I2IU:QY2V:5DRM:F2FP:JFAG:SM46:M2WJ:73YV:3KLP
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
File Descriptors: 20
Goroutines: 40
System Time: 2017-07-16T19:58:09.054157098Z
EventsListeners: 1
No Proxy: *.local, 169.254/16
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
About this issue
- State: closed
- Created 7 years ago
- Reactions: 40
- Comments: 31 (11 by maintainers)
Commits related to this issue
- build: define argument twice to get around docker arg issue The ARG Dockerfile keyword has a long history of not behaving as expected. See this thread for more detail: https://github.com/moby/moby/is... — committed to politics-rewired/Spoke by bchrobot 4 years ago
- nice bug in docker: https://github.com/moby/moby/issues/34129 :( — committed to solosTec/segw-build by mseemann 4 years ago
- OMG - ARG are reset after FROM See https://github.com/moby/moby/issues/34129 So many hours wasted... :( — committed to pmatos/jsc32-fuzz-setup by deleted user 3 years ago
- Dockerfile: move ARGs below FROM — committed to OCR-D/ocrd_all by bertsky 2 years ago
Irrespective of whether this was implemented this way intentionally or it’s a bug, I think it’s a bit of a usability nightmare.
It’s not clearly documented that this is the expected behaviour, and it makes for a messy Dockerfile. But more importantly, it opens a Pandora’s box of confusing edge-cases.
What if I intend to use an ARG in both my FROM statement and after it? Am I expected to have multiple ARG statements referring to the same build-arg?
What happens if I use default value syntax ARG argument=some_value before FROM and just ARG argument after FROM? What is the expected value of argument after FROM if no argument build-arg was passed?
Improved documentation is always appreciated, and would have saved me some time. However, just because behaviour is documented doesn’t preclude the behaviour itself from scrutiny.
ARG has too much complexity to it. I’d argue this functionality shouldn’t have been added to the ARG keyword in the first place; it’s effectively been repurposed and its behaviour is now far too nuanced. A new keyword FROMARG from the outset would have made a lot more sense.
If you want to use the same ARG before and after FROM, simply re-declare it after, e.g.:
As a new user of ARG it was very unintuitive why my ARG was empty. I saw someone use an example of ARG in a Dockerfile, but they were using it in the FROM line. For me it makes sense to define any parameterisation of a Dockerfile at the very top, so I didn’t question it. Only upon rereading the docs after reading this issue do I understand why.
I would suggest a warning that ARG gets reset after FROM in the documentation, as not everyone is up to speed on multistage builds.
I lost a couple of hours to this. Intuitively I was expecting that ARG before FROM in a multistage build would be a global ARG (for all stages). It simply gets cleared instead.
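The re-declaration workaround suggested a few comments above, applied to the Dockerfile from the issue description, might look like the following. This is a minimal sketch, not the example from the original comment (which is not preserved here); it assumes that a value passed with --build-arg reaches every ARG declaration of that name.

```dockerfile
# Declared before FROM: only usable in FROM lines.
ARG environment
FROM alpine:3.5

# Re-declared inside the build stage so that the instructions
# below can see the value passed via --build-arg.
ARG environment
ENV ENVIRONMENT=${environment:-development}
RUN echo "$ENVIRONMENT" > /value_of_environment
```

Built with --build-arg environment=production, this variant should print production, the same as the ARG-after-FROM version in the issue description.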
To be clear, I’m not saying I don’t understand how the current implementation works, what has been written in this issue explains it clearly enough. I’m suggesting the implementation itself is non-ideal and confusing; after all, I read the existing docs and literally cloned Docker compose, Docker client and finally Docker before working out what was going on - at which point I opened this issue.
It’s just too complicated. Adding so much complexity to the Dockerfile syntax and the corresponding documentation is simply not sustainable.
I don’t think this is necessarily 100% accurate that multi-stage and ARG in FROM are independent, they should have been independent, but I think the existence of multi-stage impacted the implementation of ARG in FROM.
The properties of ARG were:
1. It may appear after FROM.
2. The argument defined by ARG may be used on any line following the definition.
(2. is the way Dockerfiles always worked, sequential, state is additive, never subtractive).
A feature request comes along:
Reasonable enough, the two previously defined properties still hold if implemented. We now have a third property:
3. ARG may appear before FROM.
This can cleanly be implemented, without any backwards compatibility issues. Except, it wasn’t; it could have been, but it wasn’t.
Instead, property 2. was violated; suddenly ARG can’t always be used after it’s defined. If it appears before FROM, then it can only be used in FROM, not on all subsequent lines.
That’s changing the semantics of ARG, hence why I’m suggesting it should have been FROMARG, a keyword that can only appear in the “meta section” prior to FROM.
Mind you, this constraint is artificial in nature, there’s zero reason 3. shouldn’t have been implemented cleanly. The only reason the current implementation was deemed acceptable is because multi-stage builds were also coming, and it was also violating 2., albeit in a (roughly) well-defined fashion.
Anyway, my issue is complexity; that’s subjective and given I’m not a maintainer, not for me to decide. Documentation is certainly better than nothing, so this issue may be closed if you see fit.
This is an oversimplification. You are not considering default values and the programming rule of one single source of truth.
We now have the arg’s default value defined twice in one file - we have lost the single source of truth.
I ran into the same issue and, in order to underline the impact of that behaviour, I want to share my example here, whose cause took a significant amount of time to figure out. Still, it’s totally unexpected and I wouldn’t exactly call that good user experience. Please, if you don’t see the necessity to change that behaviour, then at least document it as the creator of this issue suggested, so that people can stumble upon this.
Does not work as expected. NPM_VERSION holds "latest".
Works as intended. NPM_VERSION holds "4.5.0".
@thaJeztah I know that’s true now, I’ve experimented with it. The issue is that it’s hugely non-obvious.
If this is expected behaviour and no-one is willing to change it, then at the very least ARG ought to be deprecated (before FROM) and instead, when used prior to FROM, the syntax should be FROMARG (which must come before FROM).
The example given actually takes care of default values;
I also posted some examples in https://github.com/moby/moby/issues/37622#issuecomment-412101935, https://github.com/moby/moby/issues/37345#issuecomment-400245466
@thaJeztah correct me if I’m wrong.
@Benjamin-Dobell after investigating this, https://github.com/moby/moby/commit/239c53bf836174108dbae445a394a290f5fe2898 is not the origin of this behavior.
Basically, after the FROM instruction all the build arguments are reset and thus aren’t available in the Dockerfile.
From what I found the purpose of ARG before FROM is to use it inside the FROM instruction https://github.com/moby/moby/pull/31352
I should note, that I’m not actually an advocate of expanding the grammar when the usage of the existing grammar can be expanded.
However, in this particular instance ARG has had its existing semantics altered; the behaviour is not additive. Previously whenever you referenced an ARG defined argument you’d have access to the value as expected. Now argument interpolation is much more context aware.
It’s extremely confusing in single stage builds, and perhaps more-so in multi-stage ones. If arguments really are tied to build stages (although I must confess I’m not sure why this is desirable), then you’ve suddenly a need to look at the previous “stage”, beyond the FROM verb.
Realistically, you can’t pass different arguments to different build stages (they’re typically provided as CLI arguments). So there’s no legitimate reason to scope arguments to build stages. Additionally:
So there is zero incentive to intersperse ARG definitions throughout a file. Therefore, the most logical behaviour would be to encourage all ARG definitions to be placed at the top of a file (where they can clearly be seen) and then update the behaviour to ensure there’s no funny business with build stages.
This is a horrible way to deal with something that seems to be a global value. I have a dockerfile with multiple FROM statements and things are breaking because I can’t pass the arg values as I originally thought. Sure, maybe I should read the documentation a bit more but it seems I am not alone in expecting this behaviour (ARG being global) so maybe things should work as the MAJORITY think it should?
I have a reverse twist on this. I remembered from the docs that ARG had to appear before FROM in order to be used in FROM, so I put an ARG before the FROM of my second builder declaration. And got an invalid-format error on the FROM line, because that ARG appeared after the first FROM in the file, and so was ignored when processing the second FROM line. So ARG-before-the-first-FROM is global for all FROM lines and not used in any other lines, while ARG-after-FROM is used only between that FROM and the next FROM. It is consistent in a way, but completely non-intuitive, so really the ARG-before-FROM ought to be named FROMARG as suggested earlier in this thread, because otherwise it just breaks expectations left and right.
As far as this keyword behaves with multiple FROM statements, in “multi-stage” builds, ARG lets you specify different defaults for different stages, but there is no way (nor should there be) to pass different values explicitly to different stages. That’s far more convoluted than having ARGs go into effect from the keyword down, across any number of stages/FROMs.
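The scoping described in the two paragraphs above can be sketched as a multi-stage Dockerfile. All image and argument names here are hypothetical; the sketch only illustrates where each ARG is visible, based on the behaviour reported in this thread.

```dockerfile
# Declared before the first FROM: visible to every FROM line below,
# but not inside a stage unless re-declared there.
ARG BUILD_IMAGE=golang:1.8
ARG RUNTIME_IMAGE=alpine:3.5

FROM ${BUILD_IMAGE} AS builder
# Declared inside the first stage: visible only until the next FROM.
# It is not available to later FROM lines.
ARG mode=debug
RUN echo "build stage mode: ${mode}" > /build_info

FROM ${RUNTIME_IMAGE}
# A separate declaration for this stage, with its own default.
# A single --build-arg mode=... applies to both declarations.
ARG mode=release
COPY --from=builder /build_info /build_info
RUN echo "final stage mode: ${mode}" >> /build_info
```

Moving the RUNTIME_IMAGE declaration down into the builder stage would reproduce the "reverse twist": the second FROM line would no longer see it.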
Yikes! That also needs documenting… and changing.
When looking at a Dockerfile, what syntax marks the beginning of a new build stage? FROM does, and yet, somehow it accesses ARG defined prior to this line.
@darrahts see https://docs.docker.com/engine/reference/builder/#understand-how-arg-and-from-interact
https://docs.docker.com/engine/reference/builder/#understand-how-arg-and-from-interact https://docs.docker.com/engine/reference/builder/#scope
If this is a common pattern a PR would probably be accepted that detects this case (at least for variable substitution) and shows a warning about possible misuse.
@Benjamin-Dobell I wanted to use build-args in multistage builds to pass secure keys to intermediate build stages which would then disappear. I haven’t completely got confirmation that this is secure, but I was actually happy to see your issue.
For the record, aside from implementation details which respondents seem to be burdening you with, clearing build args – at least so they can’t be read from the build history – seems IMO to be a very important feature… well worth the complexity.
UPDATE – sigh … I guess I spoke prematurely. Multistage builds don’t help with the fact that args are written to build history.
The new ARG features are 100% backward compatible. No previous Dockerfile needs any changes.
It’s the opposite. Build args are defined by stage so you only need to look at the args for the current stage. Whatever you define in other stages has no effect on the current stage.
All args are used in every RUN command. If an argument changes, it breaks all cache from the very first time RUN is used.
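A hedged sketch of the caching point, assuming the behaviour described in the Docker documentation on ARG and build caching: a changed build argument causes a cache miss at the first RUN after its ARG line, so declaring the ARG as late as possible keeps earlier layers cached. The installed package is an illustrative assumption.

```dockerfile
FROM alpine:3.5

# No ARG is in scope yet, so this layer can stay cached when the
# build argument below changes between builds.
RUN apk add --no-cache curl

# From here on, every RUN implicitly sees `environment` as an
# environment variable; changing its value invalidates the cache
# starting at the next RUN instruction.
ARG environment
RUN echo "${environment:-development}" > /value_of_environment
```

Declaring every ARG at the top of a stage is tidier, but under this assumption it also means any change to a build argument rebuilds every RUN layer in that stage.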