go: cmd/go: do not cache tool output if tools print to stdout/stderr
Update, Oct 7 2020: see https://github.com/golang/go/issues/27628#issuecomment-702252564 for most recent proposal in this issue.
What version of Go are you using (go version
)?
tip (2e5c32518ce6facc507862f4156d4e6ac776754f), also Go 1.11
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env
)?
darwin/amd64
What did you do?
$ go build -toolexec=/usr/bin/time hello.go
# command-line-arguments
0.01 real 0.00 user 0.00 sys
# command-line-arguments
0.12 real 0.11 user 0.02 sys
$ go build hello.go
# command-line-arguments
0.01 real 0.00 user 0.00 sys
# command-line-arguments
0.12 real 0.11 user 0.02 sys
$
What did you expect to see?
The second invocation of go build
doesn’t have -toolexec
, so it should not invoke the toolexec command (which I think it doesn’t), nor reprint its output.
What did you see instead?
toolexec output is reprinted.
In fact, I think it probably should not cache at all if -toolexec
is specified, since the external command that toolexec invokes may do anything, and (intentionally) not reproducible.
cc @dr2chase
About this issue
- Original URL
- State: open
- Created 6 years ago
- Reactions: 5
- Comments: 33 (31 by maintainers)
Thinking about my comment above a bit more, I think this issue should still be fixed by simply doing no caching when
-toolexec
is used. That would be strictly better than what we have right now.Once this issue is fixed, I can then file the “opt-in caching for
-toolexec
” idea as a separate proposal.I actually don’t think those are the right choices. There is a plausible use of
-toolexec
to invoke a debugger or other interactive program, in which case we should never cache if-toolexec
is used. And there is a plausible use of-toolexec
to invoke a logger or other build annotator, in which we should completely ignore-toolexec
and its argument for caching purposes. I can’t think of an important use case in which we want to cache based on whether we are using the same-toolexec
option as before.I would probably pick “never cache if
-toolexec
is used” but I don’t feel very strongly about it.Counter-proposal: any time a command prints to stdout or stderr, its results are not cached.
The most common time a tool prints to stdout/stderr are failures, and those are already not cached because of the non-zero exit status. The new rule would generalize that: any print cannot be cached.
At first this seems a bit surprising and more tied to tool behavior than you’d expect, but it turns out that it would remove a bunch of special case code that been introduced since the build cache was first added.
The original build cache didn’t save stdout/stderr - it just didn’t seem necessary. One unintended effect was that -toolexec=time did not pollute future build output with cached timing prints.
But then
go build -gcflags=-m errors
didn’t reprint the -m output on a second run (#22587). We fixed that in CL 77110 by caching and reprinting the stdout/stderr too.But then we still missed stdout/stderr for certain skipped compile+link steps (#23877). We fixed that in CL 128903 with even more special logic.
And now
go build -toolexec=time
prints the timing output without -toolexec, which is kind of the opposite of the-gcflags=-m
issue.(There were probably other issues I am forgetting.)
Instead of continuing to finesse the exact behavior of when stdout/stderr are cached or not, why not turn back the clock and re-fix #22587 by saying: “don’t cache build artifacts when a command prints anything”. Then #22587 is fixed, #23877 doesn’t happen, this issue doesn’t happen either, -toolexec stays independent of the cache decision, and -toolexec=toolstash stays fast.
And it would let us delete all the output-handling code in the build cache (the test cache is separate and would stay).
Thoughts?
No change in consensus, so accepted.
I vote for this.
Based on the discussion above, this seems like a likely accept.
I agree that this bug needs fixing one way or another. The only way I find
-toolexec
useful right now is like-a -toolexec=whatever
.I have some thoughts about caching, though. It’s unfortunate if we simply throw caching out the window, because it’s entirely reasonable for toolexec programs to be deterministic given their input.
The way
go build
handles caching properly forgo tool compile
seems to be to first ask itcompile -V=full
, and use that as part of the cache key. This means the compile operations will need re-doing if the compiler version changes, but not otherwise.Could we do the same for
-toolexec
as an opt-in? For example, if one doesgo build -toolexec=mytool
, the Go tool would run a well defined version flag likemytool -toolversion
, similar to the compiler’s-V=full
. If that fails, we do no caching at all with-toolexec=mytool
. If it succeeds, its output is added to the cache key, and caching takes place.I think this is the best of both worlds, because current non-deterministic toolexec programs would continue to use no caching at all, while other Go-specific complex tools could take advantage of the build cache.
We would have to define the bare minimum of information for
-toolversion
to report, though. For example, would a stringified https://golang.org/pkg/runtime/debug/#Module be enough, if available? Or perhaps a hash of the file contents in https://golang.org/pkg/os/#Executable?