go: cmd/go: go list has too many (more than zero) side effects
<rant>
This is quite honestly becoming (literally) rage-inducing so I’ll keep it short…
Since the introduction of the package cache and modules, `go list` has gained some (IMO) nasty side-effects like downloading things from the internet and compiling (CGO) packages when all I did was ask it to print the import path of the current package.
Additionally, package querying shouldn’t result in updating any files. This coupled with the fact that `GOPATH/mod/...` is readonly means that if you’re in a package inside the mod cache and run `go list` it might fail because it can’t write to `go.mod` (why is it updating the file?!?!?!).
This script that creates a `go.mod` file with an extra empty line at the end demonstrates the latter issue:
$ bash -c 'cd $(mktemp -d) && echo "package app" > app.go && echo -e "module app\n" > go.mod; go list -mod=readonly'
go: updates to go.mod needed, disabled by -mod=readonly
To make matters worse, `go/build` suddenly started calling `go list`, so a simple operation like `.Import(...FindOnly)` that used to take no more than a couple of milliseconds now takes several seconds for no good reason… all because the `go` tool decided it was going to download things from the internet, compile things and god knows what else… all manner of surprises I didn’t ask for.
Usually I’d just code my way around it with the power of NIH, but the behavior of `go/build` and package lookups and querying in general is un(der)-documented and I don’t want to have to keep track of whatever new magic it ~~might~~ will gain in the future.
I doubt any of this is ever going to be fixed, so it’d be nice if these things were documented so I could answer questions like “given an import path, how do I go about finding it in `GOPATH`, `vendor`, `module cache`, `build cache`, etc.?” without having to rely on some broken black box.
About this issue
- State: closed
- Created 6 years ago
- Reactions: 33
- Comments: 48 (32 by maintainers)
Why should `go list` fix formatting or remove redundant declarations? Zero really does seem like the right number of side-effects for `go list`.
Chiming in from #30090 - I found it counter-intuitive that `gopls` - the language server - did network activity due to the underlying use of `go list`. In this case, while I was editing code, it resulted in me being prompted many times to unlock my SSH keyring, due to this git configuration which enables access to private repositories:
Unfortunately, entering my SSH password into the prompt was not sufficient, it still resulted in the secure password entry overlay being spammed repeatedly. Choosing to not unlock the keyring was likely the cause of other problems (#30090). So the user experience wasn’t great there.
The circumstances under which it was doing network activity in #30090 came as a bit of a surprise to me, because the code in question didn’t have any dependencies which weren’t already present in the module cache. It did eventually turn out that `go mod tidy` added to go.mod a dependency introduced through a test of a transitive dependency. Thereafter `gopls` doesn’t appear to need to access the network.
IMHO the only commands that should EVER attempt to download things from the internet or modify the go.mod or go.sum files are the `go mod ...` commands or `go get ...`.
If the command fails because of a missing package, great: print out that you need to run `go mod verify` or some such. Do not just blindly pull things from the network and naively assume (you know the joke here) that it is safe to change the version info specified in go.mod.
Also ALL `go ...` commands except `go mod ...` or `go get ...` need to respect the -mod=… flags. Currently you cannot set GOFLAGS=-mod=vendor because many go commands (notably `go list ...` and `go tool ...`) do not understand that flag, then they go off and start downloading crap and mucking with my go.mod file.
Please go back to the drawing board with go modules so we do not need to keep adding hacks (like https://github.com/kubernetes/kubernetes/blob/master/go.mod needing to `replace` every single dependency) and other workarounds for a broken by design system.
Speaking as someone who’s subscribed to this repo and sees every issue and comment, I must say this response epitomizes so well the tragedy that has become Go…
Sorry, you’ve gone full pedantic and lost me.
From a practical POV, the vendor/ directory is my way of saying “use THIS code, not some random crap you find on the internet” and ensuring that everyone in my developer community gets the same result. If that requires me to set “replace” directives in go.mod (which seems to be the case), then OK (though, IMO, that is frankly a bit silly).
We have not enabled modules yet (working on it). Switching from `go/build` to `go/packages` caused a 50x slowdown. I’ll open a new bug for that.
I get what you’re saying and also I don’t think you can mark anything as incompatible as a library. Last I checked only the top-level go.mod has the ability to exclude stuff.
It’s definitely true that the go modules system depends on people actually respecting semver.
Personally I don’t think it’s that big of a problem. I’ve thus far successfully lived with the old Go compatibility guarantee where I just vendor the master tip of every library and rarely have I had issues.
I think the new go modules system is a big upgrade over the old master tip system. Is it the be-all end-all version management system? Probably not, but it’s a start.
As far as reproducible builds go, they are achievable with the go modules system. If you have a fully defined go.mod and things work - then things will continue working. No new dependency update is ever going to creep into your project at a surprising moment. You can run go build 5 years down the line and it will work the same way as it did yesterday. This is a very strong property of the go modules system. You get to control when you want to deal with the drama of updating dependencies and figuring out which library failed to follow semver.
@balasanjay fine, add `go build` to the list of commands that will pull dependencies down. It should however NEVER modify the version of dependencies listed in go.mod; only the `go get ..` or `go mod ...` commands should do that.
I cannot count the number of times running `go build` and forgetting the `-mod=vendor` flag has broken my build because it decided it wanted to update to a broken version of some lib I use. The go module system was supposed to make it so I do not need to vendor and check in my code, yet I still have to do that and I still have to ensure I am not calling things like `go list`, `go fmt`, etc. in my CI pipeline since that will suddenly pull in and upgrade versions of dependencies even though that is clearly not what I intended by calling those commands.
With the QT project I’m working on, due to the VSCode Go extension using `go list`, when I fire up the editor lots of cgo/gcc processes are started in the background. QT being a massive project, the compilation takes a lot of time and the computer resources are depleted completely. It comes to the point where I can’t even move the mouse anymore. So just to let everyone know, `go list` became a DoS attack.
Edit: Here is what I’m talking about
@myitcv
We are prepping to move Kubernetes to modules. I saw this thread and thought I would get a jump on one of the build tools, and follow your guidance.
I tried a very simple conversion of https://github.com/kubernetes/kubernetes/tree/master/hack/make-rules/helpers/go2make to use `go/packages` and it changed from taking 900ms with `go/build` to taking > 41s with `go/packages` for a single cmd/ dir. All of the time is under `packages.Load()`.
We vendor EVERYTHING so there should be no need for any sort of network traffic at all.
Am I misunderstanding something? I thought the point of vendoring was to avoid any need for network callouts and to make builds totally hermetic, reproducible, and offline-safe.
@DisposaBoy previously, multiple calls to `go/build` were effectively zero-cost. In the new world, these are replaced by a single call to `go/packages.Load`.
If you continue to use `go/build`, in certain usage patterns you will end up making multiple calls to `go list`, which, even for relatively small projects, can become costly.
`go/packages` has come into existence to provide an abstraction layer atop various drivers. There is a driver for the `go` command, just as there is for build systems like Bazel, Blaze and others. All efforts for optimisation are therefore directed via `go/packages`.
So the first step is moving away from `go/build` to `go/packages` (which can do everything from simply resolving package patterns to loading full type and syntax information; see https://godoc.org/golang.org/x/tools/go/packages#LoadMode).
If you are still seeing issues after moving to `go/packages`, then we can certainly help to diagnose further. There are a number of things you might be running into, but narrowing this down to a single `go/packages.Load` call will help.
Issues that spring to mind include the aforementioned https://github.com/golang/go/issues/29427, https://github.com/golang/go/issues/28739. The latter is hopefully going to be addressed by an upcoming CL that works by caching directory/file operations where possible.
@balasanjay the `-mod` flags cannot be set in `GOFLAGS` since most of the go commands do not actually work with it even though they still modify the go.mod file.
Currently `go build` is not idempotent in any way shape or form. It relies 100% on the libraries you are using to follow its definition of semver and assumes that any potentially breaking change properly does a major version bump, which is complicated for libraries of libraries that your project also uses.
Say your project used libA(1.2.3) & libB(1.10.2) and libA also uses libB(1.10.2).
Monday libB updated to version 2.0.0 from 1.10.2, you are all good since the module system will not auto update to a new major version.
Tuesday libA updates to use version 2.0.0 of libB and bumps their version to 1.2.4 since this is just a performance improvement from their perspective. None of their public interfaces or APIs changes in any way.
Well now your build is broken because the module system will auto update to libA(1.2.4) which will pull in libB(2.0.0) which is incompatible with your own use of libB.
This scenario gets worse when teams do not understand they released a breaking change or are still releasing a v0.x.x library that is very widely used or do not use semver at all. Implicit modification of dependencies is never the right choice, it should always be an explicit decision on the developers part to update or modify their dependencies.
I believe that the aforementioned issues address all of the actionable problems reported in this issue thread, but it’s been a long thread so it’s possible that I’ve missed one.
We try to keep open issues in the Go issue tracker actionable and focused on a single problem or decision, and apart from the above issues this thread is not, so I’m going to close it out.
Please do continue to file new issues for specific use-cases where the behavior of `go list` poses a problem.
Caveat: I have no experience with these issues personally (my personal projects are small enough to not have CI, and my professional projects use a different build system entirely).
That said, from reading rsc’s articles, the impression I got is that `go build` will make any modifications necessary to ensure that the next invocation of `go build` will yield the same result.
In other words, it will not “upgrade” your dependencies willy nilly, but if your dependencies are insufficiently specified such that some change by someone else could change the meaning of your build, then it will change your `go.mod` file such that it completely describes your build. Idempotence of build commands seems like an important property.
I think if you’d rather never let it touch your `go.mod`, then the incantation is `-mod=readonly`, which btw can be put in the GOFLAGS environment variable if you’d like that to be the default behaviour.
@thockin
No, the main point of vendoring is to distribute proprietary code. (The root “vend” is right there in the name!) If you want to make builds hermetic, reproducible, and offline-safe, all you need in module mode is a `go.mod` file with a complete set of dependencies (that is, one for which `go mod tidy` is a no-op) and a module cache or proxy containing the relevant versions of those dependencies.
That said, #30240 would indeed avoid the need for most (but perhaps not all?) network access for fully-vendored code.
So I could certainly buy the argument that `go list` shouldn’t make cosmetic modifications, and perhaps it should not report an error if it failed to write updates (particularly to the `go.sum` file, since that doesn’t affect reproducibility), but I don’t at all buy the argument that it should not make any modifications at all.
@xStrom if you start using another bit of code that requires at least 1.2.4 of course there will be an auto update. go.mod versions are not a lockfile, they are the minimal state the module engine will try to achieve. That’s a try not a hard promise.
There is no “auto update” to libA(1.2.4). What are you talking about? If your go.mod contains libA(1.2.3) then it will stay at that version. You have to manually update via `go get -u` to get libA(1.2.4).
On top of that, libB(2.0.0) must be able to run in parallel with libB(1.10.2). Most libraries have this property by default. The exceptions are libraries that have some sort of global state, like listening to a specific port. This is a known pitfall and there are solutions against it. Basically the library authors have to respect the Go module system rules, among which is that major versions have to be able to run at the same time.
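The disagreement over the libA/libB scenario comes down to how versions are selected. Below is a toy sketch of minimal version selection as described in the vgo articles, assuming plain `major.minor.patch` versions and treating `libB/v2` as a separate module path from `libB`. The module names and the `buildList` helper are hypothetical, and this is not the go command's implementation.

```go
package main

import "fmt"

// version is a simplified major.minor.patch triple (no pre-release tags).
type version [3]int

// less reports whether v sorts before w.
func (v version) less(w version) bool {
	for i := range v {
		if v[i] != w[i] {
			return v[i] < w[i]
		}
	}
	return false
}

// module identifies one version of one module path.
type module struct {
	path string
	ver  version
}

// buildList walks every module version reachable from the main module's
// requirements and keeps, per module path, the highest version that
// something reachable asks for. Versions nobody requires are never chosen.
func buildList(mainReqs []module, reqs map[module][]module) map[string]version {
	selected := map[string]version{}
	seen := map[module]bool{}
	var visit func(m module)
	visit = func(m module) {
		if seen[m] {
			return
		}
		seen[m] = true
		if cur, ok := selected[m.path]; !ok || cur.less(m.ver) {
			selected[m.path] = m.ver
		}
		for _, r := range reqs[m] {
			visit(r)
		}
	}
	for _, m := range mainReqs {
		visit(m)
	}
	return selected
}

func main() {
	// What each published module version itself requires (made-up data).
	reqs := map[module][]module{
		{"libA", version{1, 2, 3}}: {{"libB", version{1, 10, 2}}},
		{"libA", version{1, 2, 4}}: {{"libB/v2", version{2, 0, 0}}},
	}

	// go.mod still names libA 1.2.3: the newer libA 1.2.4 is not reachable.
	fmt.Println(buildList([]module{{"libA", version{1, 2, 3}}, {"libB", version{1, 10, 2}}}, reqs))

	// After an explicit bump to libA 1.2.4, libB and libB/v2 are both selected.
	fmt.Println(buildList([]module{{"libA", version{1, 2, 4}}, {"libB", version{1, 10, 2}}}, reqs))
}
```

Under these assumptions, a newly published libA 1.2.4 changes nothing until some requirement in the graph names it, and once it does, the two major versions of libB sit side by side under different paths.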
@crvv
Both `go vet` and `go list` answer questions about your dependencies (in the same way that `go build` builds your dependencies), so rsc’s logic clearly applies there as well. As for the other two, I have no idea, but this bug doesn’t appear to be about those other two.
And my question to @thockin was not related to code downloading, I was trying to understand k8s’ use of vendoring.
@djgilcrease In https://research.swtch.com/vgo-cmd, @rsc said that the following sequence of commands represents a suboptimal developer experience:
So it seems your proposal was considered, and explicitly rejected. (rsc’s example also included a git clone, but the core of his argument doesn’t seem specific to that)
`go list -mod=readonly -e` could perhaps do that.
@josharian
Perhaps it shouldn’t, but that still doesn’t lead to “zero side effects”.
For example, we would like `go list all` to be idempotent and fast: if you run it twice, to the extent possible you should get exactly the same results, and any expensive operations (such as network lookups) from the first run should not be repeated for the second run.
If `go list` is not allowed to modify the `go.mod` file at all, we either lose idempotence, or we lose the property that you can (in general) edit code in module mode in the steady state without needing to explicitly modify your module definition.
For example: suppose that you add an import of `golang.org/x/oauth2` in your program. You run `go list all`, and it resolves some set of transitive dependencies via `oauth2`, including `golang.org/x/net`, but since `oauth2` doesn’t currently have a `go.mod` file, you get whatever version of `golang.org/x/net` happens to be `latest` at the moment, and `go list all` includes the packages contained in that version.
If `go list` doesn’t update the `go.mod` file, then the next run will need to re-resolve the `latest` version (incurring another network fetch), and if any packages were added in the interim that will change the output of `go list`: we would lose both speed and fidelity.
In contrast, if `go list` does update the `go.mod` file, then the next run will not only produce the same output, but will also avoid the network operation (since the active version of `golang.org/x/net` is now cached locally).
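The trade-off described above can be illustrated with a toy model. In this sketch the `resolver` type, its fields, and the version strings are all hypothetical: `pinned` stands in for the requirements recorded in `go.mod`, and `latest` stands in for a network lookup. It is not how the go command is implemented, only a way to see why writing the resolved version back makes the second run both cheaper and stable.

```go
package main

import "fmt"

// resolver is a toy stand-in for module resolution.
type resolver struct {
	pinned  map[string]string        // plays the role of go.mod requirements
	latest  func(path string) string // plays the role of a network lookup
	fetches int                      // how many "network" lookups happened
}

// resolve returns the version to use for path. When writeBack is true the
// resolved version is recorded, so later calls skip the lookup entirely.
func (r *resolver) resolve(path string, writeBack bool) string {
	if v, ok := r.pinned[path]; ok {
		return v
	}
	r.fetches++
	v := r.latest(path)
	if writeBack {
		r.pinned[path] = v
	}
	return v
}

func main() {
	for _, writeBack := range []bool{true, false} {
		upstream := "v0.1.0" // hypothetical "latest" at the time of the first run
		r := &resolver{
			pinned: map[string]string{},
			latest: func(string) string { return upstream },
		}
		first := r.resolve("golang.org/x/net", writeBack)
		upstream = "v0.2.0" // someone publishes a new version between runs
		second := r.resolve("golang.org/x/net", writeBack)
		fmt.Printf("writeBack=%v first=%s second=%s fetches=%d\n",
			writeBack, first, second, r.fetches)
	}
}
```

With write-back the two runs agree and only one lookup happens; without it the second run pays for another lookup and may see a different version.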