go: cmd/go: do not download “modules” that contain no go.mod or *.go

At the moment, go mod download will happily download and extract any repository it can resolve (through a hard-coded hosting service such as github.com, or through a distinguished extension like .git), even if the repository contains nothing even marginally related to building Go code.

I am not aware of any reasonable use-case for such a repository:

  • It’s not useful for storing test data, because we currently provide no mechanism for the tests to actually locate that data. (Modules are not guaranteed to be loaded from the module cache — for example, they might be subject to a replace directive — and since the test itself is run within the directory containing its source code, it has no way to locate the data or run go list within the module that invoked it.)

  • It’s not useful for C headers (for use with cgo), for the same reason.

  • It might be useful for fetching non-Go inputs to go generate: in theory, the generator could run go mod download $MODULE to locate the sources at the required version. But the output of go generate is intended to be checked in anyway, which makes the use of modules somewhat spurious: if an explicit version of the non-Go inputs appears in the module’s requirements, then everyone using the generated package will have an extra module to fetch that is guaranteed to have no effect on the build, and in most cases the go generate program can just as easily git clone (or similar) the input data at a specific revision.
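As a rough illustration of the last point, a generator can pin its non-Go inputs to a revision without involving the module system at all. The sketch below is only one way to do that: the program name fetchinputs, the helper fetchSteps, and the argument order are hypothetical, and it assumes a git binary is available on PATH.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// fetchSteps returns the git invocations that clone repoURL into dir and
// then check out exactly rev. (Hypothetical helper for this sketch.)
func fetchSteps(repoURL, rev, dir string) [][]string {
	return [][]string{
		{"git", "clone", "--quiet", repoURL, dir},
		{"git", "-C", dir, "checkout", "--quiet", rev},
	}
}

func main() {
	if len(os.Args) != 4 {
		fmt.Fprintln(os.Stderr, "usage: fetchinputs <repo-url> <revision> <dir>")
		return
	}
	for _, args := range fetchSteps(os.Args[1], os.Args[2], os.Args[3]) {
		cmd := exec.Command(args[0], args[1:]...)
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Run(); err != nil {
			fmt.Fprintln(os.Stderr, "fetchinputs:", err)
			os.Exit(1)
		}
	}
}
```

A package could then invoke it from a go:generate directive that passes the repository URL, the pinned revision, and a scratch directory, so the input version is recorded in source rather than in the module's requirements.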

Furthermore, if someone did find a way to make modules without Go source code useful (for the above use-cases or others), it’s trivial to add a go.mod file to indicate that the repository really is somehow intended for use with Go source code. (We need to support go.mod-only modules anyway, since they can arise naturally when splitting a large root module into smaller nested modules.)


On the other hand, module proxies tend to rely on the go command to decide what is or is not a valid module, and accepting arbitrary non-Go repositories potentially exposes such proxies to a significant amount of additional load.


Therefore, I propose that we change the go command to explicitly reject any “module” that both contains no .go source files and lacks a go.mod file.
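To make the proposed rule concrete, here is a minimal sketch of the check, not the go command's actual implementation. The function name looksLikeGoModule and the input format (slash-separated paths relative to the module root) are assumptions made for illustration. It also assumes that only a go.mod at the root counts, since a nested go.mod would belong to a separate module.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// looksLikeGoModule reports whether a file list would pass the proposed
// check: it is accepted if it has a go.mod file at the module root or at
// least one .go source file anywhere in the tree.
// (Hypothetical helper; paths are slash-separated and root-relative.)
func looksLikeGoModule(files []string) bool {
	for _, f := range files {
		f = path.Clean(f)
		if f == "go.mod" {
			return true // explicit opt-in, even with no Go source
		}
		if strings.HasSuffix(path.Base(f), ".go") {
			return true // at least one Go source file
		}
	}
	return false
}

func main() {
	kernelTree := []string{"Makefile", "kernel/fork.c", "include/linux/sched.h"}
	goModOnly := []string{"go.mod", "README.md"}
	legacyGo := []string{"cmd/tool/main.go", "LICENSE"}

	fmt.Println(looksLikeGoModule(kernelTree)) // false: would be rejected
	fmt.Println(looksLikeGoModule(goModOnly))  // true: go.mod-only module
	fmt.Println(looksLikeGoModule(legacyGo))   // true: .go files, no go.mod
}
```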

CC @rsc @jayconrod @heschik @hyangah @katiehockman @thepudds @marwan-at-work @ianthehat

About this issue

  • State: open
  • Created 5 years ago
  • Reactions: 12
  • Comments: 19 (17 by maintainers)

Most upvoted comments

3.1 GB of data in go/pkg/mod

Got bitten by this today while trying to find out where my disk space had gone. The biggest offender seems to be the Linux kernel tree. Like the actual upstream Linux kernel, somehow.

No change in consensus, so accepted. 🎉 This issue now tracks the work of implementing the proposal. — rsc for the proposal review group

A couple of people from the Gentoo Foundation were asking some seemingly related questions in #51284, including https://github.com/golang/go/issues/51284#issuecomment-1334221407.

It might be worthwhile for someone from the core Go team to make a brief comment there, including in the context of this proposal here.

Quoting the proposal: “On the other hand, module proxies tend to rely on the go command to decide what is or is not a valid module, and accepting arbitrary non-Go repositories potentially exposes such proxies to a significant amount of additional load.”

As @seebs pointed out, unless there is some git or other VCS command that can check this cheaply without cloning, the module proxy has already downloaded the repository by the time go mod download concludes that it contains no Go code. (There is still the question of whether it is acceptable for the module proxy to stop serving such modules, but that is a separate issue.)

If this proposal is accepted and implemented, I think module proxies need a way to distinguish this mode of failure from other go command failures (network issues, etc.), so that they know they don’t need to retry downloading the requested versions in the future.
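One way a proxy could make that distinction, sketched here purely as an assumption about how such a signal might be surfaced: a sentinel error that callers test with errors.Is. The names ErrNotGoModule, errTimeout, and shouldRetry are hypothetical and do not correspond to anything the go command currently exposes.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotGoModule is a hypothetical sentinel for the permanent failure mode
// discussed above: the repository resolved, but it has no go.mod and no
// .go source files.
var ErrNotGoModule = errors.New("repository contains no go.mod file and no .go source files")

// errTimeout stands in for a transient failure such as a network error.
// (Hypothetical, for demonstration only.)
var errTimeout = errors.New("dial tcp: i/o timeout")

// shouldRetry reports whether a proxy should attempt the fetch again later.
// Permanent "not a Go module" failures are not retried.
func shouldRetry(err error) bool {
	return err != nil && !errors.Is(err, ErrNotGoModule)
}

func main() {
	permanent := fmt.Errorf("example.com/not/go@v1.0.0: %w", ErrNotGoModule)

	fmt.Println(shouldRetry(permanent))  // false: cache the negative result
	fmt.Println(shouldRetry(errTimeout)) // true: transient, try again later
}
```

Wrapping with %w keeps the module path and version in the message while still letting the proxy recognize the underlying cause.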