bazel: Downloading succeeds with a valid (cached?) hash but invalid URL
Description of the problem / feature request:
I’m noticing that download requests on bazel 0.12.0 (and 0.13.0) are succeeding with broken URLs, at least for `java_import_external`. These are correctly failing in 0.11.0.
Bugs: what’s the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
git clone https://github.com/google/bazel-common
bazel build third_party/java/truth
Switch this line to artifact = "com.google.truth:truth:0.blahblah",
bazel build third_party/java/truth
Note that this succeeds, even seemingly after `bazel clean --expunge`. If you inspect `@com_google_truth_truth//:com_google_truth_truth`'s jar (i.e. `bazel-bazel-common/external/com_google_truth_truth/truth-0.blahblah.jar`), the maven information in that jar will still say version `0.39` (the original truth version).
Is it possible that the new caching feature is not working correctly?
I think this may be related to the fact that I haven’t changed the `sha256`. If I do that, I get the right error (either that the mirrors are down if the version is `0.blahblah`, or that the checksum is incorrect if it’s set to a real version, like `0.40`). Is there a lookup in some cache based on hash that ignores the URL entirely?
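The behaviour suspected here can be reproduced with a toy model. The sketch below is not Bazel's code: `HashOnlyCache`, `failing_download`, and the example URL are hypothetical names used only to show how a cache keyed purely by sha256 would return the old jar for a bogus URL.

```python
import hashlib


class HashOnlyCache:
    """Toy content-addressed cache keyed only by the expected sha256."""

    def __init__(self):
        self._blobs = {}  # sha256 hex digest -> file content

    def put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def fetch(self, url, expected_sha256, download):
        # Cache hit: the URL is never consulted, so a broken URL goes unnoticed.
        if expected_sha256 in self._blobs:
            return self._blobs[expected_sha256]
        data = download(url)
        if hashlib.sha256(data).hexdigest() != expected_sha256:
            raise ValueError("checksum mismatch for " + url)
        self._blobs[expected_sha256] = data
        return data


def failing_download(url):
    # Stands in for a mirror that does not have the requested artifact.
    raise OSError("cannot download " + url)


cache = HashOnlyCache()
sha = cache.put(b"contents of truth-0.39.jar")

# Bogus version in the URL, unchanged hash: the old content comes back.
jar = cache.fetch("https://repo.example.invalid/truth-0.blahblah.jar", sha, failing_download)
```

In this model, changing the hash as well forces a real download attempt, which matches the observation that the right error only appears once the `sha256` is edited.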
What operating system are you running Bazel on?
Linux
What’s the output of `bazel info release`?
release 0.13.0
About this issue
- State: closed
- Created 6 years ago
- Comments: 25 (22 by maintainers)
Commits related to this issue
- Use artifact id and sha1 for cache key type for maven_jar artifacts Previously the RepositoryCache was unable to provide any identification of cached entries except for the sha1 value itself, which m... — committed to jessehut/bazel by jhutton 6 years ago
- RepositoryCache: support "canonical id" Make the repository cache support the concept of a canonical id, i.e., files are not considered a cache, just because they have the correct content, if a diffe... — committed to bazelbuild/bazel by aehlig 5 years ago
- RepositoryCache: support "canonical id" Make the repository cache support the concept of a canonical id, i.e., files are not considered a cache, just because they have the correct content, if a diffe... — committed to irengrig/bazel by aehlig 5 years ago
- Implement artificial cache sepration To avoid repeated downloads, bazel has a cache of files, indexed by their (sha256) hash. If a file is requested with a hash already in cache, the file is taken fr... — committed to irengrig/bazel by aehlig 5 years ago
- Add `canonicalId` to the `Downloader` interface. When Bazel downloads an external file (via `ctx.download()` or similar, it supports the concept of a "canonical ID". This ID is used to disambiguate d... — committed to jmillikin-stripe/bazel by jmillikin-stripe 4 years ago
- Use the Maven artifact ID as the canonical_id in jvm_maven_import_external. This affects how downloaded Maven artifacts are looked up from the cache. Before this, cache lookups would only be based on... — committed to bazelbuild/bazel by a-googler 4 years ago
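The canonical id idea named in the commits above can be pictured with a small Python model. This is only a sketch and not Bazel's `RepositoryCache`: the class, its method names, and the rule that a request without an id accepts any hit are assumptions made for illustration.

```python
import hashlib


class CanonicalIdCache:
    """Toy cache: content is keyed by sha256, hits are gated by canonical id."""

    def __init__(self):
        self._blobs = {}  # sha256 hex digest -> file content
        self._ids = {}    # sha256 hex digest -> canonical ids stored with it

    def put(self, data, canonical_id=None):
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        ids = self._ids.setdefault(digest, set())
        if canonical_id is not None:
            ids.add(canonical_id)
        return digest

    def get(self, expected_sha256, canonical_id=None):
        data = self._blobs.get(expected_sha256)
        if data is None:
            return None
        # Assumption: correct content alone is not a hit when an id is
        # requested and the entry was never stored under that id.
        if canonical_id is not None and canonical_id not in self._ids[expected_sha256]:
            return None
        return data


cache = CanonicalIdCache()
old = "com.google.guava:guava:18.0"
new = "com.google.guava:guava:19.0"
sha = cache.put(b"contents of guava-18.0.jar", canonical_id=old)

hit = cache.get(sha, canonical_id=old)   # same artifact id: served from cache
miss = cache.get(sha, canonical_id=new)  # bumped artifact id, stale hash: no hit
```

With the artifact coordinates as the id, bumping the version without updating the hash becomes a cache miss, so the download is attempted and the checksum mismatch surfaces.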
For anybody getting here, this issue is solved with the addition of the `canonical_id` parameter.
Why not do both? Bazel could make a file named sha256(downloadUrl + sha256(fileContent)), which is a symlink to the current sha256(fileContent) filename. That would mean Bazel would have to download the file again when the URL changes, but if the content stays the same it doesn’t need to update it. It would also work for the non-Maven case.
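A rough Python sketch of this proposal, with a dictionary standing in for the symlinks. Everything here (`AliasedCache`, `alias_key`, the example URLs, the fake downloader) is hypothetical and only illustrates the suggested key scheme, not anything Bazel implements.

```python
import hashlib


def content_key(data):
    return hashlib.sha256(data).hexdigest()


def alias_key(url, content_digest):
    # sha256(downloadUrl + sha256(fileContent)), as suggested in the comment.
    return hashlib.sha256((url + content_digest).encode("utf-8")).hexdigest()


class AliasedCache:
    def __init__(self):
        self.blobs = {}    # content digest -> file content
        self.aliases = {}  # alias key -> content digest (plays the symlink role)

    def fetch(self, url, expected_sha256, download):
        key = alias_key(url, expected_sha256)
        if key in self.aliases:
            return self.blobs[self.aliases[key]]
        # New URL (or first use): download again and verify the content.
        data = download(url)
        digest = content_key(data)
        if digest != expected_sha256:
            raise ValueError("checksum mismatch for " + url)
        self.blobs[digest] = data  # identical content is stored only once
        self.aliases[key] = digest
        return data


requested = []  # records every URL that was actually downloaded


def fake_download(url):
    requested.append(url)
    return b"contents of guava-18.0.jar"


cache = AliasedCache()
sha = content_key(b"contents of guava-18.0.jar")
cache.fetch("https://mirror-a.example.invalid/guava-18.0.jar", sha, fake_download)
cache.fetch("https://mirror-a.example.invalid/guava-18.0.jar", sha, fake_download)
cache.fetch("https://mirror-b.example.invalid/guava-18.0.jar", sha, fake_download)
```

The second call is served through the alias, while the third downloads again because the URL changed. Only one copy of the content is kept, which is the trade-off the comment describes.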
That seems like a bug
The situation is even worse if the artifact is changed to another valid value, for example when the version is bumped from “com.google.guava:guava:18.0” to “com.google.guava:guava:19.0” without changing the sha1. The user may think that the newer version of the artifact was downloaded and all is fine, but actually nothing happened except that the old artifact was retrieved from the cache and the file was renamed to “guava-19.0.jar”, which still has the content of guava 18.0.
Enabling the repository cache by default in recent Bazel versions should probably be reconsidered until this behaviour is fixed.