bazel: Some `DexShardsToMerge`/`DexMerger` outputs do not get downloaded when using BwtB

Description of the bug:

When using Bazel 6.0 or later to build an android_binary (with D8 and without defining dex_shards) against a remote cache with --remote_download_toplevel set, the build usually fails on the MergeDexZips action with errors like:

ERROR: ... Merging dex shards for //<foo>/app:app failed: (Exit 1): singlejar_local failed: ... No such file or directory
singlejar_local: src/tools/singlejar/input_jar.cc:23: Cannot open input jar bazel-out/darwin_arm64-fastbuild/bin/<foo>/dexfiles/app/3.shard.zip: No such file or directory

Judging by the output files and timing profiles, this seems to be because not all files of the dexfiles tree artifact are created, either by being downloaded from the remote cache or by running the action locally.

This tree artifact is created by DexShardsToMerge, which uses SpawnActionTemplate to create DexMerger actions. We can see this using aquery:

SpawnActionTemplate with output TreeArtifact <foo>/app/dexfiles/app
  Mnemonic: DexShardsToMerge
  Target: //<foo>/app:app
  Configuration: darwin_arm64-fastbuild
  Execution platform: @local_config_platform//:host
  Inputs: [bazel-out/darwin_arm64-fastbuild/bin/<foo>/app/dexsplits/app, bazel-out/darwin_arm64-opt-exec-2B5CBBC6/bin/external/bazel_tools/tools/android/d8_dexmerger, bazel-out/darwin_arm64-opt-exec-2B5CBBC6/bin/external/bazel_tools/tools/android/d8_dexmerger.jar, bazel-out/darwin_arm64-opt-exec-2B5CBBC6/internal/_middlemen/external_Sbazel_Utools_Stools_Sandroid_Sd8_Udexmerger-runfiles]
  Outputs: [bazel-out/darwin_arm64-fastbuild/bin/<foo>/app/dexfiles/app (TreeArtifact)]

It looks like this might happen when some files in the dexfiles tree artifact get a remote cache hit and others do not. The ones that do not get a hit are created locally; the ones that do get a hit never get downloaded.
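
To check which shards are actually absent on disk after a failed build, a small script along these lines may help. It is only a sketch: the `N.shard.zip` naming is taken from the error message above, while the function name `missing_shards` and the idea of passing in the expected shard indices are assumptions for illustration, since the numbering scheme and shard count depend on the build.

```python
import re
from pathlib import Path

# File name pattern taken from the error message ("3.shard.zip").
SHARD_RE = re.compile(r"^(\d+)\.shard\.zip$")


def missing_shards(tree_dir, expected_indices):
    """Return the sorted shard indices that have no file in tree_dir.

    expected_indices is supplied by the caller (hypothetical input);
    this script does not know how many shards Bazel planned to create.
    """
    present = set()
    for entry in Path(tree_dir).iterdir():
        match = SHARD_RE.match(entry.name)
        if match and entry.is_file():
            present.add(int(match.group(1)))
    return sorted(set(expected_indices) - present)
```

Calling `missing_shards(path_to_dexfiles_dir, range(1, 5))` would list which of the shards 1 through 4 are missing from that directory.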

What’s the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

No response

Which operating system are you running Bazel on?

macOS 13.2.1 & Ubuntu Focal Fossa

What is the output of bazel info release?

release 6.0.0-dev

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

This is our patched version of the 6.0.0 release binary. We apply a few patches to improve performance of Android builds, fix dexing issues (https://github.com/bazelbuild/bazel/pull/16369#issuecomment-1442802265), and address some other issues. We build it with:

bazelisk build //src:bazel --compilation_mode=opt --embed_label="6.0.0-dev" --stamp

What’s the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

The last commit on the 6.0.0 release branch fixed a seemingly related issue: https://github.com/bazelbuild/bazel/commit/86883db929bebcd1e6d0507ed4b8799e6e7591da.

There is ongoing work on BwtB: https://github.com/bazelbuild/bazel/issues/10880.

I have tried reverting that commit and creating the binary on top of recent master (5900d6d248), but ran into the same issue.

Any other information, logs, or outputs that you want to share?

In the timing profile I see that Remote.download and Remote.downloadInMemoryOutput get logged for the files that do not get created. There is a start time for these, but no wall duration.
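
Entries like these can be pulled out of a profile with a short script. The sketch below assumes the profile is a Chrome trace event JSON file (optionally gzipped) with a top-level `traceEvents` list, and that the `Remote.download` label appears in either the `name` or the `cat` field of an event; both are assumptions about the file layout, and the function names are made up for illustration.

```python
import gzip
import json


def load_events(profile_path):
    """Load trace events from a JSON profile, gzipped or plain."""
    opener = gzip.open if str(profile_path).endswith(".gz") else open
    with opener(profile_path, "rt", encoding="utf-8") as handle:
        data = json.load(handle)
    # Assumed layout: either {"traceEvents": [...]} or a bare list.
    return data["traceEvents"] if isinstance(data, dict) else data


def events_without_duration(events, prefix="Remote.download"):
    """Return events labelled with prefix that have a start time but no duration."""
    result = []
    for event in events:
        labels = (str(event.get("name", "")), str(event.get("cat", "")))
        if any(label.startswith(prefix) for label in labels):
            if "ts" in event and "dur" not in event:
                result.append(event)
    return result
```

Because the match is on a prefix, the default also covers `Remote.downloadInMemoryOutput` events.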

About this issue

  • State: closed
  • Created a year ago
  • Comments: 17 (11 by maintainers)

Most upvoted comments

This is probably related to https://github.com/bazelbuild/bazel/issues/16333. I’m working on a PR to handle partial trees correctly in the prefetcher; expect it to be submitted within a few days.

Great to hear! Yes, this fix will be in 6.2.

FYI, I suspect there’s one remaining correctness issue in the presence of “templated tree artifacts” (in an Android context, that only applies to dex merging) + BwtB, as hinted at by this comment. I don’t know how easy it is to trigger in practice, but I’m also planning to fix it for 6.2, regardless. Just wanted to mention it in case you come across some other weirdness in the future.

@lukaciko Could you try patching #17678 and let me know if it fixes this issue?