bazel: [6.0] accelerated file corruption on macOS with Bazel 6.0

Description of the bug:

After updating to Bazel 6.0 I have observed significant disk cache corruption on macOS. A handful of previously working rule outputs started showing up in the disk_cache, often with size 0 or, in one rare case, partially written. The rules work fine without a disk_cache, and on Bazel 6.0 they work fine under load.

There were no other changes to the rule in question. Further, the rules are scoped to no-remote.

I wrote "accelerated" in the title because, after further investigation, I have reason to believe the same corruption might already have happened to the same rules on 5.3.2.
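For anyone trying to confirm the same symptom, here is a rough Python sketch that scans a cache directory for entries whose content no longer matches their name. It assumes that content-addressed entries are stored as files named by the 64-character hex SHA-256 digest of their content; that layout and the function name `find_corrupt_cas_entries` are assumptions for illustration, not something confirmed in this report. Comparing digests rather than only checking for size 0 also catches partially written files, and it does not flag the legitimately empty blob.

```python
import hashlib
import os
import re

# Assumption: a content-addressed entry is a file whose name is the
# 64-character lowercase hex SHA-256 digest of its own content.
_SHA256_NAME = re.compile(r"^[0-9a-f]{64}$")


def find_corrupt_cas_entries(cas_dir):
    """Return sorted paths under cas_dir whose content does not hash to their name."""
    corrupt = []
    for root, _dirs, files in os.walk(cas_dir):
        for name in files:
            if not _SHA256_NAME.match(name):
                continue  # Skip temp files and anything not named by a digest.
            path = os.path.join(root, name)
            digest = hashlib.sha256()
            with open(path, "rb") as f:
                # Hash in 1 MiB chunks so large outputs are not read into memory at once.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    digest.update(chunk)
            if digest.hexdigest() != name:
                corrupt.append(path)
    return sorted(corrupt)
```

Pointing it at the directory that holds the content-addressed entries (hypothetically, a `cas` subdirectory of the path given to `--disk_cache`) returns the list of mismatching files, which can then be compared against a fresh build of the same targets.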

What’s the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

No response

Which operating system are you running Bazel on?

macOS 12.5.2

What is the output of bazel info release?

6.0.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What’s the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

Bazel 6.0 potentially related changes

I noticed that this change was added in Bazel 6.0:
https://github.com/bazelbuild/bazel/commits/9cb5e0a31665d3b3f25bf58ec2dee696e828d8b9

What isn’t obvious is whether we should start calling fsync in our Bazel rules prior to closing as well, or whether there is some other race condition at play here that led to this being cherry-picked. cc @artem-zinnatullin
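To make the fsync question concrete, here is a minimal Python sketch of what "fsync prior to closing" could look like in a tool that a rule runs to produce an output. The helper name `write_output_durably` is hypothetical, and whether this would prevent the corruption reported here is not established. The sketch writes to a temporary file in the same directory, flushes and syncs it, and only then renames it into place. On macOS it additionally attempts `F_FULLFSYNC`, because `fsync` there does not guarantee that data has left the drive's own cache.

```python
import os
import tempfile

try:
    import fcntl  # POSIX only; absent on Windows.
except ImportError:
    fcntl = None


def write_output_durably(path, data):
    """Write bytes to path via a synced temp file followed by an atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename stays on one
    # filesystem. Note that mkstemp creates it with mode 0600.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # Drain Python's userspace buffer.
            os.fsync(f.fileno())  # Ask the kernel to write the file out.
            full_fsync = getattr(fcntl, "F_FULLFSYNC", None)
            if full_fsync is not None:
                try:
                    # macOS: also ask the drive to flush its cache.
                    fcntl.fcntl(f.fileno(), full_fsync)
                except OSError:
                    pass  # Some filesystems do not support F_FULLFSYNC.
        os.replace(tmp_path, path)  # Readers see the old file or the complete new one.
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The rename step means a reader never observes a half-written file at the final path, which addresses partial writes independently of whether the sync calls are the missing piece.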

Filing this for posterity, or in case anyone else has a suggestion.

About this issue

  • State: closed
  • Created a year ago
  • Comments: 21 (21 by maintainers)

Most upvoted comments

Small update on my report https://github.com/bazelbuild/bazel/issues/17428#issuecomment-1423338695: we observed the file corruption issue on macOS 12.3 Mac Studio CI builders, but upgrading to macOS 13.4 “fixed” the problem. We have not seen any more corruption for a couple of months, which implies a software bug on Apple's side (at least in our specific case).