bazel: [6.0] accelerated file corruption on macos with Bazel 6.0
Description of the bug:
After updating to Bazel 6.0 I have observed significant disk cache corruption on macOS. A handful of previously working rule outputs started to show up in the disk_cache, often with size 0 or, in a rare case, partially written. The rules work fine without a disk_cache and on Bazel 6.0 are working fine under load.
There were no other changes to the rule in question. Further, the rules are scoped to no-remote.
I wrote "accelerated" in the title because, after further investigation, I have reason to believe it might have happened on 5.3.2 to the same rules.
What’s the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
No response
Which operating system are you running Bazel on?
macOS 12.5.2
What is the output of bazel info release?
6.0.0
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
No response
What’s the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD?
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
Bazel 6.0 potentially related changes
I noticed that this change was added in Bazel 6.0
https://github.com/bazelbuild/bazel/commits/9cb5e0a31665d3b3f25bf58ec2dee696e828d8b9
What isn’t obvious is whether we should start calling fsync in our Bazel rules prior to closing as well, or whether there is some other race condition at play here that led to this being cherry-picked. cc @artem-zinnatullin
Filing this for posterity or if anyone else has a suggestion
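For reference, calling fsync prior to closing in a rule's own tool could look roughly like the sketch below. It assumes the rule's action is implemented as a Python script that writes its output itself. The helper name `write_output_durably` is hypothetical and not part of Bazel.

```python
import os


def write_output_durably(path: str, data: bytes) -> None:
    """Write data to path and ask the OS to flush it to disk before closing.

    Hypothetical helper for illustration; not a Bazel API.
    """
    with open(path, "wb") as f:
        f.write(data)
        # Push Python's userspace buffer down to the operating system.
        f.flush()
        # Ask the OS to write its buffered data for this file to the device.
        os.fsync(f.fileno())
    # The file is closed on leaving the with-block, after the fsync.
```

Note that on macOS a plain fsync does not guarantee the drive itself has flushed its cache; Apple documents the F_FULLFSYNC fcntl for that. Whether either call would have prevented the corruption described here is not established.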
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 21 (21 by maintainers)
Small update on my report https://github.com/bazelbuild/bazel/issues/17428#issuecomment-1423338695: we observed a file corruption issue on macOS 12.3 Mac Studio CI builders, but upgrading to macOS 13.4 “fixed” the problem. We have not seen more corruptions for a couple of months, which implies a software bug on Apple's side (at least in our specific case).