bazel: Bazel remote-caching does not properly treat volatile-status as if it never changes

Description of the problem / feature request:

When a target depends on the file volatile-status.txt as an input, that target will not be rebuilt if volatile-status.txt changes and the target is cached locally. However, the remote cache does not behave the same way: a change in the hash of volatile-status.txt causes the target to miss the remote cache.

Bugs: what’s the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Clone the repo test_volatile_status and set up a bazel remote cache. Then run the following commands at the root of the repo:

bazel build --remote_cache=<your_remote_cache> //:artifact
bazel build --remote_cache=<your_remote_cache> //:artifact 
# the above will read from the local cache, as expected

bazel clean
bazel build --remote_cache=<your_remote_cache> //:artifact
# this will not read from the remote cache
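
If you do not already have a remote cache to test against, one option (an assumption for illustration, not part of the original repro) is to run a throwaway local instance of bazel-remote and point --remote_cache at its gRPC port:

# bazel-remote serves HTTP on container port 8080 and gRPC on 9092;
# the host ports and cache directory here are arbitrary.
docker run -u 1000:1000 \
    -v /tmp/bazel-remote-cache:/data \
    -p 9090:8080 -p 9092:9092 \
    buchgr/bazel-remote

bazel build --remote_cache=grpc://localhost:9092 //:artifact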

What operating system are you running Bazel on?

Ubuntu 16.04.6 LTS

What’s the output of bazel info release?

release 0.29.0

What’s the output of git remote get-url origin ; git rev-parse master ; git rev-parse HEAD ?

https://github.com/Panfilwk/test_volatile_status.git
da002118b6c79bd7e33b017221c772fbfba4f009
da002118b6c79bd7e33b017221c772fbfba4f009

Have you found anything relevant by searching the web?

This issue touches on this concept for local caching, but says nothing about how this file behaves with remote caching.

Any other information, logs, or outputs that you want to share?

exec1.log.txt and exec2.log.txt are the parsed experimental execution logs from the first and third build commands above, produced by following the debug instructions here.
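
For reference, a hedged sketch of how such logs are typically produced and converted to text (the flag name and parser target come from the Bazel remote-cache debugging docs and may differ between Bazel versions):

bazel build --remote_cache=<your_remote_cache> \
    --experimental_execution_log_file=/tmp/exec1.log //:artifact

# Convert the binary log to text with the parser from the Bazel source tree
# (built first with: bazel build src/tools/execlog:parser):
bazel-bin/src/tools/execlog/parser \
    --log_path=/tmp/exec1.log \
    --output_path=/tmp/exec1.log.txt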

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 29 (21 by maintainers)

Most upvoted comments

We are also facing this issue. It impacts the efficiency of our continuous delivery and unnecessarily loads and fills our remote cache. I spent a little time tracking down the origin of the problem; in case it helps, here is a minimal example to reproduce the behaviour.

Example

WORKSPACE

BUILD.bazel

load(":myrule.bzl", "myrule")

myrule(name = "dummy")

myrule.bzl

def _myrule_impl(ctx):
    output_file = ctx.actions.declare_file(ctx.attr.name)

    ctx.actions.run_shell(
        outputs = [output_file],
        # ctx.version_file is the volatile status file, bazel-out/volatile-status.txt
        inputs = [ctx.version_file],
        command = "cp -f $1 $2",
        arguments = [
            ctx.version_file.path,
            output_file.path,
        ],
    )

    return [DefaultInfo(
        files = depset([output_file]),
    )]

myrule = rule(
    implementation = _myrule_impl,
)

A simple rule which copies the file volatile-status.txt to bazel-bin/dummy; essentially the same example as test_volatile_status.

Scenarios

First build

We start from an empty environment. The Bazel local cache and the remote cache are fully cleaned.

bazel build \
    --stamp \
    --remote_cache=grpc://localhost:9092 \
    --experimental_remote_grpc_log=/tmp/grpc.log \
    //:dummy

bazel-bin/dummy:

BUILD_TIMESTAMP 1654449958

bazel-out/volatile-status.txt:

BUILD_TIMESTAMP 1654449958

As expected, since the file is present in neither the local cache nor the remote cache, it is built and the remote cache is filled. We can see all the executed gRPC calls in the log file:

[05 Jun 2022 19:25:58.229] build.bazel.remote.execution.v2.ActionCache::GetActionResult - NotFound (2ms)
    Message: 1e8a3a53556eb21863ddd36e4e6dc6b4b40e1e5308c489130334dca96e77f399 not found in AC
    -->
    	|- ActionDigest: 1e8a3a53556eb21863ddd36e4e6dc6b4b40e1e5308c489130334dca96e77f399/143
    <--

[05 Jun 2022 19:25:58.245] build.bazel.remote.execution.v2.ContentAddressableStorage::FindMissingBlobs - OK (2ms)
    -->
    	|- BlobDigests
    	   - 405ee2251a1467910dd9163e6991d0a6df935920484e9759e6ecc728b980bd7b/27
    	   - e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855/0
    	   - d543d2b229eec2c9cfff4ef2a4795f50d53944628193efb2b27cec40da964ddb/131
    	   - 1e8a3a53556eb21863ddd36e4e6dc6b4b40e1e5308c489130334dca96e77f399/143
    <--
    	|- MissingBlobDigests
    	   - 405ee2251a1467910dd9163e6991d0a6df935920484e9759e6ecc728b980bd7b/27
    	   - d543d2b229eec2c9cfff4ef2a4795f50d53944628193efb2b27cec40da964ddb/131
    	   - 1e8a3a53556eb21863ddd36e4e6dc6b4b40e1e5308c489130334dca96e77f399/143

[05 Jun 2022 19:25:58.253] google.bytestream.ByteStream::Write - OK (20ms)
    -->
    	|- ResourceNames: [uploads/935da4f7-4095-4492-a43b-7bab3b2b147d/blobs/1e8a3a53556eb21863ddd36e4e6dc6b4b40e1e5308c489130334dca96e77f399/143]

[05 Jun 2022 19:25:58.252] google.bytestream.ByteStream::Write - OK (22ms)
    -->
    	|- ResourceNames: [uploads/25a22748-a5e1-4294-a15d-03766431a14a/blobs/405ee2251a1467910dd9163e6991d0a6df935920484e9759e6ecc728b980bd7b/27]

[05 Jun 2022 19:25:58.253] google.bytestream.ByteStream::Write - OK (28ms)
    -->
    	|- ResourceNames: [uploads/26a4e52c-eac7-4839-a0b7-abb5994bbc38/blobs/d543d2b229eec2c9cfff4ef2a4795f50d53944628193efb2b27cec40da964ddb/131]

[05 Jun 2022 19:25:58.283] build.bazel.remote.execution.v2.ActionCache::UpdateActionResult - OK (10ms)
    -->
    	|- ActionDigest: 1e8a3a53556eb21863ddd36e4e6dc6b4b40e1e5308c489130334dca96e77f399/143
    	|- ActionResult:
    		|- OutputFiles:
    		|-   - x bazel-out/k8-fastbuild/bin/dummy
    		|-     |- 405ee2251a1467910dd9163e6991d0a6df935920484e9759e6ecc728b980bd7b/27
    <--

Second build without clean cache

bazel build \
    --stamp \
    --remote_cache=grpc://localhost:9092 \
    --experimental_remote_grpc_log=/tmp/grpc.log \
    //:dummy

bazel-bin/dummy:

BUILD_TIMESTAMP 1654449958

bazel-out/volatile-status.txt:

BUILD_TIMESTAMP 1654450956

Not surprisingly, the file, already in the local cache, is not rebuilt even though the volatile stamp file has changed. No gRPC requests are executed. This behaviour conforms to the documentation.

Third build with a clean of the local cache

bazel clean --expunge
bazel build \
    --stamp \
    --remote_cache=grpc://localhost:9092 \
    --experimental_remote_grpc_log=/tmp/grpc.log \
    //:dummy

bazel-bin/dummy:

BUILD_TIMESTAMP 1654451141

bazel-out/volatile-status.txt:

BUILD_TIMESTAMP 1654451141

The file, now with a different digest key than in the first build, is looked up in the remote cache and is not found. It is then rebuilt and pushed to the remote cache again. We can see the same sequence of gRPC requests as in the first build.

The computation of this digest key should not take the updated content of volatile-status.txt into account, just as it does not when we remove or manually update the volatile stamp file between two local builds.

This is the definition of the volatile file:

… In order to avoid rerunning stamped actions all the time though, Bazel pretends that the volatile file never changes. In other words, if the volatile status file is the only file whose contents has changed, Bazel will not invalidate actions that depend on it. If other inputs of the actions have changed, then Bazel reruns that action, and the action will see the updated volatile status, but just the volatile status changing alone will not invalidate the action.

(from the Bazel 5.1.1 user manual)

This behaviour should be the same even if we use a remote cache.

Workaround

To avoid this issue, we are going to stop using the --stamp option and either:

  • find a dirty, non-hermetic way to inject our volatile values, or
  • use another tool or another build system to finish the build of our final artifacts.

If we could reopen this issue, that would be really great and appreciated ❤️!

For all who are still struggling with this issue, consider just making your volatile status files constant through your workspace status command. You can use an environment variable to control this behavior, set or unset depending on whether you are testing or deploying. If the status files themselves are constant, you don’t have to worry about how they are handled by Bazel or by any language rules.

You can override Bazel’s default values for status vars, as in the example here.
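
As an illustration, here is a minimal sketch of such a status script, passed via --workspace_status_command (the script path and the STAMP_FOR_CACHE variable are hypothetical, not taken from the example above):

#!/bin/bash
# tools/status.sh (hypothetical): emit a constant BUILD_TIMESTAMP when
# STAMP_FOR_CACHE=1 so volatile-status.txt never changes between builds,
# and a real timestamp otherwise (e.g. for release builds).
if [ "${STAMP_FOR_CACHE:-0}" = "1" ]; then
  echo "BUILD_TIMESTAMP 0"
else
  echo "BUILD_TIMESTAMP $(date +%s)"
fi

bazel build --stamp --workspace_status_command=./tools/status.sh //:dummy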

I submitted some PRs / issues to various rules (k8s, docker, etc.) that are affected by this to disable volatile status stamping, but that feels wrong to me. This really should be addressed here. I understand the concerns about correctness higher in the thread, but Bazel claims to be both Fast and Correct, and this issue seriously impairs “Fast.”

Not sure I understand something. Wouldn’t the stamped output files be stored in the remote cache, and couldn’t Bazel just ignore volatile-status.txt and download the previously built output file?