bazel: --action_env ignored when `cfg = "exec"` is used
Description of the bug:
We need to pass some environment variables like `$CPATH` to the compiler when building TensorFlow with Bazel. This is cumbersome in itself and has already led to hard-to-debug issues like https://github.com/bazelbuild/bazel/issues/12059 in the past.
Now we again see failures caused by `--action_env` values not being passed to the compiler invocation in TensorFlow 2.8.4, which I tracked down to https://github.com/tensorflow/tensorflow/commit/07cbc7bb0bf899aac2bee5e21e1ba4eb40038682, which changes `cfg = "host"` to `cfg = "exec"`.
What’s the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Build TensorFlow 2.8.4 with Bazel 4.2.2, passing `--action_env=CPATH`, and observe that it is not passed to some compiler invocations, resulting in e.g.:
    In file included from bazel-out/k8-opt-exec-50AE0418/bin/tensorflow/core/framework/dataset_options.pb.cc:4:
    bazel-out/k8-opt-exec-50AE0418/bin/tensorflow/core/framework/dataset_options.pb.h:10:10: fatal error: google/protobuf/port_def.inc: No such file or directory
       10 | #include <google/protobuf/port_def.inc>
So can you provide information on how to use `--action_env` (or similar) in such circumstances?
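For readers landing here with the same failure: the workaround reported further down in this thread is to pass the variable through `--host_action_env` in addition to `--action_env`. A minimal sketch of that, assuming the variable is `CPATH` as in this report; the wrapper name `bazel_build_with_cpath` is purely illustrative, and whether `--host_action_env` is available and covers `cfg = "exec"` actions in your Bazel version should be checked against that version's documentation.

```shell
# Hypothetical wrapper; the function name is illustrative, not part of Bazel.
# --action_env=CPATH       forwards CPATH from the invoking shell to actions
#                          in the target configuration.
# --host_action_env=CPATH  does the same for the actions that, per this
#                          thread, no longer see --action_env values
#                          (e.g. tools reached through cfg = "exec").
bazel_build_with_cpath() {
  bazel build \
    --action_env=CPATH \
    --host_action_env=CPATH \
    "$@"
}
```

It would be called with the usual targets and any further flags, e.g. `bazel_build_with_cpath //some:target`, where `//some:target` is a placeholder.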
An explanation of what is actually being done by the change from “host” to “exec” would also be very welcome. In our case we are not cross-compiling, so host, target and build machine are all the same.
I would clearly classify this behavior as a bug because the documentation states:
Specifies the set of environment variables available during the execution of all actions.
But obviously there are now actions where those are missing, even though they should be present in “all actions”.
Which operating system are you running Bazel on?
RHEL 7
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 24 (17 by maintainers)
It seems my comment didn’t get saved, so here again are the most important bits, as multiple people spent days trying to figure out why TF started to fail to compile due to wrong/unset env variables (I finally found the commit that broke/changed it by bisecting over 2000 commits):
We are in what should be the simplest situation: building a piece of software (TensorFlow) with Bazel on the machine it is supposed to run on. With other build systems this is a variation of `configure && make && make install`, but for Bazel, which by default deletes all environment variables, we need to pass command-line options to have them set (this is about e.g. `$CPATH` and `$LIBRARY_PATH`).

Previously `--action_env` has mostly worked as it was documented: “environment variables available during the execution of all actions”. Above I linked the documentation of Bazel 4.2.2, which is the version of Bazel I’ve been using. @fmeum mentioned that this is wrong. As the behavior of Bazel changes considerably between versions, it would be good to have correct documentation for each specific version, and I would count this at least as a bug report against the 4.2.2 documentation if the observed behavior is intended and hence the new documentation (which doesn’t seem to be specific to a version?) is correct.

However, even before the change in TF it was brittle, as the environment used depended on many other factors, for example:
- `tools` vs `exec_tools`: https://github.com/tensorflow/tensorflow/pull/44901
- `use_default_shell_env`: https://github.com/tensorflow/tensorflow/pull/44549
- `cfg` attr

I don’t fully understand the difference between “host” and “exec” configurations; they sound very similar. Also the observed behavior surprises me: `--action_env` applies to “target” configurations per the new docs, but reverting the TF commit (so changing `cfg = "exec"` back to `cfg = "host"`) makes the environment variables be passed. But with `cfg = "exec"` I need to set them via `--host_action_env`. Why is that? The last part makes sense per the new docs, but why `--action_env` applies to “host” cfgs isn’t clear to me. Has this changed? Or is this the expected effect of `--distinct_host_configuration=false`? In which Bazel version will this be a no-op?

The next question is why `use_default_shell_env` isn’t set by default. The name implies that.

And finally I’d like to ask what the “correct” way would be to build something with Bazel on the machine it is to be run on, i.e. a local, **non**-cross-compilation build.
Things that come to mind:
- `--action_env` applying to all actions, and introducing a separate `--target_action_env` similar to `--host_action_env`
- `--no-distinct-configurations` so that “host”, “target” and “exec” are all the same, avoiding potential rebuilds and issues like the ones we have seen with unexpected changes to action environments

I would also greatly appreciate an easy way to find out why a specific C++ file is compiled (or library linked) with specific env variables (i.e. in a specific configuration). E.g. it would help if `--subcommands` showed the configuration name of an action and there were a way to query how and why a specific file is built. It was very confusing that a library was built with missing env variables, while calling `bazel build` on the seemingly obvious target had the env variables and hence worked. It turned out that the library was built again as a dependency of a dependency of a tool which somewhere had `cfg = "exec"`. But I haven’t found a way in the documentation to find that from a source tree and/or with Bazel. So the only feasible way I found was to find the commit which changed the behavior and check those changes.

I hope that helps and Merry Christmas! 🎅
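One possible way to investigate this kind of question, not mentioned in this thread, is Bazel's action graph query (`bazel aquery`). A minimal sketch, assuming the compile actions of interest carry the `CppCompile` mnemonic; the helper name `show_compile_actions` is hypothetical. The text output is expected to list each action's target, configuration and environment, but the exact fields may differ between Bazel versions, so treat this as a starting point only.

```shell
# Hypothetical helper; the function name is illustrative, not part of Bazel.
# Lists the C++ compile actions in the transitive dependencies of a target.
# Usage: show_compile_actions <target> [extra bazel flags...]
show_compile_actions() {
  target="$1"
  shift
  # Pass the same flags as for the real build (e.g. --action_env=CPATH),
  # so that the analysis matches what the build would do.
  bazel aquery "$@" "mnemonic(\"CppCompile\", deps(${target}))"
}
```

Searching the output for the failing source file (here `dataset_options.pb.cc`) should show whether the same file is compiled more than once, and in which configuration the environment variables are missing.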
I’m quite confused by that sequence of events. #4008 is an issue that happens because a lot of rules in the ecosystem don’t support `--action_env`, since they’re implemented with `run_shell`, which by default doesn’t use the action env (`use_default_shell_env = False`). In my mind the fix for that would be for `run_shell` to use the action env by default, not to introduce a `--host_action_env`.

I don’t think we should make `--action_env` apply to the exec configuration by default. I think, similar to this issue https://github.com/bazelbuild/bazel/issues/13839, the inability to use flags that only target the target configuration causes issues.