bazel: TensorFlow is broken with Bazel@HEAD
(03:29:23) ERROR: /var/lib/buildkite-agent/builds/bk-docker-062w/bazel-downstream-projects/tensorflow/tensorflow/BUILD:1409:19: Executing genrule //tensorflow:tf_python_api_gen_v2 failed: (Aborted): bash failed: error executing command (from target //tensorflow:tf_python_api_gen_v2)
(cd /var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/18887e87f83f3c86081828342b40a00a/execroot/org_tensorflow && \
exec env - \
PATH=/var/lib/buildkite-agent/.cache/bazelisk/local/-tmp-tmppfqvvf95-bazel/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
PYTHON_BIN_PATH=/usr/bin/python3 \
PYTHON_LIB_PATH=/usr/lib/python3/dist-packages \
TF2_BEHAVIOR=1 \
/bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; bazel-out/k8-opt-exec-50AE0418/bin/tensorflow/create_tensorflow.python_api_tf_python_api_gen_v2 --root_init_template=tensorflow/api_template.__init__.py --apidir=bazel-out/k8-opt/bin/tensorflow_api/v2/ --apiname=tensorflow --apiversion=2 --compat_apiversion=1 --compat_apiversion=2 --
...
...
Culprit finder: https://buildkite.com/bazel/culprit-finder/builds/3607#01834b10-64ec-4e64-812f-2357b4ccd39a Bisect shows dadc49e437018f482640ed76fae5307daf9911a8 is the culprit
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 40 (39 by maintainers)
Commits related to this issue
- Automated rollback of commit 21904a92b800139499f2f62c7c4c11351fa8bc0c. *** Reason for rollback *** Due to https://github.com/bazelbuild/bazel/issues/16296#issuecomment-1273507703 *** Original chang... — committed to bazelbuild/bazel by meteorcloudy 2 years ago
- Automated rollback of commit dadc49e437018f482640ed76fae5307daf9911a8. *** Reason for rollback *** Breaks Tensorflow. See https://github.com/bazelbuild/bazel/issues/16296 *** Original change descri... — committed to bazelbuild/bazel by c-parsons 2 years ago
- Automated rollback of commit 21904a92b800139499f2f62c7c4c11351fa8bc0c. *** Reason for rollback *** Due to https://github.com/bazelbuild/bazel/issues/16296#issuecomment-1273507703 *** Original chang... — committed to aiuto/bazel by meteorcloudy 2 years ago
- Automated rollback of commit dadc49e437018f482640ed76fae5307daf9911a8. *** Reason for rollback *** Breaks Tensorflow. See https://github.com/bazelbuild/bazel/issues/16296 *** Original change descri... — committed to aiuto/bazel by c-parsons 2 years ago
- Automated rollback of commit 21904a92b800139499f2f62c7c4c11351fa8bc0c. *** Reason for rollback *** Due to https://github.com/bazelbuild/bazel/issues/16296#issuecomment-1273507703 *** Original chang... — committed to bazelbuild/bazel by meteorcloudy 2 years ago
- Automated rollback of commit dadc49e437018f482640ed76fae5307daf9911a8. *** Reason for rollback *** Breaks Tensorflow. See https://github.com/bazelbuild/bazel/issues/16296 *** Original change descri... — committed to bazelbuild/bazel by c-parsons 2 years ago
- Rename attribute to "deps" This hack allows us to support native cc_shared_library aspect propagation along the "deps" attribute. See https://github.com/bazelbuild/bazel/issues/16296#issuecomment-12... — committed to msft-mirror-aosp/platform.build.bazel by c-parsons 2 years ago
- Allow cc_shared_library aspect to propagate along all attributes This is rolling forward https://github.com/bazelbuild/bazel/commit/ff927f72644b4d1c3036ea8b3b821c58c34c92fa which I mistakenly confuse... — committed to bazelbuild/bazel by oquenchil a year ago
- Allow cc_shared_library aspect to propagate along all attributes This is rolling forward https://github.com/bazelbuild/bazel/commit/ff927f72644b4d1c3036ea8b3b821c58c34c92fa which I mistakenly confuse... — committed to bazelbuild/bazel by oquenchil a year ago
- Add aspect hints to cc_shared_library One reason why cc_shared_library wasn't working for upb_proto_library was because indirect cc_shared_library.deps (i.e. linker_inputs not provided by the immedi... — committed to bazelbuild/bazel by oquenchil a year ago
- Add aspect hints to cc_shared_library One reason why cc_shared_library wasn't working for upb_proto_library was because indirect cc_shared_library.deps (i.e. linker_inputs not provided by the immedi... — committed to fweikert/bazel by oquenchil a year ago
@c-parsons @meteorcloudy I created this commit over half a year ago, feel free to change the logic in any way you see fit - including rolling it back. It only affects the non-Starlark implementation anyway.
cc_shared_library built-in code defines and exposes the provider PropagationHints. Your custom rule can instantiate this provider which the graph structure aspect will be able to see.
In the PropagationHints instance that you create, you place in a list field the attributes of your custom rule through which you want propagation to happen. I think this separates responsibilities cleanly. For example, if you have a new attribute you want to add to the list, you wouldn’t have to wait for a change to Bazel, you can just change what you put inside the PropagationHints provider.
I didn’t test the code but that’s the idea.
Looking at this a bit more I can imagine why you sent the change in the first place. First just to document what the problem is, let me rephrase the problem that Chris found during debugging.
The cc_shared_library implementation decides what to link statically based on the structure of the build graph, it uses the aspect to figure out that structure. It tries to link statically every library in the transitive closure of the direct deps of the shared library (i.e. roots) that is reachable without going through dynamic libraries. That’s why we need the structure of the graph.
Chris’ original change is correct and wrong at the same time. It makes libraries be linked statically that should be linked statically but weren’t linked before (because his project has custom C++ rules with different attribute names which is a valid use case) but it also links libraries statically that shouldn’t be linked statically.
The proposed fix with hard-coding the attribute names “srcs” and “hdrs” cuts the chain that builds the structure of the graph that propagates via genrules, filegroups, etc… I think that chain can be cut earlier by using required_providers on the aspect definition. This will skip the genrule and filegroup altogether, it also forces the custom C++ rules to advertise the CcInfo provider, so Chris you might want to check that this works for your project too, not just for Tensorflow.
If it works, I think the whole line
if fieldname == "_cc_toolchain" or fieldname == "target_compatible_with" or fieldname == "srcs" or fieldname == "hdrs":
can be removed, including the field names that were already there:_cc_toolchain
andtarget_compatible_with
.Note: There is still one edge case that a precompiled library coming from a filegroup or generated by a genrule isn’t linked statically when it should, but if any project in the future bumped into that, they should wrap that target in a cc_import which would be the principled thing to do. This shouldn’t happen with TF now since by using the old code they were definitely not relying on any filegroup or genrule being linked that anyway.
Thanks a bunch! Have reproed the error successfully.
Still don’t have any leads on the underlying failure – it may take me a bit.
I can confirm I can reproduce this issue at https://github.com/tensorflow/tensorflow/commit/007f31c872c9c74f1298b7389d324e3947ede5ce with Bazel at dadc49e437018f482640ed76fae5307daf9911a8, but not at the parent commit of https://github.com/tensorflow/tensorflow/commit/007f31c872c9c74f1298b7389d324e3947ede5ce