bazel: grpc Mac bazel build fails because of thread-local, but make passes

Description of the problem / feature request / question:

I’m on the grpc/grpc project. We recently switched to using thread-local on one of our key structs. Although this project builds fine with make on Mac and Linux, it stopped building with bazel on Mac after we made this switch. Looking into it, it seems like tensorflow/serving also has the same issue.

If possible, provide a minimal example to reproduce the problem:

$ git clone git@github.com:grpc/grpc
$ cd grpc
$ bazel build //:grpc

Environment info

  • Operating System: MacOS Sierra 10.12.6
  • Bazel version (output of bazel info release): release 0.8.1-homebrew

Have you found anything relevant by searching the web?

Related issues are tensorflow/serving#1 and grpc/grpc#13856

Anything else, information or logs or outputs that would be helpful?

$ bazel build --verbose_failures //:grpc
INFO: Analysed target //:grpc (0 packages loaded).
INFO: Found 1 target...
ERROR: /Users/vpai/Git/grpc/BUILD:224:1: Linking of rule '//:grpc' failed (Exit 1): cc_wrapper.sh failed: error executing command 
  (cd /private/var/tmp/_bazel_vpai/5285458b308b3aadd65cb54a5ac76b0c/execroot/com_github_grpc_grpc && \
  exec env - \
    APPLE_SDK_PLATFORM=MacOSX \
    APPLE_SDK_VERSION_OVERRIDE=10.13 \
    PATH=/Users/vpai/google-cloud-sdk/bin:/usr/local/git/current/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/go/bin \
    TMPDIR=/var/folders/xd/2k15ssh10lz6088_k_2lddtw007xtq/T/ \
    XCODE_VERSION_OVERRIDE=9.1.0 \
  external/local_config_cc/cc_wrapper.sh -fobjc-link-runtime -Wl,-S -shared -o bazel-out/darwin-fastbuild/bin/libgrpc.so bazel-out/darwin-fastbuild/bin/_objs/grpc/src/core/lib/surface/init.o bazel-out/darwin-fastbuild/bin/_objs/grpc/src/core/plugin_registry/grpc_plugin_registry.o -pthread -headerpad_max_install_names -lc++ -no-canonical-prefixes -undefined dynamic_lookup)

Use --sandbox_debug to see verbose messages from the sandbox
clang: warning: argument unused during compilation: '-pthread' [-Wunused-command-line-argument]
ld: illegal thread local variable reference to regular symbol __ZN9grpc_core7ExecCtx9exec_ctx_E for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Target //:grpc failed to build
INFO: Elapsed time: 0.504s, Critical Path: 0.31s
FAILED: Build did NOT complete successfully
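
For readers unfamiliar with the symbol in the linker error: `__ZN9grpc_core7ExecCtx9exec_ctx_E` demangles to `grpc_core::ExecCtx::exec_ctx_`, a static data member. Below is a minimal sketch of that kind of thread-local static member, assuming standard C++11 `thread_local` rather than whatever TLS macros gRPC actually uses. The `sketch::ExecCtx` class and the `OtherThreadSeesNoContext` helper are hypothetical stand-ins for illustration, not gRPC code.

```cpp
#include <cassert>
#include <thread>

namespace sketch {

// Hypothetical stand-in for a class shaped like grpc_core::ExecCtx.
// Each thread tracks its own "current" context through a thread-local
// static pointer.
class ExecCtx {
 public:
  // Push this context as the calling thread's current one.
  ExecCtx() : previous_(exec_ctx_) { exec_ctx_ = this; }
  // Restore whatever was current before this context was created.
  ~ExecCtx() { exec_ctx_ = previous_; }

  ExecCtx(const ExecCtx&) = delete;
  ExecCtx& operator=(const ExecCtx&) = delete;

  // Returns the calling thread's current context, or nullptr if none.
  static ExecCtx* Get() { return exec_ctx_; }

 private:
  static thread_local ExecCtx* exec_ctx_;  // one pointer per thread
  ExecCtx* previous_;
};

// Out-of-line definition; this is the object the linker has to resolve.
thread_local ExecCtx* ExecCtx::exec_ctx_ = nullptr;

}  // namespace sketch

// Returns true if a newly started thread sees no current context,
// regardless of what the calling thread has set.
bool OtherThreadSeesNoContext() {
  bool saw_null = false;
  std::thread t([&saw_null] {
    saw_null = (sketch::ExecCtx::Get() == nullptr);
  });
  t.join();
  return saw_null;
}
```

Constructing a `sketch::ExecCtx` on one thread makes `Get()` return it on that thread only; other threads still see `nullptr`. Every object file that references the member has to agree with the definition that it is thread-local.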

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 35 (10 by maintainers)

Most upvoted comments

rules_rust

At first glance I’d’ve said that the BUILD file is simply broken: https://github.com/bazelbuild/rules_rust/blob/8c388e1b816d0a7e5a7d3cc5d213be7f35299cf5/examples/ffi/rust_calling_c/c/BUILD#L23

A cc_library named “native_matrix_so” implicitly generates an output named “libnative_matrix_so.so” (along with “libnative_matrix_so.a”), yet this build file declares a separate cc_binary rule with that same name. Thus, the output names conflict, so the failure looks expected. Only… apparently it was working before. That it was working seems unexpected to me!

Ahh…so, here’s the Bazel code which registers the failure outputs: https://github.com/bazelbuild/bazel/blob/308bce36cba46095fe41866e703710035ddddada/src/main/java/com/google/devtools/build/lib/rules/cpp/CcLibrary.java#L254

That code generates a dummy “error” output, when !supportsDynamicLinker. But, in the case where it is supported, it only generates a lib*.so output when srcs is not empty. So…that explains things here.

This is unfortunate – and perhaps could be improved in Bazel, so that the behavior is more consistent. But, this also seems like a bug in the example BUILD file to have written rules with conflicting (or potentially-conflicting) output names, and can be easily solved by renaming one of those two rules.

rules_go

This is a different issue. Looking at the rules_go build file: https://github.com/bazelbuild/rules_go/blob/5e733237761fbe70d10afd3156e73355392a66b2/tests/legacy/examples/cgo/cc_dependency/BUILD.bazel#L22

For some reason, this seems to go to extra effort to do something broken – it’s explicitly using the nodeps shared-library output from cc_library (via the filegroup asking for output_group = "dynamic_library"). This is odd, and I don’t know of any good reason why it’d want to do that. I imagine someone was just confused and wrote this accidentally?

The typical way to do this – and what this example ought to be doing I think – is to use a cc_binary instead of the cc_library/filegroup pair. Something like:

cc_binary(
    name = "c_version_so",
    srcs = ["c_version.c", "c_version.h"],
    linkshared = True,
)

@trybka It looks like your fix (https://github.com/bazelbuild/bazel/commit/ec5553352f2f661d39ac4cf665dd9b3c779e614c) is causing breakages in rules_go and rules_rust on macOS. Could you take a look?

Ok, when I tried master a week ago, the fix had not landed yet. I can confirm that --copt and --host_copt are only needed for versions prior to the fix. Current master builds successfully on Mac without any additions to bazelrc, both in darwin_fastbuild and host.

Looks like defining --copt=-DGRPC_BAZEL_BUILD in our bazel.rc file is not ideal, because users who depend on gRPC don’t inherit this setting (and they end up with the wrong type of thread locals being used on Mac).
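
To illustrate why a define that lives only in one project's bazel.rc is fragile, here is a minimal sketch of a macro that selects between two thread-local implementations. It assumes the define switches between a pthread-key implementation and compiler-level `thread_local`; the comment above does not say which mechanisms gRPC actually chooses between. `SKETCH_BAZEL_BUILD` is a hypothetical stand-in for `GRPC_BAZEL_BUILD`, and `SetCurrent`, `GetCurrent`, and `TlsKind` are made-up names.

```cpp
#include <pthread.h>

#include <cassert>

// SKETCH_BAZEL_BUILD is a hypothetical stand-in for GRPC_BAZEL_BUILD.
// Whichever branch the preprocessor picks decides how the per-thread
// slot is stored, so every translation unit must see the same setting.
#if defined(SKETCH_BAZEL_BUILD)

namespace {
pthread_key_t g_key;
pthread_once_t g_once = PTHREAD_ONCE_INIT;
// Create the key exactly once, on first use from any thread.
void MakeKey() {
  int rc = pthread_key_create(&g_key, nullptr);
  assert(rc == 0);
  (void)rc;
}
}  // namespace

// pthread-key variant: no thread-local symbol is emitted at all.
void SetCurrent(void* value) {
  pthread_once(&g_once, MakeKey);
  pthread_setspecific(g_key, value);
}
void* GetCurrent() {
  pthread_once(&g_once, MakeKey);
  return pthread_getspecific(g_key);
}
const char* TlsKind() { return "pthread key"; }

#else

namespace {
thread_local void* g_current = nullptr;  // compiler-managed TLS
}  // namespace

// Compiler TLS variant: relies on linker support for thread-local symbols.
void SetCurrent(void* value) { g_current = value; }
void* GetCurrent() { return g_current; }
const char* TlsKind() { return "compiler thread_local"; }

#endif
```

A downstream project that compiles the same sources without the define silently gets the `#else` branch. Bazel's cc_library `defines` attribute does propagate to dependents, so it might be one way to avoid relying on each user's flags; whether that suits gRPC is not settled in this thread.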