tensorflow: Error on compiling from source

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux archlinux 5.3.11-arch1-1 x86_64 GNU/Linux

  • TensorFlow installed from (source or binary): source

  • TensorFlow version: commit hash: 872b1ab23f0aac182d5b2051f45d5d003963bfe3

  • Python version: 3.7.5

  • Installed using virtualenv? pip? conda?: conda

  • Bazel version (if compiling from source): bazel 0.29.1- (@non-git)

  • GCC/Compiler version (if compiling from source): gcc (GCC) 9.2.0

  • CUDA/cuDNN/TensorRT version: 10.1.243-2/7.6.4.38-1/6.0.1.5-1

  • GPU model and memory: nVidia RTX 2080 8GB

Describe the problem

Compiling with bazel fails:

ERROR: /home/jaaq/.cache/bazel/_bazel_jaaq/c463894dd2648fc5b64eeed02cc022b5/external/grpc/BUILD:507:1: C++ compilation of rule ‘@grpc//:gpr_base’ failed (Exit 1)

Provide the exact sequence of commands / steps that you executed before running into the problem

git clone repo
cd tensorflow
source /opt/anaconda/bin/activate
conda activate python375env
./configure (Yes on XLA JIT, CUDA, TensorRT, clang)
bazel build //tensorflow/tools/pip_package:build_pip_package

Any other info / logs

Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

ERROR: /home/jaaq/.cache/bazel/_bazel_jaaq/c463894dd2648fc5b64eeed02cc022b5/external/grpc/BUILD:507:1: C++ compilation of rule ‘@grpc//:gpr_base’ failed (Exit 1)
external/grpc/src/core/lib/gpr/log_linux.cc:43:13: error: ambiguating new declaration of ‘long int gettid()’
 static long gettid(void) { return syscall(__NR_gettid); }
             ^~~~~~
In file included from /usr/include/unistd.h:1170,
                 from external/grpc/src/core/lib/gpr/log_linux.cc:41:
/usr/include/bits/unistd_ext.h:34:16: note: old declaration ‘__pid_t gettid()’
 extern __pid_t gettid (void) __THROW;
                ^~~~~~
external/grpc/src/core/lib/gpr/log_linux.cc:43:13: warning: ‘long int gettid()’ defined but not used [-Wunused-function]
 static long gettid(void) { return syscall(__NR_gettid); }
             ^~~~~~
Target //tensorflow/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 89.815s, Critical Path: 22.19s
INFO: 1215 processes: 1215 local.
FAILED: Build did NOT complete successfully

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 5
  • Comments: 15 (1 by maintainers)

Most upvoted comments

The grpc.patch should be applied in tensorflow/bazel-tensorflow/external/grpc/src after the bazel build has started the download.

Alternatively, apply it in the .cache directory (P.S.: this did not work for me when applied in the cache):

cd ~/.cache/bazel/*/*/external/grpc
patch -p1 < ~/grpc.patch

I’m with @IceCryptonym: clearer directions would be greatly appreciated!

Perhaps the source has changed, but I don’t see any linked patch file, or even much mention of patching, at the page described by @shantanu-gontia (https://gist.github.com/kmhofmann/e368a2ebba05f807fa1a90b3bf9a1e03). Nor does the Arch Linux repo mentioned by @Mithrandir2k18 seem to make any mention of grpc; rather, it’s all about mkl.

But I followed @shantanu-gontia’s instructions, and have posted a patch file here: https://gist.github.com/drscotthawley/8eb51af1b4c92c4f18432cb045698af7

It can be applied by going to the main grpc directory and running

$ git apply grpc.patch

What’s still not clear is where this should be applied, i.e. where does Bazel put grpc, and how can I apply the patch after Bazel puts it there? I see tensorflow/third_party/grpc/ but it’s empty except for a zero-length file called BUILD. There’s also tensorflow/tensorflow/contrib/cmake/patches/grpc/, but that only contains the file rand.h. Finally, just putting the patch file in third_party/ doesn’t seem to cause the patch to be applied.

EDIT: Seems Bazel puts it in ~/.cache/bazel/_bazel_($USER)/(big_long_random_directory_name)/external/grpc, but this is not a git directory so git apply won’t work. So…still unclear on how to apply the patch reliably and automatically within the Bazel build.
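Since the directory Bazel creates is not a git checkout, one option is plain patch(1), which does not need git at all. Below is a minimal sketch, not a confirmed fix: the helper name apply_patch_checked and the location ~/grpc.patch are made up for illustration, and it assumes GNU patch and a patch file written with a/ and b/ path prefixes (as git diff produces). The dry run checks that every hunk applies before anything is modified. Note that Bazel may re-fetch grpc and discard the change, so this does not make the patch part of the build.

```shell
#!/usr/bin/env bash
# Sketch only: apply_patch_checked is a hypothetical helper, not part of Bazel or TensorFlow.

# Apply a patch to a plain (non-git) source tree using patch(1).
# Usage: apply_patch_checked <source-dir> <patch-file>
apply_patch_checked() {
  local dir="$1" patch_file="$2"
  # --dry-run tests whether the hunks apply without touching any file;
  # -N (--forward) skips patches that look already applied instead of prompting.
  if patch -d "$dir" -p1 -N --dry-run < "$patch_file" > /dev/null; then
    patch -d "$dir" -p1 -N < "$patch_file"
  else
    echo "patch does not apply cleanly in $dir" >&2
    return 1
  fi
}

# Assumed usage, run from the TensorFlow checkout after Bazel has fetched grpc:
#   grpc_dir="$(bazel info output_base)/external/grpc"
#   apply_patch_checked "$grpc_dir" "$HOME/grpc.patch"
```

bazel info output_base prints the ~/.cache/bazel/_bazel_$USER/<hash> directory for the current workspace, so the long directory name does not have to be guessed.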

I have run into this issue as well and I am not sure how to apply the patch. Would someone be able to give me some insight?

same here

ERROR: /home/user/.cache/bazel/_bazel_userbest/89310458958e3aedfc736fc503a62f52/external/grpc/BUILD:507:1: C++ compilation of rule '@grpc//:gpr_base' failed (Exit 1)
external/grpc/src/core/lib/gpr/log_linux.cc:43:13: error: ambiguating new declaration of 'long int gettid()'
   43 | static long gettid(void) { return syscall(__NR_gettid); }
      |             ^~~~~~
In file included from /usr/include/unistd.h:1170,
                 from external/grpc/src/core/lib/gpr/log_linux.cc:41:
/usr/include/bits/unistd_ext.h:34:16: note: old declaration '__pid_t gettid()'
   34 | extern __pid_t gettid (void) __THROW;
      |                ^~~~~~
external/grpc/src/core/lib/gpr/log_linux.cc:43:13: warning: 'long int gettid()' defined but not used [-Wunused-function]
   43 | static long gettid(void) { return syscall(__NR_gettid); }
      |             ^~~~~~
ERROR: /home/user/.cache/yay/tensorflow-git/src/tensorflow/tensorflow/python/tools/BUILD:312:1 C++ compilation of rule '@grpc//:gpr_base' failed (Exit 1)
INFO: Elapsed time: 1012.293s, Critical Path: 30.58s
INFO: 3168 processes: 3168 local.
FAILED: Build did NOT complete successfully
==> ERROR: A failure occurred in build().
    Aborting...
Error making: tensorflow-git

Arch Linux

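The error seems to be due to a clash between two definitions of the same symbol: grpc defines its own static gettid(), and the system headers now declare gettid() as well (the /usr/include/bits/unistd_ext.h note in the log points at the glibc header, which has declared it since glibc 2.30, so this is not caused by GCC 9 itself). https://github.com/grpc/grpc/issues/20043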

The folks at grpc have patched the problem in newer releases. https://github.com/grpc/grpc/pull/20048

However, during the build process, TensorFlow downloads a cached version of an older commit of grpc from the TensorFlow mirror. At this point, I do not know which commit from the grpc repository fixes the issue in particular (I tried the latest one, it has some problem with upb_proto_library). Therefore, the best course of action, as I found from other sources (https://gist.github.com/kmhofmann/e368a2ebba05f807fa1a90b3bf9a1e03) is to patch the grpc commit downloaded using a patch file.

If the patch in https://gist.github.com/kmhofmann/e368a2ebba05f807fa1a90b3bf9a1e03 doesn’t work, I would suggest cloning the grpc source from GitHub and checking out the commit 4566c2a29ebec0835643b972eb99f4306c4234a3. Then edit the following files:

  • src/core/lib/gpr/log_linux.cc
  • src/core/lib/gpr/log_posix.cc
  • src/core/lib/iomgr/ev_epollex_linux.cc

Just change every instance of the term gettid to sys_gettid in these three files and generate a patch for yourself using git diff and use that patch.
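The steps above could be scripted roughly as follows. This is a sketch under assumptions, not a tested recipe: the function names, the grpc-src directory and the default output path are made up for illustration, and the \b word boundary is a GNU sed extension. The word boundary matters because a plain search-and-replace would also rewrite the syscall constant __NR_gettid, which must stay as it is.

```shell
#!/usr/bin/env bash
# Sketch only: function names, grpc-src and the output path are illustrative.

# Rename the local helper gettid -> sys_gettid in the given files.
# \b (GNU sed) leaves __NR_gettid alone, because "_" counts as a word character.
rename_gettid() {
  sed -i 's/\bgettid\b/sys_gettid/g' "$@"
}

# Clone grpc at the commit named above, rename the helper, and write the diff.
# Usage: make_grpc_patch [absolute-output-path]
make_grpc_patch() {
  local out="${1:-$HOME/grpc.patch}"   # hypothetical default location
  git clone https://github.com/grpc/grpc.git grpc-src || return 1
  (
    cd grpc-src || exit 1
    git checkout 4566c2a29ebec0835643b972eb99f4306c4234a3 || exit 1
    rename_gettid \
      src/core/lib/gpr/log_linux.cc \
      src/core/lib/gpr/log_posix.cc \
      src/core/lib/iomgr/ev_epollex_linux.cc || exit 1
    git diff > "$out"
  )
}
```

Calling make_grpc_patch with no argument would leave the result in ~/grpc.patch. Pass an absolute path if you want it elsewhere, since the diff is written from inside grpc-src.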