tensorflow: Build with CUDA support fails with GCC >= 10.3

System information

  • OS Platform and Distribution: Ubuntu Linux 21.04
  • TensorFlow installed from (source or binary): source
  • TensorFlow version: v2.5.0-rc2
  • Python version: 3.9
  • Bazel version (if compiling from source): 3.7.2
  • GCC/Compiler version (if compiling from source): 10.3
  • CUDA/cuDNN version: 11.2 / 8.2

Describe the problem

Building TensorFlow with CUDA support using GCC 10.3 as the host compiler fails with the following error:

/usr/include/c++/10/chrono:428:27: internal compiler error: Segmentation fault
  428 |  _S_gcd(intmax_t __m, intmax_t __n) noexcept
      |                           ^~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-10/README.Bugs> for instructions.
Target //tensorflow/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.

Apparently, this is a regression that starts with GCC 10.3 (the default compiler on Ubuntu 21.04) and shows up when GCC is used as the host compiler for nvcc. Here is the upstream bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100102

Installing and using gcc-9 as NVCC host compiler in configure still works.
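
The workaround above could be scripted roughly as follows. This is only a sketch under assumptions: the helper name `host_compiler_env` is hypothetical, and it relies on TensorFlow's configure step reading the `GCC_HOST_COMPILER_PATH` environment variable to pick the NVCC host compiler, which is worth verifying against the TensorFlow version being built.

```python
import os
import shutil
from typing import Callable, Dict, Mapping, Optional


def host_compiler_env(
    compiler: str = "gcc-9",
    base_env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[Dict[str, str]]:
    """Return a copy of the environment that points configure at `compiler`.

    Hypothetical helper: returns None when the compiler is not on PATH.
    """
    path = which(compiler)
    if path is None:
        return None
    # Copy so the caller's environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: configure honors this variable for the NVCC host compiler.
    env["GCC_HOST_COMPILER_PATH"] = path
    return env


if __name__ == "__main__":
    env = host_compiler_env()
    if env is None:
        print("gcc-9 not found; install it first (e.g. the gcc-9 package)")
    else:
        print("GCC_HOST_COMPILER_PATH=" + env["GCC_HOST_COMPILER_PATH"])
```

The returned mapping can then be handed to whatever launches the configure step, for example via the `env` argument of `subprocess.run`.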

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 17 (5 by maintainers)

Most upvoted comments

@pwuertz Could you please try to install from source using the latest version, TF 2.6.0, and let us know if it helps? Thank you!

Got a successful build with the following environment:

  • TensorFlow 2.7.0
  • CUDA 11.3
  • GCC 10.3.0 as NVCC host compiler (GCC 11 not supported by CUDA 11.3)
  • Ubuntu 21.10

@sanjoy Yes, probably a pure GCC issue. No suggestions on how to handle this on the TensorFlow end other than monitoring what’s happening upstream. A warning emitted by the TensorFlow build for known-bad compiler versions would be nice, but I don’t know how much work this is. It could be worthwhile, though, since there is no telling when we’ll get a fix in GCC, or when (if at all) that patch will be applied in linux-distribution-of-your-choice.
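
For illustration, a known-bad compiler warning of the kind suggested above might look something like this. It is a minimal sketch, not how the TensorFlow build works: the names `KNOWN_BAD`, `parse_major_minor`, and `known_bad_warning` are hypothetical, and the table holds only the single combination reported in this issue (GCC 10.3 with CUDA 11.2), since GCC 10.3.0 with CUDA 11.3 was reported to build successfully.

```python
import re
from typing import Optional, Tuple

Version = Tuple[int, int]

# Illustrative table: (GCC major.minor, CUDA major.minor) pairs reported to fail.
# Only the combination from this issue is listed; a real list would need upkeep.
KNOWN_BAD = {((10, 3), (11, 2))}


def parse_major_minor(version: str) -> Optional[Version]:
    """Extract (major, minor) from a plain version string such as '10.3.0'."""
    match = re.match(r"\s*(\d+)\.(\d+)", version)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def known_bad_warning(gcc_version: str, cuda_version: str) -> Optional[str]:
    """Return a warning message for a known-bad pair, or None otherwise."""
    gcc = parse_major_minor(gcc_version)
    cuda = parse_major_minor(cuda_version)
    if gcc is None or cuda is None:
        return None  # Unparseable input: stay silent rather than guess.
    if (gcc, cuda) in KNOWN_BAD:
        return (
            "WARNING: GCC %d.%d as NVCC host compiler with CUDA %d.%d is "
            "known to hit an internal compiler error; consider another "
            "host compiler." % (gcc + cuda)
        )
    return None


if __name__ == "__main__":
    message = known_bad_warning("10.3.0", "11.2")
    if message:
        print(message)
```

Keying the table on the GCC/CUDA pair rather than on the GCC version alone reflects the reports in this thread, where the same GCC release behaves differently with a newer CUDA toolkit.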