tensorflow: Sym-links in nvcc C compiler path cause "undeclared inclusion(s) in rule '//tensorflow/core:lib_hash_crc32c_accelerate_internal'"
Please make sure that this is a build/installation issue. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:build_template
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Redhat Enterprise Linux 7.x
- TensorFlow installed from (source or binary): source
- TensorFlow version: 1.14.0
- Python version: 3.7.4
- Installed using virtualenv? pip? conda?: virtualenv
- Bazel version (if compiling from source): 0.24.1
- GCC/Compiler version (if compiling from source): 4.9.2
- CUDA/cuDNN version: 10.0.130/7.4.2.24
- GPU model and memory: Tesla P100
Describe the problem
We have compilers installed in /shared/ucl/apps/gcc/4.9.2. Unfortunately, /shared is a sym-link to /lustre/shared. Attempting to build TensorFlow with CUDA support results in this sym-link being inconsistently de-referenced, meaning that some rules refer to the /shared location while the system include paths refer to /lustre/shared. This seems to be an issue only with the C compiler used by nvcc, not with the one used to build the rest of the code.
Telling the configure script to use /lustre/shared/ucl/apps/gcc/4.9.2/bin/gcc as the nvcc C compiler works around the problem, but is less than ideal because this path differs between clusters. It could also be an issue if a user on a multi-user system uses their own install of GCC in /home, where /home is a sym-link. This appears to be a bug in the way the configure script and/or Bazel deals with sym-links.
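To make the mismatch concrete, here is a small illustration (assuming a filesystem laid out like ours, with /shared pointing at /lustre/shared): Python's os.path.abspath keeps the sym-link while os.path.realpath resolves it, and the build appears to mix the two forms of the same compiler path.

import os

# Illustration only: once /shared is a sym-link to /lustre/shared, the same
# compiler can be referred to by two different absolute paths, which is what
# lets some Bazel rules and the nvcc host-compiler include paths disagree.
path = "/shared/ucl/apps/gcc/4.9.2/bin/gcc"
print(os.path.abspath(path))   # /shared/ucl/apps/gcc/4.9.2/bin/gcc (sym-link kept)
print(os.path.realpath(path))  # /lustre/shared/... on a system laid out like ours (sym-link resolved)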
Provide the exact sequence of commands / steps that you executed before running into the problem
The script at https://github.com/owainkenwayucl/install-scripts/blob/master/scripts/tensorflow/tensorflow-1.14.0-py37-gpu_install
But effectively:
CONFIG_OPTS="--config=cuda --copt=-march=broadwell --cxxopt=-D_GLIBCXX_USE_CXX11_ABI=0 --copt=-O3"
export TF_CUDA_PATHS=/shared/ucl/apps/cuda/10.0.130/gnu-4.9.2,/shared/ucl/apps/cudnn/7.4.2.24/10.0/cuda
./configure
bazel build --verbose_failures $CONFIG_OPTS //tensorflow/tools/pip_package:build_pip_package
Answering “yes” to building with CUDA and selecting the defaults for compute capability, NCCL, etc.
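For completeness, the workaround mentioned above can be scripted rather than typed at the configure prompt. This is only a sketch, and it assumes configure.py honours a GCC_HOST_COMPILER_PATH environment variable for the nvcc host compiler; that variable name is from memory, so treat it as an assumption.

import os
import subprocess

# Sketch of the workaround: hand configure the fully resolved, sym-link-free
# gcc path so the nvcc host compiler and its derived include paths agree.
real_gcc = os.path.realpath("/shared/ucl/apps/gcc/4.9.2/bin/gcc")
os.environ["GCC_HOST_COMPILER_PATH"] = real_gcc  # assumed variable name
subprocess.run(["./configure"], check=True)      # remaining prompts answered interactively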
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 20 (11 by maintainers)
Commits related to this issue
- Patch to solve sym-link problem https://github.com/tensorflow/tensorflow/issues/32776 — committed to owainkenwayucl/install-scripts by owainkenwayucl 5 years ago
Having looked at configure.py, I believe it's as simple as adding a sym-link-resolving call at line 842 in the current version (line 831 in 1.14.0); see the sketch below.
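For reference, a minimal sketch of the kind of change I mean. The exact patch is in the commit linked above; the function name and surrounding logic here are my reconstruction of configure.py, not a quote from it.

import os

def set_gcc_host_compiler_path(environ_cp):
    """Sketch of the proposed fix in configure.py (reconstruction, not a quote)."""
    gcc_host_compiler_path = environ_cp.get("GCC_HOST_COMPILER_PATH", "")
    # Proposed addition: de-reference sym-links so the path recorded for the
    # nvcc host compiler matches the system include paths derived from it.
    gcc_host_compiler_path = os.path.realpath(gcc_host_compiler_path)
    environ_cp["GCC_HOST_COMPILER_PATH"] = gcc_host_compiler_path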
I’m happy to do this and do a pull request if you prefer?
Do you think it is a different issue? If so, I’ll create another issue.
Build failure: