tensorflow: Fatal error while compiling TensorFlow with CUDA 9.1

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04.3 LTS (GNU/Linux 4.4.0-104-generic x86_64)
  • TensorFlow installed from (source or binary): Source
  • TensorFlow version (use command below): unknown 1.4.0 (Source code is cloned from 798fa36d11119e6fdc13b90a14abfe1805e7de90)
  • Python version: 3.6.3
  • Bazel version (if compiling from source): 0.8.1
  • GCC/Compiler version (if compiling from source): gcc version 5.4.0 20160609
  • CUDA/cuDNN version: CUDA 9.1 cuDNN 7.0.5
  • GPU model and memory: 2 * Tesla V100-PCIE-16GB
  • Exact command to reproduce: See description below.

Describe the problem

When compiling the latest TensorFlow (cloned from 798fa36d11119e6fdc13b90a14abfe1805e7de90), the following error is raised:

ERROR: /home/ubuntu/tensorflow/tensorflow/contrib/seq2seq/BUILD:64:1: error while parsing .d file: /home/ubuntu/.cache/bazel/_bazel_ubuntu/ad1e09741bb4109fbc70ef8216b59ee2/execroot/org_tensorflow/bazel-out/k8-py3-opt/bin/tensorflow/contrib/seq2seq/_objs/python/ops/_beam_search_ops_gpu/tensorflow/contrib/seq2seq/kernels/beam_search_ops_gpu.cu.pic.d (No such file or directory)
In file included from external/eigen_archive/unsupported/Eigen/CXX11/Tensor:14:0,
                 from ./third_party/eigen3/unsupported/Eigen/CXX11/Tensor:1,
                 from ./tensorflow/contrib/seq2seq/kernels/beam_search_ops.h:19,
                 from tensorflow/contrib/seq2seq/kernels/beam_search_ops_gpu.cu.cc:20:
external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/Core:59:34: fatal error: math_functions.hpp: No such file or directory

It turns out that in CUDA 9.1, math_functions.hpp is located at cuda/include/crt/math_functions.hpp rather than at cuda/include/math_functions.hpp, where CUDA 9.0 keeps it. Eigen still includes the header from the old location, which leads to this error. As a workaround, running ln -s /usr/local/cuda/include/crt/math_functions.hpp /usr/local/cuda/include/math_functions.hpp makes the header visible at the old path and lets the build complete.
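The workaround can be wrapped in a small guard so that it does not create a dangling link or clobber an existing header. This is only a sketch: link_math_functions is a hypothetical helper name, and the default root /usr/local/cuda is taken from the command in the description; adjust it if CUDA is installed elsewhere (for example /usr/local/cuda-9.1).

```shell
# Sketch of the symlink workaround, not an official fix.
# link_math_functions is a hypothetical helper; the default CUDA root
# (/usr/local/cuda) is an assumption taken from the issue description.
link_math_functions() {
  lmf_root="${1:-/usr/local/cuda}"
  lmf_target="$lmf_root/include/crt/math_functions.hpp"
  lmf_link="$lmf_root/include/math_functions.hpp"

  # Nothing to do if the header is already reachable at the old location.
  if [ -e "$lmf_link" ]; then
    echo "already present: $lmf_link"
    return 0
  fi

  # Refuse to create a dangling link if the crt/ header is missing.
  if [ ! -f "$lmf_target" ]; then
    echo "not found: $lmf_target" >&2
    return 1
  fi

  ln -s "$lmf_target" "$lmf_link"
  echo "linked: $lmf_link -> $lmf_target"
}
```

Calling link_math_functions with no argument acts on /usr/local/cuda; passing a path such as /usr/local/cuda-9.1 acts on that installation instead. Writing under /usr/local normally requires root, so the function would have to be defined and run in a root shell. The link is a local workaround only; the Eigen fix referenced in the comments removes the need for it.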

Reference

https://stackoverflow.com/a/47807106/2666624

Source code / logs

Traceback is available above.

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 26
  • Comments: 20 (9 by maintainers)

Most upvoted comments

I issued a PR to Eigen which should resolve this issue.

I’ve merged it. @gunan sure, it will be in rc1

Creating a symlink at /usr/local/cuda-9.1/include/math_functions.hpp that points to /usr/local/cuda-9.1/include/crt/math_functions.hpp solved the problem for me.

Update: thanks to @icyblade for the reminder that this method has already been mentioned above.

This is caused by Eigen. The correct fix is here: https://github.com/tensorflow/tensorflow/pull/15796

@av8ramit Once the PR above is merged, can we cherry-pick it to the release? Just the 2nd commit in the PR is sufficient.

Visit http://www.python36.com/install-tensorflow141-gpu/ for step by step instructions to build tensorflow from source for cuda 9.1 and cudnn 7.0.5 on ubuntu x64.