tensorflow: __hadd() is ambiguous when EIGEN_CUDA_ARCH >= 530

System information

  • Have I written custom code: I changed the CUDA compute capabilities to 6.1 and 7.0.
  • OS Platform and Distribution: Windows 10
  • TensorFlow installed from (source or binary): I’m compiling the source code.
  • TensorFlow version (use command below): branch r1.8, 8753e2ebde6c58b56675cc19ab7ff83072824a62
  • Python version: 3.6.0
  • Bazel version (if compiling from source): No
  • GCC/Compiler version (if compiling from source): VS 2017(v141), but v140 for CUDA host compiler
  • CUDA/cuDNN version: CUDA 9.0, cuDNN 7.1
  • GPU model and memory: 1080 Ti, Titan V
  • Exact command to reproduce: cmake-gui, enable GPU, and change CUDA host compiler to v140

Describe the problem

__hadd() is ambiguous when EIGEN_CUDA_ARCH >= 530.

The following screenshot from VS 2017 shows where the ambiguity comes from: [image]

Source code / logs

tf_core_gpu_kernels compilation fails because of this problem:

42>Building NVCC (Device) object CMakeFiles/tf_core_gpu_kernels.dir/__/__/core/kernels/Release/tf_core_gpu_kernels_generated_check_numerics_op_gpu.cu.cc.obj
42>check_numerics_op_gpu.cu.cc
42>e:\program\ml\tensorflow_build_05-10-01\external\eigen_archive\eigen\src/Core/arch/CUDA/Half.h(212): error : more than one instance of overloaded function "__hadd" matches the argument list:
42>            function "__hadd(int, int)"
42>            function "__hadd(__half, __half)"
42>            argument types are: (const Eigen::half, const Eigen::half)
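The error can be modeled on the host with a few lines of plain C++. This is only a simplified sketch, not the real Eigen or CUDA code: `FakeHalf`, `WrappedHalf`, `hadd`, `add` and `usage_example` are hypothetical stand-ins, and the sketch assumes that the wrapper type converts implicitly both to the native half type and to `int`, which would make both overloads equally viable. Under that assumption, converting the arguments explicitly is one way to remove the ambiguity.

```cpp
// Simplified host-only model of the overload ambiguity.
// All names here are hypothetical; this is NOT the real Eigen or CUDA code.
#include <cassert>

// Stand-in for CUDA's native __half type.
struct FakeHalf {
  float value;
};

// Stand-in for Eigen::half. ASSUMPTION: it converts implicitly to both the
// native half type and int, so both hadd overloads below are viable.
struct WrappedHalf {
  float value;
  operator FakeHalf() const { return FakeHalf{value}; }
  operator int() const { return static_cast<int>(value); }
};

// Two overloads mirroring the candidates listed in the error message.
int hadd(int a, int b) { return (a + b) / 2; }  // integer average
FakeHalf hadd(FakeHalf a, FakeHalf b) {         // half-precision style add
  return FakeHalf{a.value + b.value};
}

WrappedHalf add(const WrappedHalf& a, const WrappedHalf& b) {
  // hadd(a, b) would not compile here: each argument needs a user-defined
  // conversion for either overload, so neither candidate is better.
  // Converting explicitly selects the half overload.
  FakeHalf sum = hadd(static_cast<FakeHalf>(a), static_cast<FakeHalf>(b));
  return WrappedHalf{sum.value};
}

// Small usage check: 1.5 + 2.0 is exactly representable as a float.
inline void usage_example() {
  WrappedHalf r = add(WrappedHalf{1.5f}, WrappedHalf{2.0f});
  assert(r.value == 3.5f);
}
```

Whether the real build hits exactly this conversion pattern depends on the Eigen version and the host compiler, so treat the sketch as an illustration of the mechanism rather than a diagnosis.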

I’m surprised that nobody has ever posted such an issue. Has nobody tried changing the CUDA compute capabilities to >= 5.3? Or is there something wrong with my environment?

It seems that this is a pure Eigen issue…

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 8
  • Comments: 25 (17 by maintainers)

Most upvoted comments

I found a bug report in the Eigen bug tracker: http://eigen.tuxfamily.org/bz/show_bug.cgi?id=1526

A patch posted there is helpful.

I think the Eigen upgrade is complete, so I will close this bug.

@gunan In fact, the bug still exists even with TensorFlow v1.12.0 and Bazel 0.15.0 on Windows when building the GPU version.

1.12 did not include the Eigen version upgrade. I do not think we will backport the fix to 1.12, as it was quite a large and complicated change to incorporate.