tensorflow: Fails to build TF 1.15 on CUDA 11.1

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows
  • TensorFlow installed from (source or binary): source
  • TensorFlow version: 1.15
  • Python version: 3.6
  • Installed using virtualenv? pip? conda?: pip
  • Bazel version (if compiling from source): 0.25.3
  • GCC/Compiler version (if compiling from source): MSVC 2017
  • CUDA/cuDNN version: 11.1 / 7.6.0
  • GPU model and memory: RTX 2080 Ti

Describe the problem

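Unable to build TF 1.15 from source against CUDA 11.1: compiling tensorflow/core/kernels/cuda_sparse.cc fails with undeclared cuSPARSE identifiers.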

Any other info / logs

Execution platform: @bazel_tools//platforms:host_platform
tensorflow/core/kernels/cuda_sparse.cc(212): error C2065: 'cusparseSgtsv': undeclared identifier
tensorflow/core/kernels/cuda_sparse.cc(212): error C2065: 'cusparseDgtsv': undeclared identifier
tensorflow/core/kernels/cuda_sparse.cc(212): error C2065: 'cusparseCgtsv': undeclared identifier
tensorflow/core/kernels/cuda_sparse.cc(212): error C2065: 'cusparseZgtsv': undeclared identifier
tensorflow/core/kernels/cuda_sparse.cc(224): error C2065: 'cusparseSgtsv_nopivot': undeclared identifier
tensorflow/core/kernels/cuda_sparse.cc(224): error C2065: 'cusparseDgtsv_nopivot': undeclared identifier
tensorflow/core/kernels/cuda_sparse.cc(224): error C2065: 'cusparseCgtsv_nopivot': undeclared identifier
tensorflow/core/kernels/cuda_sparse.cc(224): error C2065: 'cusparseZgtsv_nopivot': undeclared identifier
tensorflow/core/kernels/cuda_sparse.cc(250): error C2065: 'cusparseSgtsvStridedBatch': undeclared identifier
tensorflow/core/kernels/cuda_sparse.cc(250): error C2065: 'cusparseDgtsvStridedBatch': undeclared identifier
tensorflow/core/kernels/cuda_sparse.cc(250): error C2065: 'cusparseCgtsvStridedBatch': undeclared identifier
tensorflow/core/kernels/cuda_sparse.cc(250): error C2065: 'cusparseZgtsvStridedBatch': undeclared identifier
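
The undeclared identifiers are the legacy cuSPARSE tridiagonal solvers (gtsv, gtsv_nopivot, gtsvStridedBatch). As far as I know, these were dropped from cusparse.h in CUDA 11 in favor of the gtsv2 variants, which would explain why TF 1.15 sources no longer compile. A minimal sketch for checking this against an installed toolkit follows; the helper name missing_symbols and the default header path are assumptions for illustration, not part of TensorFlow or CUDA.

```python
import re
import sys
from pathlib import Path

# Identifiers taken from the C2065 errors in the build log above.
PREFIXES = "SDCZ"
SUFFIXES = ("gtsv", "gtsv_nopivot", "gtsvStridedBatch")
LEGACY_SYMBOLS = [f"cusparse{p}{s}" for s in SUFFIXES for p in PREFIXES]


def missing_symbols(header_text, symbols=LEGACY_SYMBOLS):
    """Return the symbols that do not appear as whole identifiers in header_text."""
    # Whole-identifier matching matters: cusparseSgtsv2 must not count as cusparseSgtsv.
    found = set(re.findall(r"[A-Za-z_]\w*", header_text))
    return [s for s in symbols if s not in found]


if __name__ == "__main__":
    # Hypothetical default location; pass the real path to cusparse.h as the first argument.
    default = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include\cusparse.h"
    header = Path(sys.argv[1] if len(sys.argv) > 1 else default)
    if header.is_file():
        for name in missing_symbols(header.read_text(errors="ignore")):
            print("not declared:", name)
    else:
        print("header not found:", header)
```

If every symbol is reported as not declared, the build failure comes from the toolkit headers rather than from the local Bazel or MSVC setup.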

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 22 (7 by maintainers)

Most upvoted comments

My large application is built on TF 1.

I cannot upgrade it to the TF 2 API, because that would require extensive modifications and testing from scratch, which costs time and money.

It seems TensorFlow™ does not provide backward compatibility for new CUDA versions, so even a two-year-old application will not support new cards. That is a serious impact on businesses and companies that use TF. Where is the guarantee that it will not happen again?

I already regret not choosing PyTorch in the first place. Burn in hell, Google; die, TensorFlow.

To use TensorFlow 1.1x on CUDA 11.x, I think you should use nvidia-tensorflow. The installation is quite simple. (https://developer.nvidia.com/blog/accelerating-tensorflow-on-a100-gpus/)

Install the NVIDIA driver (455.23). After installing it, check the status of the GPU using nvidia-smi. Then install tf-1.15 as follows:

sudo apt update
sudo apt install -y python3-dev python3-pip git
pip3 install --upgrade pip setuptools requests

pip install -U virtualenv
virtualenv --system-site-packages -p python3 /venv
source /venv/bin/activate

pip install nvidia-pyindex
pip install nvidia-tensorflow[horovod]

This should install tf-1.15 with CUDA 11.1 support. Test it as follows:

python -c 'import tensorflow as tf; print(tf.__version__)'

python -c "import tensorflow as tf; print('Num GPUs Available: ', len(tf.config.experimental.list_physical_devices('GPU')))"
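
The two checks can also be combined into one small script, which avoids shell quoting problems. This is a minimal sketch, assuming the nvidia-tensorflow wheel from the steps above is installed in the active virtualenv; gpu_report is a hypothetical helper name, and the script only prints a message when TensorFlow cannot be imported.

```python
def gpu_report():
    """Return (tf_version, gpu_count), or (None, 0) if TensorFlow is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None, 0
    # Same call as the one-liner above; it lists the GPUs TensorFlow can see.
    gpus = tf.config.experimental.list_physical_devices("GPU")
    return tf.__version__, len(gpus)


if __name__ == "__main__":
    version, count = gpu_report()
    if version is None:
        print("TensorFlow is not installed in this environment")
    else:
        print("TensorFlow version:", version)
        print("Num GPUs Available:", count)
```

A GPU count of 0 with a valid version string usually points at the driver or CUDA libraries rather than at the pip install itself.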

I think you should use nvidia-tensorflow. (https://developer.nvidia.com/blog/accelerating-tensorflow-on-a100-gpus/)

Is there any way to compile a working libtensorflow.dll with it? I installed CUDA 10.0 on a system with a 3070, and the tensorflow.dll 1.15 that I'm currently using doesn't give me any errors, but initialization and inference are extremely slow.