tensorflow: CUDNN_STATUS_NOT_INITIALIZED error with tensorflow-gpu 2.4.0-rc2 RTX3070 CUDA11.0 cudnn8.0.2 windows10 and pip

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: –
TensorFlow installed from (source or binary): binary
TensorFlow version: 2.4.0rc2
Python version: 3.8.5
Installed using virtualenv? pip? conda?: pip
Bazel version (if compiling from source): –
GCC/Compiler version (if compiling from source): –
CUDA/cuDNN version: CUDA 11.0, cudnn-11.0-windows-x64-v8.0.2.39
Graphics driver version: 457.30
GPU model and memory: RTX 3070 8G

Describe the problem

I installed CUDA and cuDNN following the requirements for TensorFlow-gpu 2.4.0-rc2 listed at https://github.com/tensorflow/tensorflow/releases, but I get a cuDNN initialization error. I have uninstalled and reinstalled several times. I also tried TensorFlow-gpu 2.4.0-rc1 several days ago, and the same error was raised.

Provide the exact sequence of commands / steps that you executed before running into the problem

The following code reproduces the error:

import tensorflow as tf
import numpy as np
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential


x = np.random.normal(size=(100, 28, 28, 1)).astype(np.float32)
y = np.zeros([100, 10], dtype=np.float32)
y[:, 1] = 1.

train_ds = tf.data.Dataset.from_tensor_slices((x, y)).shuffle(buffer_size=100).batch(32)
num_classes = 10

model = Sequential([
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes)
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
epochs=10
history = model.fit(
  train_ds,
  epochs=epochs
)

Any other info / logs

Logs:

2020-11-20 11:38:38.097262: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2020-11-20 11:38:40.566842: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2020-11-20 11:38:40.568068: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll
2020-11-20 11:38:40.604223: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:65:00.0 name: GeForce RTX 3070 computeCapability: 8.6
coreClock: 1.815GHz coreCount: 46 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.29GiB/s
2020-11-20 11:38:40.604439: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2020-11-20 11:38:40.618831: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2020-11-20 11:38:40.618938: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2020-11-20 11:38:40.622412: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2020-11-20 11:38:40.623498: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2020-11-20 11:38:40.631377: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2020-11-20 11:38:40.633911: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2020-11-20 11:38:40.634568: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2020-11-20 11:38:40.634734: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2020-11-20 11:38:40.635316: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-11-20 11:38:40.636515: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:65:00.0 name: GeForce RTX 3070 computeCapability: 8.6
coreClock: 1.815GHz coreCount: 46 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.29GiB/s
2020-11-20 11:38:40.637060: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2020-11-20 11:38:40.637713: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2020-11-20 11:38:40.638072: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2020-11-20 11:38:40.638277: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2020-11-20 11:38:40.638448: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2020-11-20 11:38:40.638724: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2020-11-20 11:38:40.638921: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2020-11-20 11:38:40.639189: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2020-11-20 11:38:40.639443: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2020-11-20 11:38:41.262522: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-20 11:38:41.262635: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0 
2020-11-20 11:38:41.262694: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N 
2020-11-20 11:38:41.262900: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6177 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3070, pci bus id: 0000:65:00.0, compute capability: 8.6)
2020-11-20 11:38:41.263873: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
Epoch 1/10
2020-11-20 11:38:42.052106: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2020-11-20 11:38:42.135350: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2020-11-20 11:38:42.994775: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2020-11-20 11:38:43.000109: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2020-11-20 11:38:43.875475: E tensorflow/stream_executor/cuda/cuda_dnn.cc:336] Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
2020-11-20 11:38:43.875620: E tensorflow/stream_executor/cuda/cuda_dnn.cc:340] Error retrieving driver version: Unimplemented: kernel reported driver version not implemented on Windows
2020-11-20 11:38:43.877456: E tensorflow/stream_executor/cuda/cuda_dnn.cc:336] Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
2020-11-20 11:38:43.877755: E tensorflow/stream_executor/cuda/cuda_dnn.cc:340] Error retrieving driver version: Unimplemented: kernel reported driver version not implemented on Windows
Traceback (most recent call last):
  File "F:/python_ws/helloworld/main.py", line 29, in <module>
    history = model.fit(
  File "C:\Users\dell\Anaconda3\envs\tf24rcpy38\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1100, in fit
    tmp_logs = self.train_function(iterator)
  File "C:\Users\dell\Anaconda3\envs\tf24rcpy38\lib\site-packages\tensorflow\python\eager\def_function.py", line 828, in __call__
    result = self._call(*args, **kwds)
  File "C:\Users\dell\Anaconda3\envs\tf24rcpy38\lib\site-packages\tensorflow\python\eager\def_function.py", line 888, in _call
    return self._stateless_fn(*args, **kwds)
  File "C:\Users\dell\Anaconda3\envs\tf24rcpy38\lib\site-packages\tensorflow\python\eager\function.py", line 2942, in __call__
    return graph_function._call_flat(
  File "C:\Users\dell\Anaconda3\envs\tf24rcpy38\lib\site-packages\tensorflow\python\eager\function.py", line 1918, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "C:\Users\dell\Anaconda3\envs\tf24rcpy38\lib\site-packages\tensorflow\python\eager\function.py", line 555, in call
    outputs = execute.execute(
  File "C:\Users\dell\Anaconda3\envs\tf24rcpy38\lib\site-packages\tensorflow\python\eager\execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node sequential/conv2d/Conv2D (defined at F:/python_ws/helloworld/main.py:29) ]] [Op:__inference_train_function_830]

Function call stack:
train_function


Process finished with exit code 1
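
In a log this long, the few error-level lines are easy to miss among the informational ones. As a rough sketch, the snippet below pulls out only the `E` (error) and `F` (fatal) lines. It assumes the `timestamp: LEVEL source:line] message` layout seen in the log above; `LOG_LINE` and `error_lines` are hypothetical names chosen for illustration, not part of TensorFlow.

```python
import re

# Matches lines such as:
# 2020-11-20 11:38:43.875475: E tensorflow/.../cuda_dnn.cc:336] Could not create cudnn handle: ...
# The layout is an assumption based on the log pasted in this issue.
LOG_LINE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<level>[IWEF]) (?P<source>\S+:\d+)\] (?P<message>.*)$"
)


def error_lines(log_text, levels=("E", "F")):
    """Return (source, message) pairs for log lines at the given severity levels."""
    found = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") in levels:
            found.append((match.group("source"), match.group("message")))
    return found


# Two lines copied from the log above: one informational, one error.
sample = """\
2020-11-20 11:38:43.000109: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2020-11-20 11:38:43.875475: E tensorflow/stream_executor/cuda/cuda_dnn.cc:336] Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
"""

for source, message in error_lines(sample):
    print(source, "->", message)
```

Run against the full log, this would surface the `cuda_dnn.cc:336` and `cuda_dnn.cc:340` messages. Lines that do not follow the assumed layout, such as the Python traceback, are skipped.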

About this issue

State: closed
Created 4 years ago
Comments: 58

Most upvoted comments

Add this to the start of your code:

physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)

I tried the code on my system, and it does not work without these lines. After adding them, it works.

+10

king398 on Nov 23, 2020
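
The suggestion above only touches `physical_devices[0]`, which raises an `IndexError` when no GPU is visible. A slightly more defensive sketch, assuming the same two `tf.config` calls, loops over every visible GPU. The wrapper name `enable_memory_growth` and its `config` parameter are hypothetical. The parameter exists only so the function can be tried without TensorFlow installed.

```python
def enable_memory_growth(config=None):
    """Enable memory growth on every visible GPU and return how many were set.

    `config` defaults to `tf.config`; passing another object with the same
    two methods is an illustration-only convenience, not a TensorFlow feature.
    """
    if config is None:
        # Imported lazily so the module can be loaded without TensorFlow.
        import tensorflow as tf
        config = tf.config

    gpus = config.list_physical_devices('GPU')
    for gpu in gpus:
        # Allocate GPU memory on demand instead of reserving it all at start-up.
        config.experimental.set_memory_growth(gpu, True)
    return len(gpus)
```

Call `enable_memory_growth()` once, right after `import tensorflow as tf` and before building the model: TensorFlow raises a `RuntimeError` if memory growth is changed after the GPU has been initialized. Whether this resolves the cuDNN error on a given machine is not guaranteed.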

I am having the same issues.

After king398’s fix, the CUDNN_STATUS_NOT_INITIALIZED error no longer appears, but the epochs do not run.

Windows 10, RTX 3070

Bchi1994 on Nov 23, 2020