tensorflow: UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]


System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Win10
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): TF 2.0 alpha0
  • Python version: 3.6
  • Bazel version (if compiling from source): I don’t know
  • GCC/Compiler version (if compiling from source): I don’t know
  • CUDA/cuDNN version: cudnn 7.3.1, Cuda 10.0.1
  • GPU model and memory: Titan RTX 24GB

You can collect some of this information using our environment capture script. You can also obtain the TensorFlow version with:

  1. TF 1.0: python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
  2. TF 2.0: python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"

Describe the current behavior

I built a single Conv2D layer and passed it a random input, nothing more. The call throws the error above. I’m using Anaconda. I installed tf-gpu 2.0 alpha0 from the command line, because Anaconda doesn’t offer it.

Describe the expected behavior

Code to reproduce the issue

Provide a reproducible test case that is the bare minimum necessary to generate the problem.

import numpy as np
import tensorflow as tf

layer = tf.keras.layers.Conv2D(2, (3, 3))
layer(np.random.rand(5, 5, 3, 1))
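The "expected behavior" field was left empty. Assuming the Keras defaults for Conv2D (padding "valid", stride 1, channels-last layout), the call should simply return a tensor rather than raise. The following is a minimal sketch in plain Python of the shape that tensor would have; the helper name `conv2d_valid_output_shape` is made up for illustration and is not part of TensorFlow.

```python
def conv2d_valid_output_shape(input_shape, filters, kernel_size, strides=(1, 1)):
    """Output shape of a channels-last 2D convolution with 'valid' padding.

    Hypothetical helper for illustration only; it mirrors the usual
    formula out = (in - kernel) // stride + 1 per spatial dimension.
    """
    batch, height, width, _channels = input_shape
    k_h, k_w = kernel_size
    s_h, s_w = strides
    if height < k_h or width < k_w:
        raise ValueError("kernel is larger than the input")
    out_h = (height - k_h) // s_h + 1
    out_w = (width - k_w) // s_w + 1
    return (batch, out_h, out_w, filters)


# Shape for the repro above: batch 5, height 5, width 3, 1 channel.
print(conv2d_valid_output_shape((5, 5, 3, 1), filters=2, kernel_size=(3, 3)))
# prints (5, 3, 1, 2)
```

So the repro input is valid for a 3x3 kernel, which suggests the failure comes from the GPU setup rather than from the shapes.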

Other info / logs Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.


UnknownError                              Traceback (most recent call last)
<ipython-input-4-7d2f5dcef289> in <module>
----> 1 layer(np.random.rand(5,5,3,1))

~.conda\envs\alphagpu\lib\site-packages\tensorflow\python\keras\engine\base_layer.py in __call__(self, inputs, *args, **kwargs)
    658           with base_layer_utils.autocast_context_manager(
    659               input_list, self._mixed_precision_policy.should_cast_variables):
--> 660             outputs = self.call(inputs, *args, **kwargs)
    661           self._handle_activity_regularization(inputs, outputs)
    662           self._set_mask_metadata(inputs, outputs, previous_mask)

~.conda\envs\alphagpu\lib\site-packages\tensorflow\python\keras\layers\convolutional.py in call(self, inputs)
    194
    195   def call(self, inputs):
--> 196     outputs = self._convolution_op(inputs, self.kernel)
    197
    198     if self.use_bias:

~.conda\envs\alphagpu\lib\site-packages\tensorflow\python\ops\nn_ops.py in __call__(self, inp, filter)
   1076
   1077   def __call__(self, inp, filter):  # pylint: disable=redefined-builtin
-> 1078     return self.conv_op(inp, filter)
   1079
   1080

~.conda\envs\alphagpu\lib\site-packages\tensorflow\python\ops\nn_ops.py in __call__(self, inp, filter)
    632
    633   def __call__(self, inp, filter):  # pylint: disable=redefined-builtin
--> 634     return self.call(inp, filter)
    635
    636

~.conda\envs\alphagpu\lib\site-packages\tensorflow\python\ops\nn_ops.py in __call__(self, inp, filter)
    231         padding=self.padding,
    232         data_format=self.data_format,
--> 233         name=self.name)
    234
    235

~.conda\envs\alphagpu\lib\site-packages\tensorflow\python\ops\nn_ops.py in conv2d(input, filter, strides, padding, use_cudnn_on_gpu, data_format, dilations, name, filters)
   1949                            data_format=data_format,
   1950                            dilations=dilations,
-> 1951                            name=name)
   1952
   1953

~.conda\envs\alphagpu\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py in conv2d(input, filter, strides, padding, use_cudnn_on_gpu, explicit_paddings, data_format, dilations, name)
   1119         input, filter, strides=strides, use_cudnn_on_gpu=use_cudnn_on_gpu,
   1120         padding=padding, explicit_paddings=explicit_paddings,
-> 1121         data_format=data_format, dilations=dilations, name=name, ctx=_ctx)
   1122     except _core._SymbolicException:
   1123       pass  # Add nodes to the TensorFlow graph.

~.conda\envs\alphagpu\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py in conv2d_eager_fallback(input, filter, strides, padding, use_cudnn_on_gpu, explicit_paddings, data_format, dilations, name, ctx)
   1218   explicit_paddings, "data_format", data_format, "dilations", dilations)
   1219   _result = _execute.execute(b"Conv2D", 1, inputs=_inputs_flat, attrs=_attrs,
-> 1220                              ctx=_ctx, name=name)
   1221   _execute.record_gradient(
   1222       "Conv2D", _inputs_flat, _attrs, _result, name)

~.conda\envs\alphagpu\lib\site-packages\tensorflow\python\eager\execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     64     else:
     65       message = e.message
---> 66     six.raise_from(core._status_to_exception(e.code, message), None)
     67   except TypeError as e:
     68     if any(ops._is_keras_symbolic_tensor(x) for x in inputs):

~.conda\envs\alphagpu\lib\site-packages\six.py in raise_from(value, from_value)

UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 2
  • Comments: 32

Most upvoted comments

I was facing the same issue with tensorflow-gpu version 2.0-beta, and figured out that this is definitely a TensorFlow/CUDA/cuDNN version mismatch problem. The issue occurred because my default Python came from Anaconda, whereas I was installing CUDA/CUDNN as instructed on the TF-gpu installation page. I installed the latest versions of both CUDA and CUDNN, but every time I checked, the versions shown were older ones and not the ones I had installed. So I installed everything again using conda instead, and it worked flawlessly with the following versions:

Tensorflow version – 2.0.0-beta1
CUDA version – 10.1.168
CUDNN version – 7.6.0

Use the following command to check CUDA installation by Conda:

conda list cudatoolkit

And the following command to check CUDNN version installed by conda:

conda list cudnn

If you want to install/update CUDA and CUDNN through CONDA, please use the following commands:

conda install -c anaconda cudatoolkit
conda install -c anaconda cudnn
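If you want to script the conda list checks above, here is a minimal sketch that pulls package names and versions out of the text that conda list prints. The function name `parse_conda_list` and the sample text are illustrative assumptions (the version is the one quoted above, the environment path and build column are made up); feed it the real output from your own environment.

```python
def parse_conda_list(text):
    """Map package name -> version from `conda list` text output.

    Hypothetical helper: assumes the usual whitespace-separated columns
    (name, version, build, channel) and '#'-prefixed header lines.
    """
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the commented header
        fields = line.split()
        if len(fields) >= 2:
            versions[fields[0]] = fields[1]
    return versions


# Illustrative text shaped like the output of `conda list cudatoolkit`.
SAMPLE_OUTPUT = """\
# packages in environment at /path/to/env:
#
# Name                    Version                   Build  Channel
cudatoolkit               10.1.168                      0
"""

print(parse_conda_list(SAMPLE_OUTPUT))
```

An empty result means conda has no such package in the active environment, which is a hint that TensorFlow is picking up a CUDA/CUDNN install from somewhere else.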

Alternatively, you can use either of the following commands to check the CUDA installation:

nvidia-smi
nvcc --version
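Note that the two commands answer different questions: as far as I know, nvidia-smi reports the driver version and the highest CUDA version that driver supports, while nvcc --version reports the toolkit that is actually on your PATH. Below is a small sketch for extracting the toolkit release from the nvcc text. The function name `cuda_release_from_nvcc` is made up, and the sample string is only an illustration of the usual format, not output captured from the reporter's machine.

```python
import re


def cuda_release_from_nvcc(text):
    """Return (major, minor) parsed from `nvcc --version` text, or None.

    Hypothetical helper: looks for the 'release X.Y' fragment that nvcc
    prints on its 'Cuda compilation tools' line.
    """
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


# Illustrative text in the shape nvcc prints; paste your own output here.
SAMPLE_NVCC = (
    "nvcc: NVIDIA (R) Cuda compiler driver\n"
    "Cuda compilation tools, release 10.0, V10.0.130\n"
)

print(cuda_release_from_nvcc(SAMPLE_NVCC))
# prints (10, 0)
```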

If you are using tensorflow-gpu through the Anaconda package (you can verify this by opening Python in a console and checking whether the banner shows Anaconda, Inc. when it starts, or by running which python and checking the location), then manually installing CUDA and CUDNN will most probably not work. You will have to update through conda instead.
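The same check can be done from inside Python, which also works on Windows where which is usually not available. This is a rough sketch that relies on one assumption: conda environments keep a conda-meta directory in their root. The helper name `is_conda_prefix` is made up for illustration.

```python
import os
import sys


def is_conda_prefix(prefix):
    """True if `prefix` looks like the root of a conda environment.

    Assumption: conda environments contain a 'conda-meta' directory.
    """
    return os.path.isdir(os.path.join(prefix, "conda-meta"))


# Show which interpreter is running and whether it lives in a conda env.
print("interpreter:", sys.executable)
print("conda environment:", is_conda_prefix(sys.prefix))
```

If the second line prints True, the CUDA and CUDNN libraries TensorFlow loads are most likely the ones conda installed, not the ones from a manual install.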

If you want to install CUDA, CUDNN, or tensorflow-gpu manually, you can check out the instructions here https://www.tensorflow.org/install/gpu

I downgraded cuDNN from 7.6.2 to 7.4.1 and restarted, and the problem was solved. My CUDA has been 10.0 throughout. But the TensorFlow page says [cuDNN SDK (>= 7.4.1)], so I installed cuDNN 7.6.2 at first, and it worked fine until I started using CNNs.

So this should be [cuDNN SDK (= 7.4.1)]
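For a manual (non-conda) install, one way to see which cuDNN you really have is to read the version defines from the header. Here is a minimal sketch, assuming the cuDNN 7 layout where cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL (newer releases keep them in cudnn_version.h). The function name and the sample header text are illustrative; the location of the header depends on where you installed cuDNN.

```python
import re


def cudnn_version_from_header(header_text):
    """Return (major, minor, patch) parsed from cuDNN header text.

    Hypothetical helper: expects '#define CUDNN_MAJOR 7' style lines.
    """
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+%s\s+(\d+)" % key
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            raise ValueError("%s not found in header text" % key)
        parts.append(int(match.group(1)))
    return tuple(parts)


# Illustrative excerpt; read the real file from your cuDNN install instead.
SAMPLE_HEADER = """\
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 2
"""

installed = cudnn_version_from_header(SAMPLE_HEADER)
documented_minimum = (7, 4, 1)  # the ">= 7.4.1" quoted from the install page

print("installed:", installed)
print("meets documented minimum:", installed >= documented_minimum)
print("exactly the version that worked here:", installed == documented_minimum)
```

The last two lines show the two readings side by side: 7.6.2 satisfies the documented ">=" requirement, yet in the report above only an exact 7.4.1 worked with that particular TensorFlow build.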

@dayangkunurfaizah please try with the following settings:

import tensorflow as tf
from tensorflow.compat.v1.keras.backend import set_session

config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True  # dynamically grow the memory used on the GPU
config.log_device_placement = True  # to log device placement (on which device the operation ran)
sess = tf.compat.v1.Session(config=config)
set_session(sess)

This error is for sure due to a version mismatch between CUDA and cuDNN. When I got this error, I followed the instructions at https://www.tensorflow.org/install/gpu and that solved the problem. Have you tried that?

import tensorflow as tf

# Just paste these lines to solve the problem
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            # Memory growth must be set before the GPUs have been initialized
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        print(e)