tensorflow: UnknownError: Failed to get convolution algorithm.
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
- TensorFlow installed from (source or binary): binary (no errors during installation)
- TensorFlow version: 2.3.1
- Python version: 3.7.5
- CUDA/cuDNN version: 10.1
- GPU model and memory: RTX 1660 Ti, 6.00 GB
- Full script output: output.txt (attached)
Describe the current behavior
The code below throws the following exception:
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]
Full script’s output is attached.
The code is fairly standard, so it is surprising to get this error. The script also contains code that trains the model via Keras's model.fit(), and that path works; the error occurs only when the GradientTape approach is used.
The same code (both the model.fit() version and the GradientTape version) also works fine on Google Colab: https://colab.research.google.com/drive/1fjbvlpEEEm3yvyKhcGyel3vLguEkU9GN?usp=sharing
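A workaround that is commonly suggested for "Failed to get convolution algorithm" errors is to enable GPU memory growth before the first op touches the GPU. Below is a minimal sketch of that workaround, under the assumption that the failure comes from cuDNN not getting enough free GPU memory at initialization (this is not confirmed to be the cause here):

import tensorflow as tf

# Let TensorFlow allocate GPU memory incrementally instead of reserving
# almost all of it up front; this must run before any op uses the GPU.
gpus = tf.config.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)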
Standalone code to reproduce the issue
import tensorflow as tf
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
x_train = x_train[..., tf.newaxis].astype("float32")
x_test = x_test[..., tf.newaxis].astype("float32")
nb_classes = 10
y_train = tf.keras.utils.to_categorical(y_train, nb_classes)
y_test = tf.keras.utils.to_categorical(y_test, nb_classes)
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(32)
# oversimplified model just for example
inputs = tf.keras.layers.Input(shape=(28, 28, 1))
x = tf.keras.layers.Conv2D(32, 3, padding='same', activation='relu')(inputs)
x = tf.keras.layers.MaxPool2D(2)(x)
x = tf.keras.layers.Flatten()(x)
outputs = tf.keras.layers.Dense(nb_classes, activation='softmax')(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
loss_func = tf.keras.losses.CategoricalCrossentropy()
optimizer = tf.keras.optimizers.Adam()
for i, (xx, yy) in enumerate(dataset):
    with tf.GradientTape() as tape:
        y_pred = model(xx)
        loss = loss_func(yy, y_pred)
    grad = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grad, model.trainable_variables))
    print('batch {} processed'.format(i))
# if this approach is used for training instead, everything works
# model.compile(loss=tf.keras.losses.CategoricalCrossentropy(), optimizer=tf.keras.optimizers.Adam())
# model.fit(x_train, y_train, batch_size=128, validation_data=(x_test, y_test))
# model.evaluate(x_test, y_test, batch_size=128, verbose=1)
Other info / logs
The full script output is attached.
About this issue
- State: closed
- Created 4 years ago
- Comments: 20 (6 by maintainers)
@imintz is looking into possibly fixing this for good.
OK, so my error was resolved once I used cuDNN v7.6.5 with CUDA 10.1.
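For anyone hitting the same problem, one quick sanity check (a diagnostic sketch, not part of the original fix) is to print which CUDA and cuDNN versions the installed TensorFlow binary was built against and compare them with what is installed on the machine:

import tensorflow as tf

# Dict describing the build configuration of this TensorFlow binary,
# including the expected CUDA and cuDNN versions
# (e.g. the 'cuda_version' and 'cudnn_version' keys).
print(tf.sysconfig.get_build_info())

# Confirm the GPU is actually visible to TensorFlow.
print(tf.config.list_physical_devices('GPU'))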