tensorflow: keras LSTM Fail to find the dnn implementation
System information
- CUDA/cuDNN version: 10.1
- GPU model and memory: GeForce RTX 2080
- TensorFlow version: 2.1.0
Uncommenting the LSTM layer yields the following error:
UnknownError: [_Derived_] Fail to find the dnn implementation.
[[{{node CudnnRNN}}]]
[[sequential_6/bidirectional_2/backward_lstm_3/StatefulPartitionedCall]]
[[Reshape_11/_38]] [Op:__inference_distributed_function_39046]
Working code (with the LSTM layer commented out):
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(encoder.vocab_size, 64),
    # tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),  # uncommenting this line triggers the error
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy',
              optimizer=tf.keras.optimizers.Adam(1e-4),
              metrics=['accuracy'])
history = model.fit(train_dataset, epochs=10,
                    validation_data=test_dataset,
                    validation_steps=30)
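The snippet above uses encoder, train_dataset and test_dataset without defining them; they are not given anywhere in the report. The code resembles the TensorFlow "text classification with an RNN" tutorial, so (as an assumption about the reporter's setup, not something stated in the issue) they would typically come from something like:

import tensorflow_datasets as tfds

# Assumed data pipeline (not part of the original report), mirroring the
# tutorial this snippet appears to be based on.
dataset, info = tfds.load('imdb_reviews/subwords8k', with_info=True,
                          as_supervised=True)
encoder = info.features['text'].encoder

BUFFER_SIZE = 10000
BATCH_SIZE = 64
train_dataset = (dataset['train']
                 .shuffle(BUFFER_SIZE)
                 .padded_batch(BATCH_SIZE, padded_shapes=([None], [])))
test_dataset = dataset['test'].padded_batch(BATCH_SIZE, padded_shapes=([None], []))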
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 50 (7 by maintainers)
@Lay4U @ARozental Please use the below code while importing tensorflow and let me know if the issue still persists. Thanks!
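The code block referenced in this comment was not preserved in the thread. The workaround usually suggested for this error on RTX cards is to enable GPU memory growth immediately after importing TensorFlow, before anything else touches the GPU; a sketch of that approach (the structure below is an assumption, not the original comment's code):

import tensorflow as tf

# Enable memory growth on every GPU before the first CUDA context is created,
# so cuDNN can still allocate its workspace when the LSTM kernel initializes.
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        # Memory growth must be set before the GPUs have been initialized.
        print(e)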
@gowthamkpr It doesn’t help
The above works for me.
Just a heads up: I had this error, but I also noticed another error in the output as well
Resolved by updating my conda env with
this resolved the issue in my case… Using:
Same issue here. I tried all the aforementioned solutions; none seems to resolve the issue.
I had the same problem, and it was solved by this. Thanks a lot!
I confirm that it does not help
@Saduf2019 I’m running TF 2.1.0. I don’t think the problem exists in TF1, which is used in the notebook. Also, making the following change makes the code work:
I would think that those 2 lines should do the same thing (please correct me if I’m wrong) but it seems only the second line works.
RuntimeError: Physical devices cannot be modified after being initialized
Just had the same issue here; managed to fix it with this solution.
My setup: Windows 10, CUDA 11.2, TensorFlow 2.3, NVIDIA driver 460.x, GeForce RTX 2060, Python 3.8
This worked for us when getting
Thanks @ElliotVilhelm
OK, I managed to make it work after fighting for a while with CUDA 10.1 and 10.2 (10.2 works nicely with the 2.3 nightly), environments, the OS and everything.
I narrowed it down to a seemingly harmless line.
I was running
tf.test.gpu_device_name()
to check that there was a GPU and print its name. That command, when run at any point, makes the model fail on training with the mentioned error: Unknown: Fail to find the dnn implementation
The
tf.config.experimental.set_visible_devices
command that @shaoeChen mentioned didn’t change anything for me, so I removed it. I managed to make it work more reliably by running this right after importing tensorflow (and other libs, but I don’t think that changes anything):
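The exact snippet the commenter ran here was not preserved. Judging from the surrounding comments it is most likely the same memory-growth loop suggested earlier in the thread; the key point this comment raises is ordering, sketched below as an assumption: the configuration has to run before anything (such as tf.test.gpu_device_name()) initializes the GPU.

import tensorflow as tf

# Assumed ordering sketch, not the commenter's original code.
# Configure memory growth first, before any call creates the GPU context.
for gpu in tf.config.experimental.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)

# Only after the devices are configured is it safe to query them; calling
# tf.test.gpu_device_name() before the loop above reproduces the failure
# described in this comment.
print(tf.test.gpu_device_name())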
Is this a known bug or some unintended behaviour?