tensorflow: Model Maker Object Detection Tutorial Bug
I ran the Model Maker Object Detection Tutorial in Colab (https://colab.research.google.com/drive/1DhxMGuQ9ep9mrfDBrFBx47zmOeEOn9_W#scrollTo=qhl8lqVamEty).
However, training failed with the error below at this line:
model = object_detector.create(train_data, model_spec=spec, batch_size=8, train_whole_model=True, validation_data=validation_data)
Epoch 1/50
---------------------------------------------------------------------------
UnknownError Traceback (most recent call last)
<ipython-input-5-187f39c1697e> in <module>()
----> 1 model = object_detector.create(train_data, model_spec=spec, batch_size=8, train_whole_model=True, validation_data=validation_data)
9 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
58 ctx.ensure_initialized()
59 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60 inputs, attrs, num_outputs)
61 except core._NotOkStatusException as e:
62 if name is not None:
UnknownError: 2 root error(s) found.
(0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node keras_layer/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/efficientnet-lite0/StatefulPartitionedCall/stem/conv2d/Conv2D}}]]
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node keras_layer/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/efficientnet-lite0/StatefulPartitionedCall/stem/conv2d/Conv2D}}]]
[[Func/cond/then/_3378/input/_6828/_56]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_96849]
Function call stack:
train_function -> train_function
Please solve this problem.
About this issue
- State: closed
- Created 3 years ago
- Reactions: 3
- Comments: 16 (5 by maintainers)
I had the same problem, and @noxhsxrk’s solution fixed it. Thanks!
I was able to solve it thanks to @noxhsxrk. Thank you!
Okay, I solved this issue, @stist1111, @tykwon97. I ran it on GPU, then trained, evaluated, exported, and tested the model in my Flutter project. It worked perfectly!! 😄 Just change TF 2.5.0 to TF 2.4.3.
This is my code: https://colab.research.google.com/drive/1SP85b6fRNWbraxOxdTkUZHSEk1nKlveU#scrollTo=qhl8lqVamEty
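For anyone applying the downgrade described above, here is a minimal sketch for checking which TensorFlow version the runtime actually has before training. The `WORKING_VERSION` constant and the helper names are hypothetical; only the 2.4.3 value comes from this thread. In Colab the pin itself would presumably be done with `!pip install tensorflow==2.4.3` and a runtime restart, which is an assumption about the setup and not something taken from the linked notebook.

```python
from importlib import metadata

# Version reported to work in this thread (hypothetical constant name).
WORKING_VERSION = "2.4.3"


def parse_version(text):
    """Turn '2.4.3' or '2.6.0rc1' into a tuple of leading integers, e.g. (2, 4, 3)."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def installed_tensorflow_version():
    """Return the installed tensorflow version string, or None if it is not installed."""
    try:
        return metadata.version("tensorflow")
    except metadata.PackageNotFoundError:
        return None


def matches_working_version(version_text, wanted=WORKING_VERSION):
    """True when the given version has the same major.minor.patch as the wanted one."""
    return parse_version(version_text) == parse_version(wanted)


if __name__ == "__main__":
    found = installed_tensorflow_version()
    if found is None:
        print("tensorflow is not installed in this runtime")
    elif matches_working_version(found):
        print(f"tensorflow {found} matches the version reported to work here")
    else:
        print(f"tensorflow {found} differs from {WORKING_VERSION}")
```

Running this in a cell right after the install step shows whether the pin took effect or whether the runtime is still on the preinstalled version.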
Thank you for the answer, @mohantym. With TF 2.6, training runs without any errors. But once training is complete, when I export the model with this code
and run
it causes an exception.
My code template: (https://colab.research.google.com/drive/16YkVjuacsjbgpCSd1IJhU4twX_fSRlGF?authuser=1)
Thank you for the answer, @mohantym. I use Colab. Is it possible to change the CUDA and cuDNN versions within Colab? I searched Google and couldn't find a way.
TF version: 2.5.0
Results using !nvidia-smi
Results using !nvcc --version
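Alongside the `nvidia-smi` and `nvcc` output, it can help to see which CUDA and cuDNN versions the installed TensorFlow build expects. The sketch below is one way to collect that. `describe_tf_gpu_setup` is a hypothetical helper name, and it assumes a TF 2.3+ build where `tf.sysconfig.get_build_info()` is available. It does not change any CUDA or cuDNN version; it only reports what the build was compiled against so a mismatch with the runtime can be spotted.

```python
def describe_tf_gpu_setup():
    """Collect TF version, CUDA/cuDNN build versions, and visible GPUs into a dict."""
    info = {
        "tf_version": None,
        "cuda_version": None,
        "cudnn_version": None,
        "gpus": [],
    }
    try:
        import tensorflow as tf
    except ImportError:
        # tensorflow is not available in this environment; return the defaults
        return info

    info["tf_version"] = tf.__version__
    # Build-time configuration; the CUDA keys may be absent on CPU-only builds.
    build = tf.sysconfig.get_build_info()
    info["cuda_version"] = build.get("cuda_version")
    info["cudnn_version"] = build.get("cudnn_version")
    # Names of the GPUs TensorFlow can see at runtime.
    info["gpus"] = [gpu.name for gpu in tf.config.list_physical_devices("GPU")]
    return info


if __name__ == "__main__":
    for key, value in describe_tf_gpu_setup().items():
        print(f"{key}: {value}")
```

If the CUDA version reported here differs from what `nvcc --version` shows, that would be consistent with the cuDNN initialization failure in the traceback, though the thread itself only confirms that switching the TF version resolved it.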