tensorflow: Model Maker Object Detection Tutorial Bug

I ran the Model Maker Object Detection Tutorial via Colab. (https://colab.research.google.com/drive/1DhxMGuQ9ep9mrfDBrFBx47zmOeEOn9_W#scrollTo=qhl8lqVamEty)

However, the following call fails during the first epoch:

model = object_detector.create(train_data, model_spec=spec, batch_size=8, train_whole_model=True, validation_data=validation_data)

Epoch 1/50
---------------------------------------------------------------------------
UnknownError                              Traceback (most recent call last)
<ipython-input-5-187f39c1697e> in <module>()
----> 1 model = object_detector.create(train_data, model_spec=spec, batch_size=8, train_whole_model=True, validation_data=validation_data)

9 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

UnknownError: 2 root error(s) found.
  (0) Unknown:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[{{node keras_layer/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/efficientnet-lite0/StatefulPartitionedCall/stem/conv2d/Conv2D}}]]
  (1) Unknown:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[{{node keras_layer/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/efficientnet-lite0/StatefulPartitionedCall/stem/conv2d/Conv2D}}]]
	 [[Func/cond/then/_3378/input/_6828/_56]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_96849]

Function call stack:
train_function -> train_function

Please solve this problem.
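
The error text points at cuDNN initialization, so one way to gather more detail is to print which CUDA and cuDNN versions the installed TensorFlow build expects and whether a GPU is visible. The following is a minimal sketch, assuming TensorFlow 2.3 or newer (where tf.sysconfig.get_build_info() is available). describe_gpu_setup is a hypothetical helper name used only for illustration; it is not part of TensorFlow or Model Maker.

```python
def describe_gpu_setup(tf):
    """Collect version details from a TensorFlow module passed in as `tf`.

    `describe_gpu_setup` is a hypothetical helper, not a TensorFlow API.
    """
    # Build info is a dict; CPU-only builds may lack the CUDA/cuDNN keys,
    # so .get() is used and returns None in that case.
    info = tf.sysconfig.get_build_info()
    return {
        "tf_version": tf.__version__,
        "built_for_cuda": info.get("cuda_version"),
        "built_for_cudnn": info.get("cudnn_version"),
        "visible_gpus": len(tf.config.list_physical_devices("GPU")),
    }


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("tensorflow is not installed in this environment")
    else:
        for key, value in describe_gpu_setup(tf).items():
            print(f"{key}: {value}")
```

If the reported CUDA or cuDNN versions differ from what the runtime provides, that mismatch is one plausible cause of this error, though the sketch alone does not prove it.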

About this issue

State: closed
Created 3 years ago
Reactions: 3
Comments: 16 (5 by maintainers)

Most upvoted comments

I had the same problem, and @noxhsxrk’s solution fixed it. Thanks!

EdjeElectronics on Oct 16, 2021

I was able to solve it thanks to @noxhsxrk. Thank you!

tykwon97 on Oct 13, 2021

Okay, I solved this issue, @stist1111, @tykwon97. I ran on GPU, then trained, evaluated, exported, and tested the model in my Flutter project. It worked perfectly!! 😄 Just change TF 2.5.0 to TF 2.4.3:

!pip install -q tensorflow==2.4.3

This is my code: https://colab.research.google.com/drive/1SP85b6fRNWbraxOxdTkUZHSEk1nKlveU#scrollTo=qhl8lqVamEty

noxhsxrk on Oct 8, 2021
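
After pinning a version with pip in a notebook, it can help to confirm that the pinned version is the one actually installed, since a runtime that already imported TensorFlow typically needs a restart before the change takes effect. The sketch below assumes Python 3.8 or newer for importlib.metadata (the Python 3.7 runtime shown in the traceback would need the importlib_metadata backport instead). installed_version and matches_pin are hypothetical helper names.

```python
from importlib import metadata

REQUIRED_TF = "2.4.3"  # the version reported to work in this thread


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed, required=REQUIRED_TF):
    """True only when the installed version equals the pinned one exactly."""
    if installed is None:
        return False
    # Ignore a local build suffix such as "+cpu" when comparing.
    return installed.split("+")[0] == required


found = installed_version("tensorflow")
if matches_pin(found):
    print(f"tensorflow {found} matches the pin")
else:
    print(f"tensorflow is {found!r}, expected {REQUIRED_TF}")
```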

Thank you for the answer, @mohantym. With TF 2.6, training completes without any errors. But when I export the trained model with this code

model.export(export_dir='.', export_format=[ExportFormat.TFLITE, ExportFormat.LABEL])

and run

model.evaluate_tflite('model.tflite', test_data)

it raises an exception:

/usr/local/lib/python3.7/dist-packages/six.py in raise_from(value, from_value)

InvalidArgumentError: required broadcastable shapes [Op:Mul]

My code template: https://colab.research.google.com/drive/16YkVjuacsjbgpCSd1IJhU4twX_fSRlGF?authuser=1

noxhsxrk on Oct 8, 2021

Thank you for the answer, @mohantym. I use Colab. Is it possible to change the CUDA and cuDNN versions within Colab? I searched Google and couldn’t find a way.

TF version: 2.5.0

Results using !nvidia-smi

Thu Oct  7 10:27:23 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.74       Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   34C    P8    28W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Results using !nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0

tykwon97 on Oct 7, 2021
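
The two outputs above report different things: the CUDA Version field in the nvidia-smi header is the highest CUDA version the driver supports, while nvcc reports the installed toolkit release. A minimal sketch for pulling both numbers out of the pasted text, using only the standard library; the function names are hypothetical and the sample strings are copied from the output in this thread:

```python
import re

# Header line and nvcc output copied from the comment above.
NVIDIA_SMI_HEADER = (
    "| NVIDIA-SMI 470.74       Driver Version: 460.32.03    CUDA Version: 11.2     |"
)
NVCC_OUTPUT = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0
"""


def driver_cuda_version(smi_text):
    """Extract the 'CUDA Version' field from nvidia-smi output, or None."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_text)
    return match.group(1) if match else None


def toolkit_cuda_version(nvcc_text):
    """Extract the toolkit release from `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_text)
    return match.group(1) if match else None


print("driver supports CUDA up to:", driver_cuda_version(NVIDIA_SMI_HEADER))
print("installed toolkit release:", toolkit_cuda_version(NVCC_OUTPUT))
```

For the output pasted here, this yields 11.2 for the driver and 11.1 for the toolkit. Neither number says which cuDNN version is installed, which is the component the error message names.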