onnx-tensorrt: Unsupported ONNX data type: UINT8 (2)
Following the tutorial from the notebook https://github.com/onnx/tensorflow-onnx/blob/master/tutorials/ConvertingSSDMobilenetToONNX.ipynb, I am trying to convert MobileNetV2 and V3 frozen models from TensorFlow (frozen_inference_graph.pb or saved_model.pb) to ONNX and then to TensorRT files. Under the NGC dockers 20.01-tf1-py3 and 19.05-py3 I am using both this project and tensorflow-onnx. I always get different issues; the furthest I got was under 20.01-tf1-py3 with both onnx-tensorrt and tensorflow-onnx on their master branches, installing the projects from source. I was able to create the .onnx file, but when I try to create the .trt file I get the following.
onnx2trt /media/bnascimento/project/frozen_inference_graph.onnx -o /media/bnascimento/project/frozen_inference_graph.trt
----------------------------------------------------------------
Input filename: /media/bnascimento/project/frozen_inference_graph.onnx
ONNX IR version: 0.0.6
Opset version: 10
Producer name: tf2onnx
Producer version: 1.6.0
Domain:
Model version: 0
Doc string:
----------------------------------------------------------------
Parsing model
Unsupported ONNX data type: UINT8 (2)
ERROR: image_tensor:0:190 In function importInput:
[8] Assertion failed: convertDtype(onnxDtype.elem_type(), &trtDtype)
I suspect this has to do with the input tensor for the image, but I don't know how to avoid this issue. Has anyone seen similar issues before?
Cheers Bruno
About this issue
- State: open
- Created 4 years ago
- Reactions: 15
- Comments: 83
I have the same error with my code. I found a tool that can solve the problem here; I found the approach here.
Here are the steps I followed, but they ended up failing anyway.
Step 1: fix UINT8 error
Here is a script that generates a new frozen inference graph with float inputs from one with int inputs:
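The script below is a minimal sketch of that idea, assuming the common TF 1.x graph-surgery approach (a new float32 placeholder cast back to uint8 and spliced in via input_map); the placeholder name, input shape, and the exact code the original commenter used are assumptions.
```python
# fix_uint8.py -- hypothetical sketch (TF 1.x): replace the uint8 image_tensor
# input of a frozen graph with a float32 placeholder that is cast back to uint8.
import sys
import tensorflow as tf

def main(in_path, out_path):
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(in_path, "rb") as f:
        graph_def.ParseFromString(f.read())

    with tf.Graph().as_default() as graph:
        # New float32 input that ONNX/TensorRT will see (name and shape are assumptions).
        float_input = tf.placeholder(tf.float32, shape=[1, None, None, 3],
                                     name="image_tensor_float")
        # Cast back to uint8 so the rest of the imported graph keeps the dtype it expects.
        casted = tf.cast(float_input, tf.uint8)
        tf.import_graph_def(graph_def, input_map={"image_tensor:0": casted}, name="")

    with tf.gfile.GFile(out_path, "wb") as f:
        f.write(graph.as_graph_def().SerializeToString())

if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])
```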
Suppose it’s called fix_uint8.py. Its usage is:
python fix_uint8.py frozen_inference_graph.pb fixed_inference_graph.pb
Step 2: generate ONNX file from fixed .pb file
Let’s say I fixed a file and called it mobilenet_v2_0.35_128.pb. I then call tf2onnx on this file (a sketch of the command follows Step 3).
Step 3: generate TensorRT “engine” from ONNX file
Lastly, I call onnx2trt (also sketched below).
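The exact commands were not preserved above; the invocations below are hedged sketches. The input/output tensor names and the opset are assumptions based on the TF Object Detection API, not taken from the original comment.
```
# Step 2 (assumed form): fixed frozen graph -> ONNX with tf2onnx
# (the input name depends on how the fix script named the new placeholder)
python -m tf2onnx.convert \
    --graphdef mobilenet_v2_0.35_128.pb \
    --inputs image_tensor:0 \
    --outputs detection_boxes:0,detection_scores:0,detection_classes:0,num_detections:0 \
    --opset 10 \
    --output mobilenet_v2_0.35_128.onnx

# Step 3 (assumed form): ONNX -> TensorRT engine, mirroring the onnx2trt call in the original report
onnx2trt mobilenet_v2_0.35_128.onnx -o mobilenet_v2_0.35_128.trt
```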
I’ve trained my network using TF 1.15, but I get this error even when I execute these steps with either TF 2.3 or 1.15.
@cognitiveRobot I ditched TensorRT and the Jetson and did inference on an Intel NUC, directly on the CPU.
Late to this thread, but it looks like there are a few issues:
Hey guys, I had this same problem too, and maybe this script can help, as it helped me. Here we can change the data type of the input tensor. Resource: https://forums.developer.nvidia.com/t/unsupported-onnx-data-type-uint8-2/75044/16?u=karanprojectx
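For reference, a minimal sketch of that kind of edit using the onnx Python package (the script behind the link may differ, and simply flipping the declared input dtype assumes the rest of the graph tolerates float input; file names here are illustrative):
```python
import onnx
from onnx import TensorProto

# Hypothetical sketch: mark the graph input as FLOAT instead of UINT8 so the
# TensorRT ONNX parser accepts it.
model = onnx.load("frozen_inference_graph.onnx")
for graph_input in model.graph.input:
    if graph_input.type.tensor_type.elem_type == TensorProto.UINT8:
        graph_input.type.tensor_type.elem_type = TensorProto.FLOAT

onnx.save(model, "frozen_inference_graph_float.onnx")
```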
First, I have to say that I haven't had such a janky experience with software in years. Working with this ONNX and TensorRT ecosystem is a complete nightmare.
Second, I was able to solve the UINT8 problem by using the code from this NVIDIA Developers forum post: https://forums.developer.nvidia.com/t/problem-converting-onnx-model-to-tensorrt-engine-for-ssd-mobilenet-v2/139337/16
This fixes the original frozen_inference_graph.pb file, which then needs to be converted to ONNX and then to TensorRT.
Well, I have run more tests… it looks like you're right about the volume of data sent to the GPU being the same.
I'm still not sure where the actual conversion took place, but calling inputs[0].host[:allocate_place] = input.flatten(order="C") with uint8 dtype seems to either write data into part of the preallocated buffer or indeed perform a type cast under the hood; the amount of data sent to the GPU is the same in both cases, with a small increase in the uint8 case due to the increased CPU performance you pointed out before.
I have measured PCIe<=>GPU Tx/Rx speeds with nvidia-smi dmon -s t, and there is an actual 4x difference when I'm using onnxruntime-gpu with a uint8-input model vs. TensorRT with or without the proposed fix.
Though now more questions arise in my usage scenario: the actual performance gain per GPU in my case is about 5-10%, which might be explained by your previous tests. But that shouldn't lead to an overall performance increase of up to 2x in multi-GPU scenarios, especially since with this fix CPU usage actually grew along with the performance.
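As a quick sanity check of the 4x figure, a standalone sketch (the 300x300x3 SSD-style input shape is an assumption):
```python
import numpy as np

# A float32 copy of the same frame is exactly 4x larger than the uint8 one,
# which lines up with the ~4x PCIe Tx/Rx difference reported above.
frame_uint8 = np.zeros((1, 300, 300, 3), dtype=np.uint8)
frame_float32 = frame_uint8.astype(np.float32)
print(frame_uint8.nbytes, frame_float32.nbytes)  # 270000 vs 1080000 bytes
```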
+1 for natively supporting UINT8. It’s really bizarre that the format used in almost all image source data is not supported.
@cognitiveRobot oh boy oh boy, do I have answers for you. I trained all MobileNetV2 and V3 models from this page with a width multiplier of 1 or less to detect a single class (soccer balls). I then collected the mean inference time for a single frame of a 30-second video, both on a Tesla V100 GPU and on an Intel i5-4210U CPU. You can see the results below.
The i5 is between 1.3 and 1.5 times slower than the V100, but be aware that this depends a lot on the implementation. The TF Object Detection API is pretty fast for inference on CPUs. On the other hand, the official YOLOv4 has an inference time of 50 ms on the V100 and a whopping 5 seconds on our feeble CPU.
As for the inference time when processing images of different sizes:
Just bear in mind that the MobileNets already scale down images before processing them, so it may be a good idea for you to configure your camera/input feed to have low resolutions too. It should matter little for the network.
I met the same problem. This is my solution: when the forward method of RoBERTa is called and position_ids is None, HuggingFace calls the create_position_ids_from_input_ids method in the RoBERTa modeling code to generate position_ids. Inside this function, they use torch.cumsum. To work around TensorRT not supporting conversion of the CumSum operator, you need to generate position_ids yourself and pass them in.
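A minimal sketch of that idea: compute position_ids eagerly, outside the traced graph, using the same formula RoBERTa uses, and feed them in as an explicit input (the example ids and the export call are illustrative, not the exact code used):
```python
import torch

def create_position_ids(input_ids, padding_idx=1):
    # Same formula as HuggingFace's create_position_ids_from_input_ids, but run
    # eagerly here so torch.cumsum never becomes a CumSum node in the ONNX graph.
    mask = input_ids.ne(padding_idx).int()
    incremental_indices = torch.cumsum(mask, dim=1).type_as(mask) * mask
    return incremental_indices.long() + padding_idx

input_ids = torch.tensor([[0, 31414, 232, 2, 1, 1]])  # toy RoBERTa ids with padding
position_ids = create_position_ids(input_ids)

# Then pass position_ids explicitly when exporting, e.g.:
# torch.onnx.export(model, (input_ids, attention_mask, position_ids), "roberta.onnx", ...)
```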
It would be nice if NVIDIA made this easier. Many people are using the TF Object Detection API, and I'm trying to run it on a Jetson.
cc @deadeyegoodwin