TensorRT: Tensorflow -> ONNX -> TRT not working for Object Detection

Hello everyone,

As suggested by the TRT documentation and the Nvidia forums, I am trying to move away from UFF and adopt ONNX instead.

Our workflow is to train on cloud GPUs and then run inference on Jetson Nano (JP44) with Triton.

Unfortunately, the recommended TensorFlow -> ONNX -> TensorRT path does not work for any of the standard TF Object Detection models, nor for the ones we train ourselves.

Reproduction:

  1. Download SSD Mobilenet V2 COCO from the TF Model Zoo. [OK]
  2. Convert the model from SavedModel to ONNX. [OK]
python3 -m tf2onnx.convert --saved-model /workspace/ssd_mobilenet_v2_coco_2018_03_29/saved_model --output /workspace/ssd.onnx --opset 11 --fold_const
  3. Convert the ONNX model to a TRT engine. [FAIL]

I get the following errors:

root@0ad2809d6ee2:/workspace# /usr/src/tensorrt/bin/trtexec --onnx=ssd.onnx 
&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=ssd.onnx
[06/05/2020-08:55:54] [I] === Model Options ===
[06/05/2020-08:55:54] [I] Format: ONNX
[06/05/2020-08:55:54] [I] Model: ssd.onnx
[06/05/2020-08:55:54] [I] Output:
[06/05/2020-08:55:54] [I] === Build Options ===
[06/05/2020-08:55:54] [I] Max batch: 1
[06/05/2020-08:55:54] [I] Workspace: 16 MB
[06/05/2020-08:55:54] [I] minTiming: 1
[06/05/2020-08:55:54] [I] avgTiming: 8
[06/05/2020-08:55:54] [I] Precision: FP32
[06/05/2020-08:55:54] [I] Calibration: 
[06/05/2020-08:55:54] [I] Safe mode: Disabled
[06/05/2020-08:55:54] [I] Save engine: 
[06/05/2020-08:55:54] [I] Load engine: 
[06/05/2020-08:55:54] [I] Builder Cache: Enabled
[06/05/2020-08:55:54] [I] NVTX verbosity: 0
[06/05/2020-08:55:54] [I] Inputs format: fp32:CHW
[06/05/2020-08:55:54] [I] Outputs format: fp32:CHW
[06/05/2020-08:55:54] [I] Input build shapes: model
[06/05/2020-08:55:54] [I] Input calibration shapes: model
[06/05/2020-08:55:54] [I] === System Options ===
[06/05/2020-08:55:54] [I] Device: 0
[06/05/2020-08:55:54] [I] DLACore: 
[06/05/2020-08:55:54] [I] Plugins:
[06/05/2020-08:55:54] [I] === Inference Options ===
[06/05/2020-08:55:54] [I] Batch: 1
[06/05/2020-08:55:54] [I] Input inference shapes: model
[06/05/2020-08:55:54] [I] Iterations: 10
[06/05/2020-08:55:54] [I] Duration: 3s (+ 200ms warm up)
[06/05/2020-08:55:54] [I] Sleep time: 0ms
[06/05/2020-08:55:54] [I] Streams: 1
[06/05/2020-08:55:54] [I] ExposeDMA: Disabled
[06/05/2020-08:55:54] [I] Spin-wait: Disabled
[06/05/2020-08:55:54] [I] Multithreading: Disabled
[06/05/2020-08:55:54] [I] CUDA Graph: Disabled
[06/05/2020-08:55:54] [I] Skip inference: Disabled
[06/05/2020-08:55:54] [I] Inputs:
[06/05/2020-08:55:54] [I] === Reporting Options ===
[06/05/2020-08:55:54] [I] Verbose: Disabled
[06/05/2020-08:55:54] [I] Averages: 10 inferences
[06/05/2020-08:55:54] [I] Percentile: 99
[06/05/2020-08:55:54] [I] Dump output: Disabled
[06/05/2020-08:55:54] [I] Profile: Disabled
[06/05/2020-08:55:54] [I] Export timing to JSON file: 
[06/05/2020-08:55:54] [I] Export output to JSON file: 
[06/05/2020-08:55:54] [I] Export profile to JSON file: 
[06/05/2020-08:55:54] [I] 
----------------------------------------------------------------
Input filename:   ssd.onnx
ONNX IR version:  0.0.6
Opset version:    11
Producer name:    tf2onnx
Producer version: 1.6.0
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
Unsupported ONNX data type: UINT8 (2)
ERROR: image_tensor:0:191 In function importInput:
[8] Assertion failed: convertDtype(onnxDtype.elem_type(), &trtDtype)
[06/05/2020-08:55:56] [E] Failed to parse onnx file
[06/05/2020-08:55:56] [E] Parsing model failed
[06/05/2020-08:55:56] [E] Engine creation failed
[06/05/2020-08:55:56] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=ssd.onnx
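
The parser is rejecting the graph input itself: the TF Object Detection SavedModels expose image_tensor as UINT8. A quick way to confirm the input dtype of the exported file, assuming the onnx Python package is installed:

import onnx

# Print the name and element type of every graph input in the exported model.
model = onnx.load("ssd.onnx")
for inp in model.graph.input:
    print(inp.name, onnx.TensorProto.DataType.Name(inp.type.tensor_type.elem_type))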

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 28 (1 by maintainers)

Most upvoted comments

Any update on this issue? In case someone still gets the error “Assertion failed: convertDtype(onnxDtype.elem_type(), &trtDtype)” or other errors related to UINT8/INT64, it can be solved by rewriting the offending dtypes with onnx_graphsurgeon. For example:

import onnx_graphsurgeon as gs
import onnx
import numpy as np

# Load the ONNX model into a graphsurgeon graph, cast every graph input to
# float32 (TensorRT's ONNX parser does not accept UINT8 inputs), then export
# the modified graph back to an ONNX file.
graph = gs.import_onnx(onnx.load('ssd_mobilenet_v2_320x320_coco17_tpu-8.onnx'))
for inp in graph.inputs:
    inp.dtype = np.float32
onnx.save_model(gs.export_onnx(graph), 'outputmodel.onnx')
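
Note that after this cast the parser sees an FP32 input, so the resulting engine (if the rest of the graph converts) will expect float input buffers; any preprocessing that currently feeds raw uint8 images has to convert them to float first.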

However, after fixing these unsupported dtypes, I still face a problem with the NonMaxSuppression operator. There is a trick of rewriting it to use the BatchedNMS_TRT plugin, something like below:

# Swap every ONNX NonMaxSuppression node for the BatchedNMS_TRT plugin op.
for node in graph.nodes:
    if node.op == 'NonMaxSuppression':
        node.op = "BatchedNMS_TRT"
        # The last three ONNX NMS inputs are moved into plugin attributes.
        max_output_boxes_per_class, iou_threshold, score_threshold = node.inputs[2:]
        attrs = {
            "shareLocation": True,
            "numClasses": 10,
            "backgroundLabelId": -1,
            "topK": 1024,
            "keepTopK": max_output_boxes_per_class,
            "scoreThreshold": score_threshold,
            "iouThreshold": iou_threshold,
            "isNormalized": True,
            "clipBoxes": True
        }
        node.attrs = attrs
        node.inputs[0].dtype = np.int32
        node.outputs[0].dtype = np.int32
        # Keep only the (boxes, scores) inputs; the rest were folded into attrs.
        node.inputs = node.inputs[0:2]
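
As written, the rewritten graph also needs a cleanup pass before export; a minimal sketch, reusing the graph object from above (the output filename is only illustrative):

# Drop the now-dangling tensors left over from the rewrite and re-export.
graph.cleanup().toposort()
onnx.save_model(gs.export_onnx(graph), 'ssd_batched_nms.onnx')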

Nevertheless, this still does not work. I guess it has something to do with the difference between the outputs of BatchedNMS_TRT and the original NonMaxSuppression.
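
For reference, BatchedNMS_TRT produces four outputs (num_detections, nmsed_boxes, nmsed_scores, nmsed_classes), whereas ONNX NonMaxSuppression produces a single selected_indices tensor, so the node's outputs would have to be re-declared to match the plugin. A rough sketch of what that could look like inside the loop above, assuming batch size 1 and the keepTopK value used there (tensor names are only illustrative):

# Re-declare the node outputs to match BatchedNMS_TRT's four-output signature.
batch_size, keep_top_k = 1, 1024
num_detections = gs.Variable("num_detections", dtype=np.int32, shape=[batch_size, 1])
nmsed_boxes = gs.Variable("nmsed_boxes", dtype=np.float32, shape=[batch_size, keep_top_k, 4])
nmsed_scores = gs.Variable("nmsed_scores", dtype=np.float32, shape=[batch_size, keep_top_k])
nmsed_classes = gs.Variable("nmsed_classes", dtype=np.float32, shape=[batch_size, keep_top_k])
node.outputs = [num_detections, nmsed_boxes, nmsed_scores, nmsed_classes]
graph.outputs = [num_detections, nmsed_boxes, nmsed_scores, nmsed_classes]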