TensorRT fails to build engine from pytorch_quantization ONNX
Description
I created a quantized model in PyTorch using the pytorch_quantization toolkit and exported it to ONNX (roughly the flow sketched below).
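For context, a minimal sketch of that quantize-and-export flow, assuming a torchvision model as a stand-in; the actual model, input shape, and calibration step from the issue are not shown and are placeholders here:

```python
# Minimal sketch of the quantize-and-export flow; the model, input shape,
# and calibration step are assumptions, not the issue's actual script.
import torch
from pytorch_quantization import nn as quant_nn
from pytorch_quantization import quant_modules

quant_modules.initialize()  # swap nn layers for quantized counterparts

from torchvision.models import resnet18
model = resnet18().eval()   # placeholder for the real model

# ... collect calibration statistics and load amax ranges here (omitted) ...

# Emit QuantizeLinear/DequantizeLinear (Q/DQ) nodes instead of the
# toolkit's fake-quant ops, so TensorRT can parse the graph
quant_nn.TensorQuantizer.use_fb_fake_quant = True

dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, "model_quantized.onnx",
                  opset_version=13)  # opset 13, matching the trtexec log
```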
Then I executed the following command on a Jetson Orin:
/usr/src/tensorrt/bin/trtexec --onnx=model_quantized.onnx --int8 --saveEngine=model_quantized.trt
Here is the part of the trtexec output that includes the error:
[12/31/2023-11:17:12] [I] Start parsing network model
[12/31/2023-11:17:12] [I] [TRT] ----------------------------------------------------------------
[12/31/2023-11:17:12] [I] [TRT] Input filename: model_quantized.onnx
[12/31/2023-11:17:12] [I] [TRT] ONNX IR version: 0.0.7
[12/31/2023-11:17:12] [I] [TRT] Opset version: 13
[12/31/2023-11:17:12] [I] [TRT] Producer name: pytorch
[12/31/2023-11:17:12] [I] [TRT] Producer version: 1.12.1
[12/31/2023-11:17:12] [I] [TRT] Domain:
[12/31/2023-11:17:12] [I] [TRT] Model version: 0
[12/31/2023-11:17:12] [I] [TRT] Doc string:
[12/31/2023-11:17:12] [I] [TRT] ----------------------------------------------------------------
[12/31/2023-11:17:12] [W] [TRT] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[12/31/2023-11:17:13] [I] Finish parsing network model
[12/31/2023-11:17:13] [I] FP32 and INT8 precisions have been specified - more performance might be enabled by additionally specifying --fp16 or --best
[12/31/2023-11:17:13] [W] [TRT] Calibrator won't be used in explicit precision mode. Use quantization aware training to generate network with Quantize/Dequantize nodes.
[12/31/2023-11:17:13] [E] Error[2]: [qdqGraphOptimizer.cpp::matchInt8ConstantDQ::3582] Error Code 2: Internal Error (onnx::QuantizeLinear_898: Int8 constant is only allowed before DQ node)
[12/31/2023-11:17:13] [E] Error[2]: [builder.cpp::buildSerializedNetwork::751] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
[12/31/2023-11:17:13] [E] Engine could not be created from network
[12/31/2023-11:17:13] [E] Building engine failed
[12/31/2023-11:17:13] [E] Failed to create engine from model or file.
[12/31/2023-11:17:13] [E] Engine set up failed
The error refers to the node QuantizeLinear_898, and the message is "Int8 constant is only allowed before DQ node": in TensorRT's explicit-quantization mode, an INT8 constant may only feed a DequantizeLinear (DQ) node. Looking at the ONNX graph, I can see that the QuantizeLinear_898 node has no input (graph screenshot omitted), which suggests its data input was lost or folded away during export. A small script to confirm this from the ONNX file is sketched below.
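One way to verify this directly from the ONNX file (an inspection sketch; the tensor name is copied from the trtexec error, and whether it appears as the node's input or output is an assumption to be checked):

```python
# Sketch: locate the node behind the error and list its inputs/outputs,
# to check whether the QuantizeLinear lost its data input.
import onnx

model = onnx.load("model_quantized.onnx")
graph = model.graph
initializers = {init.name for init in graph.initializer}

TENSOR = "onnx::QuantizeLinear_898"  # name taken from the trtexec error

for node in graph.node:
    if TENSOR in list(node.input) + list(node.output):
        print("op:", node.op_type, "name:", node.name or "<unnamed>")
        print(" inputs:", [f"{n} ({'initializer' if n in initializers else 'tensor'})"
                           for n in node.input])
        print(" outputs:", list(node.output))
```

A QuantizeLinear node should have a data input plus scale (and optionally zero-point); if only the scale/zero-point remain, the data input was indeed dropped.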
Any idea what went wrong and how to solve it?
Environment
Model compilation:
TensorRT Version: 8.5.0.2 (reported as v8502; Jetson Orin)
Model quantization and export to ONNX:
OS: Windows 10
Python Version (if applicable): 3.9.12
PyTorch Version (if applicable): 1.12.1+cu116
pytorch_quantization version: 2.1.3
About this issue
- State: closed
- Created 6 months ago
- Comments: 18
Please wait for the TRT 10 release; I guess the EA will come out in March/April.