TensorRT: PyTorch -> ONNX -> TensorRT incorrect TensorRT results
I have a trained PyTorch model with a UNet-type architecture. I can convert it to ONNX successfully, and the ONNX inference results, although slightly different from PyTorch's, are very similar. However, when I convert from ONNX to TensorRT, the results diverge much further and produce an incorrect segmentation.
PyTorch and ONNX were both compiled from source to ensure they use the same CUDA and cuDNN libraries that TensorRT is packaged against. The model was, however, trained a few months ago on another machine with a pip-installed version of PyTorch.
Environment
- TensorRT Version: TensorRT-7.2.0.14
- GPU Type: 2070 Max-Q
- Nvidia Driver Version: 450.51.06
- CUDA Version: 11.0 (update 1)
- CUDNN Version: 8.0.2
- Operating System + Version: Ubuntu 18.04
- Python Version (if applicable): 3.7.7
- TensorFlow Version (if applicable):
- PyTorch Version (if applicable): 1.7.0
- Baremetal or Container (which commit + image + tag):
Steps To Reproduce
I can upload a notebook in a bit, but essentially, the following model is loaded:
Inference results with PyTorch (correct results):
The model is converted to ONNX with:
torch.onnx.export(model,
                  X.cuda(),
                  'model.onnx',
                  export_params=True,
                  opset_version=11,  # Needed for the upsample operation
                  verbose=False)
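As a quick sanity check, the exported file can be validated structurally with the onnx package (a minimal sketch, not part of the original conversion step):

import onnx

# Load the exported model and run ONNX's built-in structural validation.
onnx_model = onnx.load('model.onnx')
onnx.checker.check_model(onnx_model)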
Inference results:
Note that the top-left pixel value is slightly different from PyTorch's:
- pytorch = 5.532728672027588
- onnx = 5.532743453979492

But overall the segmentation mask is about the same.
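For reference, a minimal sketch of how such an ONNX-vs-PyTorch comparison can be done with onnxruntime (X and y_torch are assumptions: the input tensor from the export snippet above and the PyTorch output as a numpy array):

import numpy as np
import onnxruntime as ort

# Run the exported model through ONNX Runtime on the same input.
sess = ort.InferenceSession('model.onnx')
input_name = sess.get_inputs()[0].name
y_onnx = sess.run(None, {input_name: X.cpu().numpy()})[0]

# Compare against the PyTorch output element-wise.
print(np.abs(y_onnx - y_torch).max())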
Now, for TensorRT, the model is converted, serialized, then deserialized:
import numpy as np
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit

TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)
model_path = 'model.onnx'
EXPLICIT_BATCH = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

# Parse the ONNX model into a TensorRT network.
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(EXPLICIT_BATCH)
parser = trt.OnnxParser(network, TRT_LOGGER)
if not parser.parse_from_file(model_path):
    for i in range(parser.num_errors):
        print(parser.get_error(i))

# Build and serialize the engine.
config = builder.create_builder_config()
config.max_workspace_size = 1 << 20
engine = builder.build_engine(network, config)
engine_path = 'model.engine'
with open(engine_path, 'wb') as f:
    f.write(bytearray(engine.serialize()))

# Deserialize the engine and allocate host/device buffers.
runtime = trt.Runtime(TRT_LOGGER)
with open(engine_path, 'rb') as f:
    engine = runtime.deserialize_cuda_engine(f.read())
h_input = cuda.pagelocked_empty(trt.volume(engine.get_binding_shape(0)), dtype=np.float32)
h_output = cuda.pagelocked_empty(trt.volume(engine.get_binding_shape(1)), dtype=np.float32)
d_input = cuda.mem_alloc(h_input.nbytes)
d_output = cuda.mem_alloc(h_output.nbytes)
stream = cuda.Stream()
context = engine.create_execution_context()

# Run inference.
h_input[:] = np.fromfile('X.raw', dtype=np.float32)  # Copy preprocessed data to pagelocked memory
cuda.memcpy_htod_async(d_input, h_input, stream)  # Transfer input data to the GPU
context.execute_async(bindings=[int(d_input), int(d_output)], stream_handle=stream.handle)  # Run inference
cuda.memcpy_dtoh_async(h_output, d_output, stream)  # Transfer predictions back from the GPU
stream.synchronize()  # Wait for all transfers to finish
h_output = h_output.reshape(engine.get_binding_shape(1))
I checked the terminal output and didn't see any error messages. The inference results I got were:
The inference mask is now wrong (an added green blob), and the top-left value in the mask output is:
- pytorch = 5.532728672027588
- onnx = 5.532743453979492
- tensorrt = 5.454630374908447

So that value is now noticeably different, which is concerning because everything should still be running in FP32.
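One thing worth confirming is that no reduced-precision tactics were enabled at build time. A minimal sketch, assuming the config object from the build script above (both flags are off by default):

import tensorrt as trt

# FP16 / INT8 tactics are opt-in in TensorRT; both should be False for a pure FP32 build.
print(config.get_flag(trt.BuilderFlag.FP16))
print(config.get_flag(trt.BuilderFlag.INT8))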
Hello @Ekta246,
We keep improving the ONNX parser in TRT, but unfortunately I cannot tell whether 7.2 will fix the zeros you see, because the issue has not been root-caused yet.
Polygraphy is a tool for comparing TRT results against the results of other frameworks. After installing it you also get a command-line tool. For example, to compare the output between onnxruntime and TRT, you can run:
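(The exact command was not preserved in the post; this is a representative Polygraphy invocation, assuming the model file from above:)

polygraphy run model.onnx --trt --onnxrt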
To mark all nodes in the ONNX model as outputs and compare them between onnxruntime and TRT, you can run:
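(Again a representative command rather than the original one:)

polygraphy run model.onnx --trt --onnxrt --trt-outputs mark all --onnx-outputs mark all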
And if you find that marking all layers as outputs is too slow to run, you can manually edit your ONNX file to expose only the tensors you care about; you can use onnx-graphsurgeon, or any other tool you prefer. For example:
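(A minimal onnx-graphsurgeon sketch; 'intermediate_tensor' is a placeholder name, not from the original post:)

import numpy as np
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load('model.onnx'))
tensors = graph.tensors()

# 'intermediate_tensor' is a placeholder; pick the tensor you want to inspect.
# to_variable() ensures the tensor has a dtype so it can serve as a graph output.
graph.outputs.append(tensors['intermediate_tensor'].to_variable(dtype=np.float32))

onnx.save(gs.export_onnx(graph), 'model_marked.onnx')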