TensorRT: [castLayer.cpp::validate::33] Error Code 2: Internal Error (Assertion !mOutputTypes.at(0).hasValue() || mOutputTypes.at(0).value() == params.toType failed. )

Description

I am trying to convert a model based on the DALL·E 2 CLIP text encoder: I first export to ONNX and then convert to TensorRT. I am using only a single batch, and I also ran polygraphy surgeon sanitize on the ONNX model. Whether I sanitize it or not, I get the error in the title. The full log is attached. I have no idea how to debug this; I could not even locate castLayer.cpp. Any pointers would be appreciated, thanks!

Environment

TensorRT Version: 8.6.1

NVIDIA GPU: A10G, CUDA 12.1

NVIDIA Driver Version: 525

CUDA Version: 12.1

CUDNN Version: 8.8.0

Operating System: Ubuntu 20.04

Python Version (if applicable): 3.10

Tensorflow Version (if applicable): NA

PyTorch Version (if applicable): 2.3.0.dev20240117 (nightly)

Baremetal or Container (if so, version): nvcr.io/nvidia/tensorrt:23.12-py3

Relevant Files

Model link: error_log.txt (full log attached). The model is 810 MB, with the same architecture as the DALL·E 2 CLIP text encoder; the issue tracker does not allow me to upload it.

Steps To Reproduce

Commands or scripts:

To ONNX:
  torch.onnx.export(encode_text, (tokenized_text), "text_encoder.onnx", export_params=True, input_names=['text'], output_names=['text_feature'])

(Optional) sanitize:
  polygraphy surgeon sanitize text_encoder.onnx --fold-constants -o folded.onnx

To TRT:
  trtexec --onnx=folded.onnx --fp16 --saveEngine=model.trt --precisionConstraints=prefer --layerPrecisions=*:fp16,*:fp32 --layerOutputTypes=*:fp16,*:fp32 --verbose
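As a possible debugging step (not part of the original commands), the ONNX graph can be listed with Polygraphy to see which Cast nodes exist and what output types they carry, which is where the castLayer.cpp assertion points; folded.onnx is just the sanitized model from the step above, and older Polygraphy versions may use --mode=basic instead of --show:

  polygraphy inspect model folded.onnx --show layers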

Have you tried the latest release?: Yes, 8.6.1 is the latest.

About this issue

  • Original URL
  • State: open
  • Created 5 months ago
  • Comments: 20

Most upvoted comments

Another thing I find very strange: when I run trtexec --onnx=model.onnx --fp16, the resulting TRT engine's inference results are very wrong; if I drop --fp16, the results align with the original PyTorch model and the ONNX model. Both the original PyTorch model and the ONNX model are already in fp16. Is this expected?
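One way to narrow down fp16 accuracy drift like this (a sketch, not something from this thread) is to compare the TensorRT engine against ONNX Runtime with Polygraphy; the tolerances below are placeholders, and marking all outputs lets you see which layer the results first diverge at:

  polygraphy run model.onnx --trt --onnxrt --fp16 --atol 1e-2 --rtol 1e-2 --trt-outputs mark all --onnx-outputs mark all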

When working with dynamic shapes, you need to specify runtime dimensions. See docs here: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#runtime_dimensions
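For reference, a minimal sketch of passing runtime dimensions to trtexec, assuming the ONNX model was exported with dynamic axes, that the input tensor is named text (as in the export command above), and a CLIP-style context length of 77 (an assumption; adjust to the actual model):

  trtexec --onnx=folded.onnx --fp16 --saveEngine=model.trt --minShapes=text:1x77 --optShapes=text:1x77 --maxShapes=text:8x77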