TensorRT: INT8 inference in TensorRT 8.0 gets wrong answers

I want to convert a PyTorch model to TensorRT for INT8 inference, so the pipeline is PyTorch model -> ONNX model -> TRT engine. With TensorRT 7.2.2.3 this succeeded: I set FP16 and INT8 mode with the old API, builder.fp16_mode = True and builder.int8_mode = True. In INT8 mode I fed test data to the calibrator, and in the end I built FP32, FP16, and INT8 engines and got the correct accuracy in all three modes (a rough sketch of that build path is below).
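For reference, this is roughly what the TensorRT 7.x build looked like (simplified sketch; the ONNX filename and the calibrator object calib are placeholders, not my exact code):

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, TRT_LOGGER)

with open("model.onnx", "rb") as f:       # placeholder path
    parser.parse(f.read())

builder.max_workspace_size = 1 << 30
builder.fp16_mode = True                  # old-style flag, removed in TRT 8
builder.int8_mode = True                  # old-style flag, removed in TRT 8
builder.int8_calibrator = calib           # calibrator fed with my test data

engine = builder.build_cuda_engine(network)
```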

Now I want to apply a QAT model to TensorRT, so I updated PyTorch to 1.8.0, TensorRT to 8.0, CUDA to 10.2.89, and cuDNN to 8.2.0. First I do INT8 inference in TensorRT as above, but the old way of setting FP16 and INT8 mode can no longer be used, so I use config.set_flag(trt.BuilderFlag.FP16) and config.set_flag(trt.BuilderFlag.INT8). I could not find a PyTorch INT8 inference sample, so I referenced the int8_caffe_mnist sample (https://github.com/NVIDIA/TensorRT/blob/master/samples/python/int8_caffe_mnist) to calibrate on my test data with the IInt8EntropyCalibrator2 method and set config.int8_calibrator = calib (see the sketch below). I build the TRT engine successfully and get the correct accuracy in FP32 and FP16 mode, but wrong accuracy in INT8 mode: 0.01 accuracy for 100-class classification. I checked the engine output and found that the output values are almost all around 0.002.
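This is roughly the TensorRT 8.0 build path I use, following the int8_caffe_mnist sample (simplified sketch; MyEntropyCalibrator, calib_batches, and the ONNX filename are placeholders standing in for my actual calibrator and test data):

```python
import numpy as np
import pycuda.driver as cuda
import pycuda.autoinit  # noqa: F401  (creates the CUDA context)
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

class MyEntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds host-side numpy batches to the builder during INT8 calibration."""

    def __init__(self, batches, cache_file="calib.cache"):
        super().__init__()
        self.cache_file = cache_file
        self.batch_size = batches[0].shape[0]
        self.d_input = cuda.mem_alloc(batches[0].nbytes)
        self.iterator = iter(batches)

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names):
        try:
            batch = np.ascontiguousarray(next(self.iterator), dtype=np.float32)
        except StopIteration:
            return None  # signals the end of the calibration data
        cuda.memcpy_htod(self.d_input, batch)
        return [int(self.d_input)]

    def read_calibration_cache(self):
        try:
            with open(self.cache_file, "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)

# calib_batches: list of NCHW float32 numpy arrays taken from my test set (placeholder)
calib = MyEntropyCalibrator(calib_batches)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, TRT_LOGGER)
with open("model.onnx", "rb") as f:
    parser.parse(f.read())

config = builder.create_builder_config()
config.max_workspace_size = 1 << 30
config.set_flag(trt.BuilderFlag.FP16)
config.set_flag(trt.BuilderFlag.INT8)
config.int8_calibrator = calib

engine = builder.build_engine(network, config)
```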

Then I use the pytorch_quantization toolkit and do PTQ: collect_stats(model, data_loader, num_batches=2) followed by compute_amax(model, method="percentile", percentile=99.99) (sketch below). I export the model to ONNX, parse it, and build an INT8 engine for inference, and I also get 0.01 accuracy. The only difference between INT8 mode and FP16 mode is the config.set_flag and config.int8_calibrator settings in INT8 mode. Why do I get the correct accuracy in FP16 mode but 0.01 accuracy in INT8 mode?
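The PTQ step follows the pytorch_quantization tutorial; roughly like this (collect_stats and compute_amax are the helper functions from the toolkit's examples, and build_model, data_loader, and the input shape are placeholders for my setup):

```python
import torch
from pytorch_quantization import calib, quant_modules
from pytorch_quantization import nn as quant_nn

quant_modules.initialize()       # swap nn.Conv2d/nn.Linear etc. for quantized versions
model = build_model().cuda()     # placeholder for my 100-class model
model.eval()

def collect_stats(model, data_loader, num_batches):
    # Turn quantization off and calibration on while statistics are collected.
    for module in model.modules():
        if isinstance(module, quant_nn.TensorQuantizer):
            if module._calibrator is not None:
                module.disable_quant()
                module.enable_calib()
            else:
                module.disable()
    with torch.no_grad():
        for i, (images, _) in enumerate(data_loader):
            model(images.cuda())
            if i + 1 >= num_batches:
                break
    # Restore quantization and turn calibration back off.
    for module in model.modules():
        if isinstance(module, quant_nn.TensorQuantizer):
            if module._calibrator is not None:
                module.enable_quant()
                module.disable_calib()
            else:
                module.enable()

def compute_amax(model, **kwargs):
    # Load the calibrated amax (scale) values into every quantizer.
    for module in model.modules():
        if isinstance(module, quant_nn.TensorQuantizer):
            if module._calibrator is not None:
                if isinstance(module._calibrator, calib.MaxCalibrator):
                    module.load_calib_amax()
                else:
                    module.load_calib_amax(**kwargs)

collect_stats(model, data_loader, num_batches=2)
compute_amax(model, method="percentile", percentile=99.99)

# Export with Q/DQ (fake-quant) nodes so the ONNX parser can read the scales.
quant_nn.TensorQuantizer.use_fb_fake_quant = True
dummy = torch.randn(1, 3, 224, 224).cuda()   # placeholder input shape
torch.onnx.export(model, dummy, "model_ptq.onnx", opset_version=13)
```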

I don't know what the problem is: the calibration method? The calibration batches? @ttyio, I would appreciate your reply, thanks a lot!

Environment:

  • TensorRT Version: 8.0.0.3
  • NVIDIA GPU: Tesla P40
  • NVIDIA Driver Version:
  • CUDA Version: 10.2.89
  • CUDNN Version: 8.2.0
  • Operating System: Ubuntu 19.10

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 38

Most upvoted comments

@Mahsa1994 Could you share the exported ONNX file? Thanks

@Ricardosuzaku @aojue1109 Does this issue still exist with the latest TRT release? If it does, we will debug it. Thanks