onnx2tf: [YOLOX][YOLOv8] INT8 wrong output
Issue Type
Others
onnx2tf version number
1.7.23
onnx version number
1.13.1
tensorflow version number
2.12.0rc1
Download URL for ONNX
ONNX YOLOX-nano model was generated using https://github.com/Megvii-BaseDetection/YOLOX/blob/main/tools/export_onnx.py
Parameter Replacement JSON
No parameter replacement
Description
- Research. Core project need
- The export succeeds but the results are 0
- I tried all the possible flags for the ONNX2TF generation
- Core project need
I have managed to generate `dynamic_range_quant`, `full_integer_quant`, and `integer_quant` versions of YOLOX using onnx2tf. I also built a multi-backend class that supports inference with all of the exported models, so that I could make a meaningful comparison by running exactly the same evaluation pipeline available in the YOLOX repo for all of them (a minimal conversion sketch is shown after the table). My results are as follows:
| Model | Size (px) | mAP val 0.5:0.95 | mAP val 0.5 |
|---|---|---|---|
| YOLOX-nano PyTorch (original model) | 416 | 0.256 | 0.411 |
| YOLOX-nano ONNX | 416 | 0.256 | 0.411 |
| YOLOX-nano TFLite FP16 | 416 | 0.256 | 0.411 |
| YOLOX-nano TFLite FP32 | 416 | 0.256 | 0.411 |
| YOLOX-nano TFLite full_integer_quant | 416 | 0 | 0 |
| YOLOX-nano TFLite dynamic_range_quant | 416 | 0 | 0 |
| YOLOX-nano TFLite integer_quant | 416 | 0 | 0 |
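For reference, here is a minimal sketch of how the quantized variants can be produced with onnx2tf's Python API. The file names are placeholders and the parameter names are assumed to mirror the CLI flags (`output_integer_quantized_tflite` corresponding to `-oiqt`); the exact invocation used for the results above may have differed.

```python
# Hedged sketch of the conversion step, not the exact command used in this issue.
import onnx2tf

onnx2tf.convert(
    input_onnx_file_path="yolox_nano.onnx",  # exported with YOLOX tools/export_onnx.py
    output_folder_path="saved_model",
    # Assumed to also emit the *_dynamic_range_quant / *_integer_quant /
    # *_full_integer_quant .tflite files alongside the float16/float32 models.
    output_integer_quantized_tflite=True,
)
```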
The output of the quantized models seems to be wrong, and the postprocessing step fails: the confidences are so low that none of the predictions pass the confidence filter before NMS. Any idea what the problem could be? The `float16` and `float32` TFLite models work as expected, achieving the results in the table above. Has anybody tried onnx2tf with YOLOX and gotten the quantized models working?
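A quick sanity check for the collapse described above might look like the sketch below: run one (placeholder) image through the `integer_quant` model and print the highest combined score. The file name, the NHWC input layout, and the `[1, N, 85]` output layout with objectness at channel 4 are assumptions based on the usual YOLOX head.

```python
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="yolox_nano_integer_quant.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Placeholder input; a real check would use a properly preprocessed image.
image = np.random.rand(1, 416, 416, 3).astype(np.float32)
if inp["dtype"] == np.int8:  # full-integer models expect quantized int8 input
    scale, zero_point = inp["quantization"]
    image = (image / scale + zero_point).astype(np.int8)

interpreter.set_tensor(inp["index"], image)
interpreter.invoke()

pred = interpreter.get_tensor(out["index"]).astype(np.float32)
if out["dtype"] == np.int8:  # dequantize integer outputs before thresholding
    scale, zero_point = out["quantization"]
    pred = (pred - zero_point) * scale

# Assumed [1, N, 85] layout: objectness at index 4, class scores from index 5.
scores = pred[0, :, 4:5] * pred[0, :, 5:]
print("max combined score:", scores.max())  # stays near 0 for the broken INT8 models
```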
About this issue
- State: closed
- Created a year ago
- Reactions: 1
- Comments: 62 (59 by maintainers)
I am closing this somewhat lengthy topic because I have found the cause of the problem.
I don’t know if this is useful, but two months ago I successfully converted ultralytics/yolov5 to .tflite INT8, and that export script uses the same TFLiteConverter. The resulting model performed well in inference.
A similar issue has been posted on the official TensorFlow issue tracker; you can find several by searching for “INT8.” It seems that even a very simple `mobilenet_v2` ends up corrupted. I actually tried it this morning and, sure enough, the model broke on `mobilenet_v2`.
Ref: https://github.com/tensorflow/tensorflow/issues/52357
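As a point of reference, a minimal, hedged reproduction of that MobileNetV2 breakdown might look like this. The calibration data here is random, which is only a placeholder; a real check would calibrate and evaluate on natural images.

```python
import numpy as np
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights="imagenet")

def representative_dataset():
    # Placeholder calibration data; real images should be used in practice.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_int8 = converter.convert()

interpreter = tf.lite.Interpreter(model_content=tflite_int8)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

x = np.random.rand(1, 224, 224, 3).astype(np.float32)
scale, zp = inp["quantization"]
interpreter.set_tensor(inp["index"], (x / scale + zp).astype(np.int8))
interpreter.invoke()
q = interpreter.get_tensor(out["index"]).astype(np.float32)
scale, zp = out["quantization"]
q = (q - zp) * scale

f = model(x, training=False).numpy()
print("max abs diff vs float:", np.abs(q - f).max())  # large values indicate corruption
```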
I guess it depends on whether the TensorFlow team can handle it.
Other than MobileNet-SSD, which I tested and ran inference with successfully four years ago, I cannot think of any other models at this time.
YOLOv8 uses a similar architecture, but I am not sure if INT8 can be used successfully.
https://github.com/PINTO0309/PINTO_model_zoo/tree/main/345_YOLOv8
I think so. It would be better to link to several potentially relevant issues to show the scale of the problem; it appears to be quite critical. Also, I don’t think this problem occurs only in the PyTorch -> ONNX -> TFLite conversion flow, because in onnx2tf I am just building Keras models in a normal sequential way. If you frame the report around converting models from PyTorch or ONNX, the TensorFlow team may ignore you.
As discussed in this issue, it is always a good idea to convert the model once and make sure that the accuracy degradation is insignificant, or at least acceptable. Until we know where the cause of the extremely large INT8 accuracy degradation lies, I don’t see how we can do anything but try things at random.
It doesn’t work properly, so we have no choice. I do not want to use unofficial tools unnecessarily. I am simply suggesting the best and most practical method for the current operation at any given time.
I cannot thank you enough for all this.
I just used this logic to check things step by step.
https://github.com/PINTO0309/onnx2tf/issues/244#issuecomment-1465230129
The `--cotof` option confirms that there is a discrepancy, but does not show the specific value. Also, the tool does not check the accuracy of the INT8 model.
Yup. That worked. Thanks!
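Since `--cotof` only validates the float conversion, one way to localize where the INT8 model breaks is to dump every intermediate tensor of the quantized TFLite model and look for the first layer whose values collapse. A sketch, with the model file name assumed:

```python
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(
    model_path="yolox_nano_integer_quant.tflite",
    experimental_preserve_all_tensors=True,  # keeps intermediate tensors readable after invoke()
)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
# Placeholder input; use a real preprocessed image to make the comparison meaningful.
interpreter.set_tensor(inp["index"], np.random.rand(*inp["shape"]).astype(inp["dtype"]))
interpreter.invoke()

for detail in interpreter.get_tensor_details():
    try:
        tensor = interpreter.get_tensor(detail["index"]).astype(np.float32)
    except ValueError:
        continue  # some tensors cannot be read back
    if tensor.size == 0:
        continue
    qp = detail["quantization_parameters"]
    if qp["scales"].size == 1:  # dequantize per-tensor quantized values
        tensor = (tensor - qp["zero_points"][0]) * qp["scales"][0]
    print(detail["name"], float(tensor.min()), float(tensor.max()))
```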
When I run it on my ONNX model I get an error:
Got it!
TensorFlow is not perfect. Therefore, there must be a point where the output value breaks down significantly at the boundary of some operation.
I am only now steadily narrowing down the problem areas to identify them.
I am not sure what the problem is, but at this point I know the following for certain:
Completely corrupted output values.
It still looks fine.
I see. Although the error is larger than `1e-1`, it seems that the fatal breakdown of the model is a bit further back. The model seems to break before the first `Concat`. Things are getting interesting.
I compared the code block corresponding to decoding the output into bboxes, which comes from here, with the corresponding operations in Netron that I posted here, and it looks correct to me.
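For context, the decode step being compared there, paraphrased from the YOLOX demo postprocessing (grid sizes assume a 416x416 input), maps the raw head output back to box centers and sizes per stride level:

```python
import numpy as np

def decode_outputs(outputs, img_size=(416, 416), strides=(8, 16, 32)):
    """outputs: raw predictions of shape [1, N, 85], concatenated over strides."""
    grids, expanded_strides = [], []
    for stride in strides:
        hsize, wsize = img_size[0] // stride, img_size[1] // stride
        xv, yv = np.meshgrid(np.arange(wsize), np.arange(hsize))
        grid = np.stack((xv, yv), axis=2).reshape(1, -1, 2)
        grids.append(grid)
        expanded_strides.append(np.full((1, grid.shape[1], 1), stride))
    grids = np.concatenate(grids, axis=1)
    expanded_strides = np.concatenate(expanded_strides, axis=1)

    outputs = outputs.copy()
    outputs[..., :2] = (outputs[..., :2] + grids) * expanded_strides   # box centers
    outputs[..., 2:4] = np.exp(outputs[..., 2:4]) * expanded_strides   # box sizes
    return outputs
```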
Apparently, the INT8 model is not entirely broken.
Thanks again for your time @PINTO0309. I also tried downgrading TensorFlow even further, which led to the same results. Let me know if I can help out with something 😄
Later I will check the weights and quantization parameters of the model. I have no idea what the cause is at this point.
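One way to do that inspection is to list the scale / zero-point pairs of every quantized tensor and look for suspicious values (for example, extremely large scales). A sketch, with the file name assumed:

```python
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="yolox_nano_integer_quant.tflite")
interpreter.allocate_tensors()

for detail in interpreter.get_tensor_details():
    qp = detail["quantization_parameters"]
    if qp["scales"].size == 0:
        continue  # tensor is not quantized
    print(
        detail["name"],
        "scales:", float(qp["scales"].min()), "-", float(qp["scales"].max()),
        "zero_points:", qp["zero_points"][:4],
    )
```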
The versions I tried are below; none of them worked: v2.12.0rc1, v2.11.0, v2.10.0, v2.9.0.
It’s been about 2 years since I last tested a quantized model, so I’m going to look back at my past work a bit. I may have overlooked something.
Given that the `Float16` and `Float32` models work, I cannot understand why the quantized ones do not when they are generated by the same export call.