TinyNeuralNetwork: Converting LiteHRNet PyTorch model to TFLite, outputs don't match
Hi, this is really great work, thanks!
I am able to convert the LiteHRNet model to TFLite without running into any issues; however, the outputs don't match.
Here is the result of sending a tensor of ones through the network. The output has shape [1, 17, 96, 72]; I am showing only output[0, 0, 0]
from both PyTorch and TFLite:
PyTorch:
```
array([6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 1.8188522e-04, 1.7515068e-04, 1.9644469e-04,
       1.6027213e-04, 1.9049855e-04, 1.5419864e-04, 1.2460010e-04,
       9.0751186e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05],
      dtype=float32)
```
TFLite:
```
array([6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       1.1580180e-04, 2.3429818e-04, 3.9018277e-04, 7.7823577e-03,
       1.8948119e-02, 2.8559987e-02, 3.3612434e-02, 2.5932681e-02,
       1.2074142e-02, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05,
       6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05],
      dtype=float32)
```
When I convert to TFLite via the ONNX route, the outputs do match. So my guess is that some of the transposes/reshapes for NHWC are not happening correctly, but I am not sure. What would be the best way to debug this?
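For reference, a minimal way to quantify this kind of mismatch (assuming both outputs have already been loaded as NumPy arrays, e.g. via the TFLite interpreter's `get_tensor` on one side) is a sketch like the following; `compare_outputs` is a hypothetical helper, not part of TinyNeuralNetwork:

```python
import numpy as np

def compare_outputs(ref: np.ndarray, test: np.ndarray, atol: float = 1e-5) -> bool:
    """Report the max absolute difference and where it occurs."""
    diff = np.abs(ref - test)
    max_diff = diff.max()
    worst = np.unravel_index(diff.argmax(), diff.shape)
    print(f"max |diff| = {max_diff:.6e} at index {worst}")
    return bool(max_diff <= atol)

# Toy example: two [1, 17, 96, 72] tensors that diverge at one location,
# mimicking the kind of mismatch shown in the arrays above.
a = np.full((1, 17, 96, 72), 6.4367385e-05, dtype=np.float32)
b = a.copy()
b[0, 0, 0, 46] = 3.3612434e-02
print(compare_outputs(a, b))  # → False
```

Reporting the index of the worst element (rather than just a pass/fail) helps reveal spatial shifts, which is what a wrong NHWC transpose would typically produce.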
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 15
I’ve implemented the layer-wise comparison, but it needs a bit of organizing. Hopefully it will be uploaded soon.
@simbara Should be fixed by https://github.com/alibaba/TinyNeuralNetwork/commit/8cabdbe0f6b815482d91b242e248ce01df2f6225 and https://github.com/alibaba/TinyNeuralNetwork/commit/2bde8e48c9ee5e6805cd3980d8d7d1a608e61d28. Would you please try again?
@peterjc123 I calculated the difference very similarly to how you did it. But I was using a custom model (not the model I uploaded), which is why there is some difference in the numbers.
Again, this is really great work. The ONNX route doesn’t support a range of layers and has other issues, so it’s great to see a library like this that converts between the two directly. Congrats!
If you can share how you did the layer-wise comparison I would be very interested in taking a look.
I’ve collected the layer-wise differences and you may see them here. It looks like the major difference comes from the minor errors accumulated in the batch normalization layers.
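One common way to collect per-layer outputs on the PyTorch side (a sketch of the general technique, not the exact script used here) is to register forward hooks on every leaf module:

```python
import torch
import torch.nn as nn

def capture_layer_outputs(model: nn.Module, dummy_input: torch.Tensor) -> dict:
    """Run the model once and record each leaf module's output, keyed by name."""
    outputs = {}
    hooks = []
    for name, module in model.named_modules():
        if len(list(module.children())) == 0:  # leaf modules only
            hooks.append(module.register_forward_hook(
                # bind `name` at definition time via a default argument
                lambda mod, inp, out, name=name: outputs.__setitem__(name, out.detach())
            ))
    with torch.no_grad():
        model(dummy_input)
    for h in hooks:
        h.remove()  # clean up so later runs are unaffected
    return outputs

# Toy example with a BatchNorm layer, since that is where the errors accumulated
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU()).eval()
outs = capture_layer_outputs(model, torch.ones(1, 3, 16, 16))
print(sorted(outs.keys()))  # → ['0', '1', '2']
```

The captured tensors can then be compared against the corresponding TFLite intermediate tensors (transposed to NHWC) to localize where the divergence first appears.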
I used the following script to compare the differences. It relies on a configuration file in JSON format, which lets me use
convert_from_json.py
to generate the model. Running the script yields