onnx2tf: Yolov7-tiny to TensorflowLite conversion results in a dynamic output model incompatible with TfLite Java API

Issue Type

Others

onnx2tf version number

1.5.36

onnx version number

1.12.0

tensorflow version number

2.10.1

Download URL for ONNX

pip install onnx==1.12.0

Parameter Replacement JSON

none

Description

Hi, your library is awesome!

I converted Yolov7-tiny from PyTorch to TFLite using: onnx2tf -i yolov7-tiny.onnx -o models-NHWC-final/ -osd -oh5 -cotof

I am trying to use it on an Android device. The model works when tested on a PC, but according to the documentation (https://www.tensorflow.org/lite/guide/inference), the TensorFlow Lite Java API for Android does not support models with dynamic outputs, and the resulting YOLO .tflite model has a dynamic output shape (the number of output rows changes with the number of detections).

On the other hand, if I follow the conversion path PyTorch -> ONNX -> TensorFlow, I do get a yolov7 model with a fixed output size, so I suspect it is possible to achieve this with onnx2tf as well, while also doing the NCHW to NHWC conversion in the process.

Is there a way to have onnx2tf output a .tflite model with a fixed/static output shape for yolov7-tiny?
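To tell whether a converted model actually has a dynamic output, you can inspect its output details: TFLite marks runtime-resized dimensions with -1 in `shape_signature`. A minimal sketch (the model filename is an assumption, and the helper function is mine, not part of onnx2tf):

```python
def has_dynamic_outputs(output_details):
    """True if any output tensor dimension is resized at runtime.

    TFLite marks dynamic dimensions with -1 in `shape_signature`;
    the TFLite Java API cannot pre-allocate buffers for such outputs.
    """
    return any(-1 in d["shape_signature"] for d in output_details)

# Usage against a real converted model (filename assumed):
#   import tensorflow as tf
#   interpreter = tf.lite.Interpreter(model_path="yolov7-tiny.tflite")
#   print(has_dynamic_outputs(interpreter.get_output_details()))
```

If this prints True for the converted yolov7-tiny, the Java API on Android will not be able to allocate its output buffers up front.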

Thank you

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 34 (18 by maintainers)

Most upvoted comments

You’re amazing, you fixed it! Maybe in the future the GPU Delegate will be better supported as well.

Thank you very much!

https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/delegates/gpu/gl/kernels

The TensorFlow GPU delegate supports only a limited set of operations, so I guess it is almost impossible to implement NMS with them. It looks like even tf.range is not supported, which makes it almost impossible to implement the top-k or sort step that NMS requires.

(Screenshot: list of operations supported by the TFLite GPU delegate)

There are two options for now.

  1. If the final goal is only to replace NonMaxSuppression with other operations, that is possible using tf.image.non_max_suppression_padded, which is implemented with several sub-operations, unlike non_max_suppression_v4 shown above.
  2. If you only want a static output shape, non_max_suppression_v4 has a pad_to_max_output_size option. For now, NonMaxSuppression.py passes False and uses a slice to remove the excess indices. After adding an option for static NMS output, it would be possible to let the user set the maximum box count.

If the GPU support is not there, let's have it running on the CPU and see if it is comparable to yolov4 in terms of computational performance. And yolov4 did pretty well.

The paper mentions that there is a significant computational reduction happening in yolov7 compared to v4

Your implementation details are awesome.

About the API: I am using the Java version of tensorflow-lite-gpu. Here is my TensorFlow dependency list:

    implementation "org.tensorflow:tensorflow-lite:${tflite_version}"
    implementation "org.tensorflow:tensorflow-lite-gpu:${tflite_version}"
    implementation "org.tensorflow:tensorflow-lite-gpu-api:${tflite_version}"

Here is a good guide to TFLite for Android: https://www.tensorflow.org/lite/guide/inference

In the yolov7 case, our output tensor has the shape [num_boxes, 7]. The result for each box is an array of size 7 that contains:

[batch_number, boxLeftLimitX, boxTopLimitY, boxRightLimitX, boxBottomLimitY, ClassID, ClassScore]

The model outputs the top "num_boxes" results, ordered by "ClassScore".
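Given that row layout, decoding the output is a simple row-by-row scan. A minimal sketch, assuming the field order described above and using a made-up score threshold to skip padded or low-confidence rows:

```python
import numpy as np

def decode_detections(output, score_threshold=0.25):
    """Turn an [N, 7] YOLOv7 post-NMS output array into detection dicts.

    Row layout (from this thread): [batch, x1, y1, x2, y2, class_id, score].
    """
    detections = []
    for batch, x1, y1, x2, y2, cls, score in np.asarray(output):
        if score < score_threshold:
            continue  # padded or low-confidence row
        detections.append({
            "batch": int(batch),
            "box": (float(x1), float(y1), float(x2), float(y2)),
            "class_id": int(cls),
            "score": float(score),
        })
    return detections

# Example with two fake rows (values invented); the second is a padded row.
rows = np.array([[0, 10, 20, 110, 220, 1, 0.92],
                 [0,  0,  0,   0,   0, 0, 0.00]])
print(decode_detections(rows))
```

With a static-output model, the padded rows at the end of the fixed-size buffer are dropped by the same threshold check.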

Thank you for your effort!

OK. then I will look into how to add the feature as a special option.

Well, if all these models have a dynamic output, they cannot be used as .tflite models on an Android platform, so the conversion to .tflite would not be very useful without a static output.

Knowing that there are models whose output is very difficult to make static, what if this conversion option were offered for the many models that do not have much processing behind the NMS block, but not necessarily for the other models?

Basically, we would have a parameter that lets us fix the output size for the majority of simpler models (in regard to their NMS stage) so they can be used as .tflite models on Android. And to let users know what to expect when they use this parameter, a warning would be displayed telling them that not all models can reasonably have their output fixed. This way most of the utility stays embedded in your library while avoiding the problems of misuse.
Basically we have a parameter that allows us to fix the output size for the majority of simpler models (in regards to their NMS aspect) so they can be used as .tflite models on Androids. And to let the users know what to expect when they use this parameter a warning is displayed that lets them know that not all models can reasonably have their output fixed. This way you have most utility embedded in your library while avoiding the problems of misuse.