tensorflow: tflite runs much slower than tfmobile ...
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 14.04
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: Xiaomi 8
- TensorFlow installed from (source or binary): binary
- TensorFlow version (use command below): 1.10
- Python version:
- Bazel version (if compiling from source):
- GCC/Compiler version (if compiling from source):
- CUDA/cuDNN version: 9.0 / 7.1
- GPU model and memory:
- Exact command to reproduce:
Describe the problem
I tested the performance of tf-mobile, tf-lite, tf-mobile int8, and tf-lite int8 on Android, and found that tf-lite is much slower than tf-mobile.
- I use freeze_graph to generate the A.pb file from the checkpoint for testing tf-mobile performance.
- I use toco_convert to convert the A.pb file to the A.tflite file for testing tf-lite performance.
- I use transform_graph to get the quantized AQ.pb file from the A.pb file for testing tf-mobile int8 performance.
- I train a model with the same architecture by adding the line tf.contrib.quantize.create_training_graph() and get the checkpoint file. Then I replace that line with tf.contrib.quantize.create_eval_graph() to generate the A.pbtxt file, and use the checkpoint file and the A.pbtxt file to get A8.pb with fake quantization nodes. Finally, I use toco_convert to get the A8.tflite file.
I tested these 4 files on Android, running inference several times on the same image with each one. The results are listed below:
- tf-mobile: 357 ms per image
- tf-mobile int8: 356 ms per image
- tf-lite: 844 ms per image
- tf-lite int8: 571 ms per image
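To put the four timings on a common scale, the sketch below divides each reported latency by the tf-mobile float baseline. It also includes a small timing helper of the kind such per-image numbers are typically collected with. The helper `mean_latency_ms`, its `warmup` and `runs` parameters, and the function names are illustrative assumptions, not the harness actually used for this report.

```python
import time

# Latencies reported above, in milliseconds per image.
REPORTED_MS = {
    "tf-mobile": 357,
    "tf-mobile int8": 356,
    "tf-lite": 844,
    "tf-lite int8": 571,
}


def relative_to(baseline, timings):
    """Return each timing divided by the baseline entry's timing."""
    base = timings[baseline]
    return {name: ms / base for name, ms in timings.items()}


def mean_latency_ms(run_once, warmup=3, runs=10):
    """Hypothetical helper: average wall-clock time of run_once() in ms.

    Warm-up calls are discarded so that one-time costs (allocation,
    cache fills) do not skew the average.
    """
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(runs):
        run_once()
    return (time.perf_counter() - start) * 1000.0 / runs


ratios = relative_to("tf-mobile", REPORTED_MS)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x the tf-mobile latency")
```

On the reported numbers, float tf-lite comes out at roughly 2.4x the tf-mobile latency and tf-lite int8 at roughly 1.6x, while the two tf-mobile variants are essentially identical.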
I wonder why tf-lite is much slower than tf-mobile.
PS: the model architecture contains only CONV+BN+RELU, RESHAPE, and FULLY_CONNECTED ops.
The feature shape coming out of CONV+BN+RELU is [B,T,C]. I reshape it to [-1,C] and feed it to the fc layer, then reshape the output from [B*T,K] to [B,T,K], which is the final result I expect.
I wonder whether the reshape op is what causes the worse performance?
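The reshape, fully connected, reshape pattern described above can be sketched in plain Python (not TensorFlow) to show what each step does to the shapes. The sizes B, T, C, K, the weight values, and the helper names are made up for illustration. In this sketch both reshapes only regroup existing rows in row-major order, and all arithmetic happens in the fully connected step. Whether the TFLite RESHAPE kernel adds measurable cost on a device is not something this sketch can show.

```python
B, T, C, K = 2, 3, 4, 5  # illustrative sizes, not taken from the model

# Features from the conv stack, shape [B, T, C], as nested lists.
features = [[[float(b * 100 + t * 10 + c) for c in range(C)]
             for t in range(T)] for b in range(B)]

# Hypothetical FC weights of shape [C, K] and bias of shape [K].
weights = [[0.1 * (c + k) for k in range(K)] for c in range(C)]
bias = [0.5] * K


def flatten_leading(x):
    """Reshape [B, T, C] -> [B*T, C]; rows keep their row-major order."""
    return [row for batch in x for row in batch]


def fully_connected(rows, w, b):
    """Compute rows @ w + b for rows of shape [N, C] and w of shape [C, K]."""
    return [[sum(r[c] * w[c][k] for c in range(len(w))) + b[k]
             for k in range(len(b))] for r in rows]


def unflatten_leading(rows, t):
    """Reshape [B*T, K] -> [B, T, K] by regrouping every t consecutive rows."""
    return [rows[i:i + t] for i in range(0, len(rows), t)]


flat = flatten_leading(features)                # shape [B*T, C]
logits = fully_connected(flat, weights, bias)   # shape [B*T, K]
output = unflatten_leading(logits, T)           # shape [B, T, K]

print(len(flat), len(flat[0]))                          # B*T and C
print(len(output), len(output[0]), len(output[0][0]))   # B, T and K
```

Because the fully connected layer treats every row independently, `output[b][t]` equals the layer applied to `features[b][t]` alone, which is why flattening the leading dimensions is safe here.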
Thank you very much …
About this issue
- State: closed
- Created 6 years ago
- Comments: 23 (11 by maintainers)
We’re in the process of upstreaming the fix to Eigen, stay tuned.
As for TRANSPOSE_CONV ops, TFLite is unfortunately still much slower than TFMobile. Would you like to check my benchmark report in #26736?
@jdduke Are there any updates on the fix? Thank you!
It’s not quite there, expect an update in the next week or two. Thanks for your patience.