tensorflow: NNAPI Delegate BUG


System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): no
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: Xiaomi 9Pro (SnapDragon 855+), Android Q
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 1.15.0
  • Python version: 3.7.9
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version:
  • GPU model and memory:

Describe the current behavior

I have several questions.

1. When I run the model mobilenet_v1_1.0_224_quant.tflite from the TFLite examples and select the NNAPI device, the following log output appears in Logcat. All layers of the model are supported by NNAPI, so what do these messages mean?

2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {DEPTHWISE_CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {DEPTHWISE_CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {DEPTHWISE_CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {DEPTHWISE_CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {DEPTHWISE_CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {DEPTHWISE_CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {DEPTHWISE_CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {DEPTHWISE_CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.028 12125-12127/? E/hta-unnhal: {CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.029 12125-12127/? E/hta-unnhal: {DEPTHWISE_CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.029 12125-12127/? E/hta-unnhal: {CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.029 12125-12127/? E/hta-unnhal: {DEPTHWISE_CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.029 12125-12127/? E/hta-unnhal: {CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.029 12125-12127/? E/hta-unnhal: {DEPTHWISE_CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.029 12125-12127/? E/hta-unnhal: {CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.029 12125-12127/? E/hta-unnhal: {DEPTHWISE_CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.029 12125-12127/? E/hta-unnhal: {CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.029 12125-12127/? E/hta-unnhal: {DEPTHWISE_CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.029 12125-12127/? E/hta-unnhal: {CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.029 12125-12127/? E/hta-unnhal: {AVERAGE_POOLING_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.029 12125-12127/? E/hta-unnhal: {CONVOLUTION_2D, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.029 12125-12127/? E/hta-unnhal: {RESHAPE, TENSOR_QUANT8} is not supported.
2020-09-24 20:54:57.029 12125-12127/? E/hta-unnhal: {SOFTMAX, TENSOR_QUANT8} is not supported.
(the same sequence of messages repeats twice more with later timestamps)

2. When I run the model efficientnet-lite0-int8.tflite from the examples, I find that inference with NNAPI is much slower than with the CPU, even though all layers are supported by NNAPI. Why?

3. About NNAPI: the following two APIs cannot both be set at the same time, otherwise the error below occurs. Which API should be used, or are both valid but mutually exclusive?

 tfliteOptions.addDelegate(nnApiDelegate);
 tfliteOptions.setUseNNAPI(true);

Internal error: Failed to apply delegate: ModifyGraphWithDelegate is disallowed when graph is immutable.

4. How can I select fp16 or fp32 inference with the GPU delegate? The following API appears to be deprecated.

setAllowFp16PrecisionForFp32(boolean allow);

5. I have tried to quantize my model to 8-bit using the following code, but I find that some conv ops' weights are still fp32 while others are int8. How can I fix this?

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
tflite_quant_model = converter.convert()
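As context for the snippet above: OPTIMIZE_FOR_SIZE triggers dynamic-range quantization, which only quantizes the weights of ops the converter supports and leaves the rest in fp32. A hedged sketch of full-integer quantization, which forces int8 weights and activations, might look like the following (assuming a TF 2.x-style converter; the [1, 224, 224, 3] input shape and the random calibration data are placeholders for your model's real shape and a few hundred real samples):

```python
import numpy as np
import tensorflow as tf

def convert_full_int8(saved_model_dir, input_shape=(1, 224, 224, 3)):
    """Sketch of full-integer post-training quantization."""
    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]

    # The representative dataset lets the converter calibrate activation
    # ranges; random data here stands in for real input samples.
    def representative_dataset():
        for _ in range(100):
            yield [np.random.rand(*input_shape).astype(np.float32)]

    converter.representative_dataset = representative_dataset
    # Restricting supported ops makes conversion fail loudly instead of
    # silently leaving some ops in fp32.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    return converter.convert()
```

Without a representative dataset the converter cannot calibrate activations, so some ops stay in fp32; restricting supported_ops turns a partial quantization into a hard conversion error.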

Describe the expected behavior

1. Explain or eliminate the "not supported" log output in Logcat.
2. NNAPI performance should not be slower than the CPU.
3. A proper, conflict-free way to enable the NNAPI delegate.
4. A way to select fp16 or fp32 inference with the GPU delegate.
5. The model quantizes fully to 8-bit.

Thanks very much; I'm looking forward to your reply.

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 28 (14 by maintainers)

Most upvoted comments

@freedomtan I found another difference:

I tested the mobilenet_v1_1.0_224_quant.tflite and efficientnet-lite0-int8.tflite on SnapDragon 855+.

This is the log output of mobilenet_v1_1.0_224_quant.tflite:

Explicitly applied NNAPI delegate, and the model graph will be completely executed by the delegate.

This is the log output of efficientnet-lite0-int8.tflite:

Explicitly applied NNAPI delegate, and the model graph will be partially executed by the delegate w/ 1 delegate kernels.

What does "w/ 1 delegate kernels" mean? Initially I thought it meant the graph was executed by only one delegate kernel.


The only other difference in the benchmark logs of the two models concerns Quantize. The mobilenet_v1_1.0_224_quant model's input, output, and inference types are all uint8, but the efficientnet-lite0-int8.tflite model's input and output types are uint8 while its inference type is int8. So the efficientnet-int8 model has an extra Quantize op compared with the MobileNet quant model.

However, the Quantize op (uint8 -> int8) does not seem to be time-consuming.
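One way to see why that requantize step is cheap: when the scale is unchanged and only the zero point shifts by 128 (the standard offset between uint8 and int8 quantization), it reduces to a per-element subtraction. A minimal plain-Python illustration (not TFLite code; the function name is mine):

```python
def requantize_u8_to_i8(values_u8):
    """Requantize uint8 -> int8 when the scale is identical and the
    zero point shifts by 128 (zp_i8 = zp_u8 - 128).

    real = scale * (q_u8 - zp_u8) = scale * (q_i8 - zp_i8)
      =>  q_i8 = q_u8 - 128
    """
    return [q - 128 for q in values_u8]

print(requantize_u8_to_i8([0, 128, 255]))  # [-128, 0, 127]
```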

Does TFLite have detailed documentation on which ops can be executed by the DSP, HTA, NPU, GPU, or CPU? The ops listed in the NNAPI documentation are mostly supported by the NNAPI CPU and GPU paths, but we do not know which ops can run on the DSP or HTA, other than from the Logcat output.