tensorflow: Missing Operations for TfLite GPU Delegate

System information

TensorFlow version (you are using): on Android: ‘org.tensorflow:tensorflow-lite:0.0.0-nightly’ and ‘org.tensorflow:tensorflow-lite-gpu:0.0.0-nightly’
Are you willing to contribute it (Yes/No): Yes, to the best of my abilities

Describe the feature and the current behavior/state.

I’m trying to use the GPU delegate for my custom TFLite model. Creating the interpreter with the GPU delegate using this code:

val options = Options()
options.setUseNNAPI(false)
options.setAllowFp16PrecisionForFp32(true)
options.setNumThreads(NUM_THREADS)
// Register the GPU delegate on the options so it is applied when the interpreter is built.
val gpuDelegate = GpuDelegate()
options.addDelegate(gpuDelegate)

d.tfLite = Interpreter(loadModelFile(assetManager)!!, options)

results in my model running normally and delivering the correct outputs, but apparently without GPU acceleration, since the execution time is exactly the same as without the delegate. Adding the line d.tfLite!!.modifyGraphWithDelegate(gpuDelegate) (I don’t know whether this is necessary; it would be nice to know) results in the following error:

java.lang.IllegalArgumentException: Internal error: Failed to apply delegate: Next operations are not supported by GPU delegate:
    CONV_2D: Max version supported: 1. Requested version 2.
    LOCAL_RESPONSE_NORMALIZATION: Operation is not supported.
    SPLIT: Operation is not supported.
    First 0 operations will run on the GPU, and the remaining 12 on the CPU.ModifyGraphWithDelegate is disallowed when graph is immutable.
    java.lang.RuntimeException: java.lang.IllegalArgumentException: Internal error: Failed to apply delegate: Next operations are not supported by GPU delegate:
    CONV_2D: Max version supported: 1. Requested version 2.
    LOCAL_RESPONSE_NORMALIZATION: Operation is not supported.
    SPLIT: Operation is not supported.
    First 0 operations will run on the GPU, and the remaining 12 on the CPU.ModifyGraphWithDelegate is disallowed when graph is immutable.

So first of all: It would be nice to have those unsupported operations, i.e.

CONV_2D v2
LOCAL_RESPONSE_NORMALIZATION
SPLIT

Second of all: what does “ModifyGraphWithDelegate is disallowed when graph is immutable” mean? Do I have to make any changes to my tflite model?

About this issue

State: closed
Created 5 years ago
Comments: 18 (11 by maintainers)

Commits related to this issue

Update supported CONV_2D version from 1 to 2. The GPU delegate already supports dilated conv. https://github.com/tensorflow/tensorflow/issues/34679 PiperOrigin-RevId: 283386353 Change-Id: I6eb9edb4... — committed to tensorflow/tensorflow by impjdi 5 years ago

Most upvoted comments

According to the documentation at https://www.tensorflow.org/lite/performance/gpu_advanced, none of the operations I listed above are supported yet, so this is an ongoing issue for me.

Noltibus on Mar 4, 2022

@Noltibus

Sure thing. This is an open source project, and contributions are welcome 😃

Before you dig into the shader code, I would advise understanding the DHWC4 format (or PHWC4; we renamed the format at one point) first. It essentially slices an HWC tensor into 4-channel slices, and it’s used throughout the OpenGL delegate. I’m not so sure about the OpenCL delegate; it has more complicated formats.

Then, when it comes to the shader code, the OpenGL delegate uses shader code generation to avoid code duplication, but unfortunately that makes the code hard to read and understand. OpenCL doesn’t use the code generation logic, so it might be easier to author an op there. After that, you will also want to write some unit tests to make sure things work as intended.

Of course, the easiest approach is to find the closest op implementation and modify that 😃

impjdi on Dec 3, 2019
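
To make the 4-channel slicing mentioned in the comment above more concrete, here is a minimal CPU-side sketch in Kotlin. It is only an illustration: hwcToPhwc4 is a hypothetical helper and not part of the TensorFlow Lite API, and it assumes the packed buffer is ordered slice, then height, then width, then the 4 channels of that slice, with the last slice zero-padded when the channel count is not a multiple of 4. The delegate's actual shader-side layout may differ in detail.

```kotlin
// Hypothetical helper for illustration only; not part of the TensorFlow Lite API.
// Repacks a flat HWC float tensor into 4-channel slices.
// Assumed output order: [slice][height][width][4], last slice zero-padded.
fun hwcToPhwc4(src: FloatArray, height: Int, width: Int, channels: Int): FloatArray {
    require(src.size == height * width * channels) { "src does not match H*W*C" }
    val slices = (channels + 3) / 4                    // ceil(channels / 4)
    val dst = FloatArray(slices * height * width * 4)  // padding lanes stay 0f
    for (h in 0 until height) {
        for (w in 0 until width) {
            for (c in 0 until channels) {
                // Source index in plain HWC order.
                val srcIndex = (h * width + w) * channels + c
                // Destination: slice c / 4, lane c % 4 within the 4-channel group.
                val dstIndex = (((c / 4) * height + h) * width + w) * 4 + (c % 4)
                dst[dstIndex] = src[srcIndex]
            }
        }
    }
    return dst
}
```

For example, a 1x2 image with 5 channels packs into 2 slices of 8 floats each (16 floats in total): the first slice holds channels 0 to 3 of both pixels, and the second slice holds channel 4 of each pixel followed by three padding zeros.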