tensorflow: Error trying TensorFlow Lite: operations are not supported by GPU delegate

Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub.

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: Yes, aarch64, Android 8.1
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): 1.12.2
  • Python version: 2.7
  • Bazel version (if compiling from source): 0.22.0
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version:
  • GPU model and memory:

You can collect some of this information using our environment capture script. You can also obtain the TensorFlow version with:

  1. TF 1.0: python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
  2. TF 2.0: python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"

Describe the current behavior

I’m trying to run a C++ demo of TFLite + OpenGL ES on my aarch64 Android 8.1 board. I built the benchmark_model tool with Bazel, and it ran the deeplabv3_257_mv_gpu.tflite model successfully on my device.

Now I want to integrate a simple TFLite demo (similar to the benchmark) into my code, which is built with CMake. I spent some time extracting all the TensorFlow Lite static libs from the bazel-out folder and linking them in my CMake build. I built the code successfully with the NDK standalone toolchain r17c.
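
The delegate is applied along these lines (a minimal sketch rather than the exact demo code; the header and factory function assume the GL-backed GPU delegate that ships in this source tree):

```cpp
#include <memory>

#include "tensorflow/lite/delegates/gpu/gl_delegate.h"  // GL-backed GPU delegate (assumed)
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int main() {
  auto model =
      tflite::FlatBufferModel::BuildFromFile("deeplabv3_257_mv_gpu.tflite");
  tflite::ops::builtin::BuiltinOpResolver resolver;
  std::unique_ptr<tflite::Interpreter> interpreter;
  tflite::InterpreterBuilder(*model, resolver)(&interpreter);

  // Create the GPU delegate with default options and hand the graph to it;
  // ops the delegate cannot handle are supposed to fall back to the CPU.
  TfLiteDelegate* delegate = TfLiteGpuDelegateCreate(/*options=*/nullptr);
  if (interpreter->ModifyGraphWithDelegate(delegate) != kTfLiteOk) {
    // "Failed to apply GPU delegate." is printed from this path.
  }
  if (interpreter->AllocateTensors() != kTfLiteOk) {
    // "Failed to allocate tensors!" is printed from this path.
  }
  // ... fill the input tensor, Invoke(), read the output ...
  TfLiteGpuDelegateDelete(delegate);
  return 0;
}
```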

But when I run this new demo on my device, it prints errors like:

```
INFO: Created TensorFlow Lite delegate for GPU.
Apply delegate for GPU
Next operations are not supported by GPU delegate:
AVERAGE_POOL_2D: Expected 1 input tensor(s), but node has 0 runtime input(s).
CONV_2D: Expected 1 input tensor(s), but node has 0 runtime input(s).
CONV_2D: Expected 1 input tensor(s), but node has 2 runtime input(s).
CONV_2D: Expected 1 input tensor(s), but node has 3 runtime input(s).
DEPTHWISE_CONV_2D: Expected 1 input tensor(s), but node has 2 runtime input(s).
DEPTHWISE_CONV_2D: Expected 1 input tensor(s), but node has 3 runtime input(s).
RESIZE_BILINEAR: Expected 1 input tensor(s), but node has 2 runtime input(s).
First 1 operations will run on the GPU, and the remaining 69 on the CPU.
TfLiteGpuDelegate Prepare: ReadValue: value is a constant tensor: 183
Node number 70 (TfLiteGpuDelegate) failed to prepare.

Failed to apply GPU delegate.
Delegate setting done
Node number 70 (TfLiteGpuDelegate) failed to prepare.

Failed to allocate tensors!
```

The model here is still deeplabv3_257_mv_gpu.tflite, which has already been shown to work on my device. I have also tried building the same demo code with Bazel, and that build runs correctly.

Describe the expected behavior

Build TFLite + the OpenGL GPU delegate successfully with CMake, and have it run correctly on my aarch64 board.

Code to reproduce the issue

Provide a reproducible test case that is the bare minimum necessary to generate the problem. Use your official benchmark_model.cc.

Other info / logs

Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 16 (7 by maintainers)

Most upvoted comments

@sirius0503

Yeah, as you noted, the properties, number, and order of the input tensors are very important for the GPU delegate. So, for example, if you are trying to feed in a constant tensor just to see whether things produce the expected numbers, you’re not gonna have a good time, because you’ll get something like:

Expected 1 input tensor(s), but node has 0 runtime input(s).

You should instead have a placeholder (a runtime input tensor) and fill it in at inference time, rather than feeding a constant tensor.
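
In C++ terms, data should go through the model’s runtime input tensor right before inference, rather than being baked into the graph as a constant. A rough sketch (the helper name and sizes are made up for illustration):

```cpp
#include <cstring>

#include "tensorflow/lite/interpreter.h"

// Illustrative helper: copy image data into the model's first (runtime) input
// tensor and run inference. The input stays a runtime tensor, which is what
// the GPU delegate expects.
void RunOnImage(tflite::Interpreter* interpreter, const float* image_data,
                size_t num_floats) {
  float* input = interpreter->typed_input_tensor<float>(0);
  std::memcpy(input, image_data, num_floats * sizeof(float));
  interpreter->Invoke();
}
```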

Also, the same goes for the weights and biases: they are supposed to hold the content learned during training, so they should be read-only constant tensors. If you make them runtime tensors, e.g. just to check whether things work, you will start seeing messages like:

Expected 1 input tensor(s), but node has 2 runtime input(s).

MEAN doesn’t compute the mean of multiple tensors, only the mean within a single tensor. It was added for instance normalization. We probably haven’t studied MEAN closely enough and didn’t anticipate your use case (it may well be a valid thing in TFLite).

I don’t think we ever implemented L2_NORMALIZATION, SUM, or TOPK_V2.

@sirius0503

How did my weights and biases become runtime tensors? Can you explain?

Uh, no idea. I am not really familiar with training and converting; I usually only work with finished TFLite files. I wonder whether you set up the TF graph incorrectly, so that it propagated all the way down to TFLite. Or it could be that there is a bug in TOCO (which I doubt, as everyone would be screaming by now).

Are you saying a GPU implementation?

That is correct. GPU shader implementation 😃

number of inputs

As a hypothetical example, our ADD implementation currently only takes 1 or 2 input tensors. I think plain TFLite can handle more, but with TFLite GPU, if you pass more, it will probably say it can’t handle it.

order of the input tensors

As an example, the input tensors of CONV_2D must be 0: input tensor, 1: weight tensor, and 2: optional bias tensor. You can’t change the order to, e.g., 0: bias tensor, 1: input tensor, 2: weight tensor, and hope that the GPU can figure things out on its own by looking at the tensor dimensions.

@lwu025

The shader code is very sensitive to which tensors are runtime tensors and which are constant weights/biases. You might have changed some constant tensors into runtime tensors so that you can easily inspect tensor content, but that’s not compatible with the GPU backend at this point. From:

```
AVERAGE_POOL_2D: Expected 1 input tensor(s), but node has 0 runtime input(s).
CONV_2D: Expected 1 input tensor(s), but node has 0 runtime input(s).
CONV_2D: Expected 1 input tensor(s), but node has 2 runtime input(s).
CONV_2D: Expected 1 input tensor(s), but node has 3 runtime input(s).
DEPTHWISE_CONV_2D: Expected 1 input tensor(s), but node has 2 runtime input(s).
DEPTHWISE_CONV_2D: Expected 1 input tensor(s), but node has 3 runtime input(s).
RESIZE_BILINEAR: Expected 1 input tensor(s), but node has 2 runtime input(s).
```

shows that there are a bunch of wrongly connected things; e.g., what use is an AVERAGE_POOL_2D or CONV_2D if it has no runtime input tensor?
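
One way to see which connections are wrong is to dump, per node, how many inputs are runtime tensors versus constants. A sketch using only the public C++ Interpreter API (the helper name is made up, and the runtime/constant split here is an approximation of the delegate’s own check):

```cpp
#include <cstdio>

#include "tensorflow/lite/interpreter.h"

// Approximate diagnostic: for every node, count inputs that are constants
// read from the .tflite file (kTfLiteMmapRo, i.e. weights/biases) versus
// runtime tensors produced at inference time. The delegate's "Expected 1
// input tensor(s), but node has N runtime input(s)" messages come from a
// check along these lines.
void DumpRuntimeInputCounts(const tflite::Interpreter& interpreter) {
  for (size_t node_index = 0; node_index < interpreter.nodes_size();
       ++node_index) {
    const auto* node_and_reg =
        interpreter.node_and_registration(static_cast<int>(node_index));
    if (node_and_reg == nullptr) continue;
    const TfLiteNode& node = node_and_reg->first;
    const TfLiteRegistration& reg = node_and_reg->second;
    int runtime_inputs = 0, constant_inputs = 0;
    for (int i = 0; i < node.inputs->size; ++i) {
      const int tensor_index = node.inputs->data[i];
      if (tensor_index < 0) continue;  // optional input that is not present
      const TfLiteTensor* tensor = interpreter.tensor(tensor_index);
      if (tensor->allocation_type == kTfLiteMmapRo) {
        ++constant_inputs;  // weights / biases baked into the .tflite file
      } else {
        ++runtime_inputs;   // produced at inference time
      }
    }
    std::printf("node %zu (builtin op %d): %d runtime / %d constant input(s)\n",
                node_index, reg.builtin_code, runtime_inputs, constant_inputs);
  }
}
```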