tensorflow: After post-training quantization -- Internal error: Cannot allocate memory for the interpreter

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: No
  • TensorFlow installed from (source or binary): tf-nightly-1.12.0.dev20180926
  • TensorFlow version (use command below): tf-nightly-1.12.0.dev20180926
  • Python version: 3.6
  • Bazel version (if compiling from source): N/A
  • GCC/Compiler version (if compiling from source): N/A
  • CUDA/cuDNN version: N/A
  • GPU model and memory: N/A
  • Exact command to reproduce: see below

Describe the problem

When I tried to use the quantized model with the example TensorFlow for Poets 2: TFLite Android, I encountered the following problem:

    beginning of crash
    09-26 13:08:10.249 31824-31824/? E/AndroidRuntime: FATAL EXCEPTION: main
    Process: android.example.com.tflitecamerademo, PID: 31824
    java.lang.RuntimeException: Unable to start activity ComponentInfo{android.example.com.tflitecamerademo/com.example.android.tflitecamerademo.CameraActivity}: java.lang.NullPointerException: Internal error: Cannot allocate memory for the interpreter: tensorflow/contrib/lite/kernels/conv.cc:201 filter->type != data_type (3 != 1)Node 4 failed to prepare.

    at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2665)
    at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:2726)
    at android.app.ActivityThread.-wrap12(ActivityThread.java)
    at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1477)
    at android.os.Handler.dispatchMessage(Handler.java:102)
    at android.os.Looper.loop(Looper.java:154)
    at android.app.ActivityThread.main(ActivityThread.java:6119)
    at java.lang.reflect.Method.invoke(Native Method)
    at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:886)
    at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:776)
 Caused by: java.lang.NullPointerException: Internal error: Cannot allocate memory for the interpreter: tensorflow/contrib/lite/kernels/conv.cc:201 filter->type != data_type (3 != 1)Node 4 failed to prepare.

    at org.tensorflow.lite.NativeInterpreterWrapper.createInterpreter(Native Method)
    at org.tensorflow.lite.NativeInterpreterWrapper.<init>(NativeInterpreterWrapper.java:75)
    at org.tensorflow.lite.NativeInterpreterWrapper.<init>(NativeInterpreterWrapper.java:54)
    at org.tensorflow.lite.Interpreter.<init>(Interpreter.java:114)
    at com.example.android.tflitecamerademo.ImageClassifier.<init>(ImageClassifier.java:97)
    at com.example.android.tflitecamerademo.Camera2BasicFragment.onActivityCreated(Camera2BasicFragment.java:299)
    at android.app.Fragment.performActivityCreated(Fragment.java:2362)
    at android.app.FragmentManagerImpl.moveToState(FragmentManager.java:1014)
    at android.app.FragmentManagerImpl.moveToState(FragmentManager.java:1171)
    at android.app.BackStackRecord.run(BackStackRecord.java:816)
    at android.app.FragmentManagerImpl.execPendingActions(FragmentManager.java:1578)
    at android.app.FragmentController.execPendingActions(FragmentController.java:371)
    at android.app.Activity.performStart(Activity.java:6695)
    at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2628)
    at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:2726) 
    at android.app.ActivityThread.-wrap12(ActivityThread.java) 
    at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1477) 
    at android.os.Handler.dispatchMessage(Handler.java:102) 
    at android.os.Looper.loop(Looper.java:154) 
    at android.app.ActivityThread.main(ActivityThread.java:6119) 
    at java.lang.reflect.Method.invoke(Native Method) 
    at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:886) 
    at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:776)

The commands were as follows:

import tensorflow as tf
graph_def_file = 'retrained_graph.pb'
input_arrays = ["input"]
output_arrays = ["final_result"]
converter = tf.contrib.lite.TocoConverter.from_frozen_graph(graph_def_file, input_arrays, output_arrays)
converter.post_training_quantize = True  # enable weights-only post-training quantization
tflite_quantized_model = converter.convert()
open("quantized_graph.tflite", "wb").write(tflite_quantized_model)

However, if post-training quantization is not enabled, the app works.

The quantized model does work in a Python environment (though the results are not accurate):

interpreter = tf.contrib.lite.Interpreter(model_path="quantized_graph.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
from label_image import read_tensor_from_image_file
input_data = read_tensor_from_image_file('1.jpg',input_height=224,input_width=224, input_mean=128, input_std=128)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)
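
A quick sanity check, continuing from the snippet above: weights-only post-training quantization keeps the model's input and output tensors in float32, which can be confirmed from the details dicts:

import numpy as np

# Continues from the snippet above: a weights-only quantized model still
# expects and produces float32 tensors.
assert input_details[0]['dtype'] == np.float32
assert output_details[0]['dtype'] == np.float32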

Source code / logs

Models are attached. tf_files.zip

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 23 (3 by maintainers)

Most upvoted comments

@minizon I finally figured out what your environment/setup is. To put it simply, the TF Lite runtime used by the TensorFlow for Poets 2: TFLite Android example at this time doesn’t support post-training quantization. Quick solution: build your own tensorflow-lite AAR from more recent source code.

More details:

  1. Yes, the app you used uses a floating-point model, not a quantized uint8 one. I thought you were trying the tflitecamera demo in the TF source code rather than the one in TensorFlow for Poets 2.
  2. Post-training quantization quantizes the weights of convolutions; that is, it is weights-only quantization. Weights are quantized from float32 to uint8, but computation still happens in floating point, so the weights must be dequantized before computing (see the sketch after this list). The code for this dequantization was added relatively late, so it is not in the AAR you used.
  3. For full quantization / quantization-aware training, see this doc.
  4. TensorFlow 1.11.0 has since been released, so maybe there will be an updated AAR soon. Maybe @wt-huang can help?
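
A minimal numpy sketch of the dequantize-before-compute behavior from point 2; the filter values and quantization parameters below are made up for illustration:

import numpy as np

# Pretend these uint8 values are a quantized conv filter, with hypothetical
# quantization parameters (scale, zero_point).
w_uint8 = np.array([0, 64, 128, 255], dtype=np.uint8)
scale, zero_point = 0.02, 128

# The runtime recovers float32 weights before the convolution runs;
# activations and the conv arithmetic itself stay in floating point.
w_float32 = scale * (w_uint8.astype(np.float32) - zero_point)
print(w_float32)  # [-2.56 -1.28  0.    2.54]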

@freedomtan What I did was simply replace retrained_graph.lite with quantized_graph.lite in the assets folder (here); the classifier for the TF interpreter was not changed (here). I think this demo is still for the floating-point model, but the error (Internal error: Cannot allocate memory for the interpreter: tensorflow/contrib/lite/kernels/conv.cc:201 filter->type != data_type (3 != 1)Node 4 failed to prepare) did happen. Or do you mean that even after post-training quantization, the model is still not suitable for Android deployment?

Anyway, I couldn’t find a clear guide on how to turn a floating-point model into a truly quantized one. The official model zoo provides some pre-converted models (here) with no details on the conversion process.

@minizon Thank you for your advice. The code is simple, as follows.

import tensorflow as tf
tf.enable_eager_execution()
converter = tf.contrib.lite.TocoConverter.from_frozen_graph('resnet_v1_frozen.pb', ['input'], ['output'])
tflite_model = converter.convert()
open("converted_resnet.tflite", "wb").write(tflite_model)

converter.post_training_quantize = True
tflite_model = converter.convert()
open("quantized_resnet.tflite", "wb").write(tflite_model)

My model is based on ResNet V1, but without BN layers. The .pb model reaches an accuracy of 88%, while the .tflite model gets nearly none. Am I doing something wrong in the conversion process, or does it have something to do with my model? Any advice would be appreciated.
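
One way to localize such an accuracy drop is to run the float and quantized models on the same input and compare the outputs, reusing the Interpreter API shown earlier; a rough sketch (the file names match the snippet above, and the input shape is a guess for a 224x224 RGB model):

import numpy as np
import tensorflow as tf

def run_tflite(path, input_data):
    # Load a .tflite model and run a single inference.
    interpreter = tf.contrib.lite.Interpreter(model_path=path)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]
    interpreter.set_tensor(inp['index'], input_data)
    interpreter.invoke()
    return interpreter.get_tensor(out['index'])

x = np.random.rand(1, 224, 224, 3).astype(np.float32)  # hypothetical input shape
float_out = run_tflite('converted_resnet.tflite', x)
quant_out = run_tflite('quantized_resnet.tflite', x)
print(np.max(np.abs(float_out - quant_out)))  # a large gap implicates the conversion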

@damhurmuller

If you use the toco command-line tool to generate post-training-quantized models, make sure --inference_type and --inference_input_type are both FLOAT rather than QUANTIZED_UINT8. If you want a fully quantized model, go for the quantization-aware training flow (a rough sketch follows).
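
A rough sketch of that fully quantized conversion path using the TF 1.12-era Python API; the file name, tensor names, and (mean, std_dev) stats below are placeholders and must match your own fake-quant-trained graph and input preprocessing:

import tensorflow as tf

# Assumes a frozen graph trained with fake-quantization ops (tf.contrib.quantize);
# all names below are placeholders.
converter = tf.contrib.lite.TocoConverter.from_frozen_graph(
    'quant_aware_frozen.pb', ['input'], ['final_result'])
converter.inference_type = tf.contrib.lite.constants.QUANTIZED_UINT8
converter.quantized_input_stats = {'input': (128, 128)}  # (mean, std_dev) of training inputs
tflite_model = converter.convert()
open('fully_quantized.tflite', 'wb').write(tflite_model)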

Also, you might need to build the TensorFlow Lite AAR from source, because none of the versions available online supported this feature when I tried them. I used this command:

bazel build --cxxopt='--std=c++11' -c opt        \
--fat_apk_cpu=x86,x86_64,arm64-v8a,armeabi-v7a   \
//tensorflow/lite/java:tensorflow-lite

Make sure to edit the WORKSPACE file so that Bazel knows where the Android SDK and NDK are, roughly as sketched below.
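
The WORKSPACE entries for that look something like the following Starlark; these are Bazel's standard android_sdk_repository / android_ndk_repository rules, and the paths are machine-specific placeholders:

android_sdk_repository(
    name = "androidsdk",
    path = "/path/to/Android/Sdk",  # adjust to your SDK install
)

android_ndk_repository(
    name = "androidndk",
    path = "/path/to/android-ndk",  # adjust to your NDK install
)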