tensorflow: Unsupported Full-Integer TensorFlow Lite models in TF 2
Describe the issue
In TF 2, the full-integer quantized models produced by the TFLite converter can only have float input and output types. This is a blocker for users who require int8 or uint8 input and/or output types.
UPDATE: We now support this workflow.
End-to-End Tutorial: https://colab.sandbox.google.com/github/google-coral/tutorials/blob/master/retrain_classification_ptq_tf2.ipynb
Only TFLite Conversion: Convert TF models to TFLite full-integer models. You can refer to the code here, also given below:
import tensorflow as tf
converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
def representative_dataset_gen():
    for _ in range(num_calibration_steps):
        # Get sample input data as a numpy array in a method of your choosing.
        yield [input]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8 # or tf.uint8
converter.inference_output_type = tf.int8 # or tf.uint8
tflite_model = converter.convert()
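For reference, here is a minimal sketch of what a concrete representative dataset might look like, plugging into the converter defined above. The calibration_images.npy file and the sample count are hypothetical placeholders; the only requirement is that the generator yields data shaped and preprocessed like the model's real inputs.
import numpy as np

# Hypothetical calibration data: a float32 array of already-preprocessed samples
# shaped like the model input, e.g. (500, 224, 224, 3).
calibration_images = np.load("calibration_images.npy").astype(np.float32)
num_calibration_steps = 100

def representative_dataset_gen():
    for i in range(num_calibration_steps):
        # Each yielded element is a list with one array per model input,
        # batch dimension included.
        yield [calibration_images[i:i + 1]]

converter.representative_dataset = representative_dataset_gen
A few hundred samples that cover the typical range of real inputs are usually enough for calibration.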
Only TFLite Inference: Run inference on the TFLite model. Note the one caveat with integer-only models: you need to manually map (i.e., quantize) the float inputs to integer inputs during inference, and dequantize the integer outputs back to float. To understand how this is done, refer to the equation provided in the TensorFlow Lite 8-bit quantization specification document and its equivalent Python code below:
import numpy as np
import tensorflow as tf
# Input to the TF model: float values in the range [0, 10] with shape (1, 100)
np.random.seed(0)
tf_input = np.random.uniform(low=0, high=10, size=(1, 100)).astype(np.float32)
# Output of the TF model.
tf_output = keras_model.predict(tf_input)
# Output of the TFLite model.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]
# Manually quantize the input from float to integer
scale, zero_point = input_details['quantization']
tflite_integer_input = tf_input / scale + zero_point
tflite_integer_input = tflite_integer_input.astype(input_details['dtype'])
interpreter.set_tensor(input_details['index'], tflite_integer_input)
interpreter.invoke()
output_details = interpreter.get_output_details()[0]
tflite_integer_output = interpreter.get_tensor(output_details['index'])
# Manually dequantize the output from integer to float
scale, zero_point = output_details['quantization']
tflite_output = tflite_integer_output.astype(np.float32)
tflite_output = (tflite_output - zero_point) * scale
# Verify that the TFLite model's output is approximately the same as the
# TF model's output (expect some loss in accuracy due to quantization).
assert np.allclose(tflite_output, tf_output, atol=1e-04)
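If this quantize/invoke/dequantize pattern is needed repeatedly, it can be wrapped in small helpers. The sketch below is only an illustration that reuses tflite_model and tf_input from above; the helper names are not part of the TFLite API, and the input helper rounds to the nearest integer as the quantization spec describes.
import numpy as np
import tensorflow as tf

def quantize_input(interpreter, float_input):
    # Map a float array to the integer type expected by the model's input.
    details = interpreter.get_input_details()[0]
    scale, zero_point = details['quantization']
    return np.round(float_input / scale + zero_point).astype(details['dtype'])

def dequantize_output(interpreter, integer_output):
    # Map the model's integer output back to float.
    details = interpreter.get_output_details()[0]
    scale, zero_point = details['quantization']
    return (integer_output.astype(np.float32) - zero_point) * scale

interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
input_index = interpreter.get_input_details()[0]['index']
output_index = interpreter.get_output_details()[0]['index']
interpreter.set_tensor(input_index, quantize_input(interpreter, tf_input))
interpreter.invoke()
tflite_output = dequantize_output(interpreter, interpreter.get_tensor(output_index))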
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 21
- Comments: 66 (20 by maintainers)
Update: We now support TensorFlow Lite full-integer models in TF 2.0, i.e., with integer (tf.int8 and tf.uint8) input and output types. Exception: support for quantization-aware trained models is still in progress.
End-to-End Tutorial: https://colab.sandbox.google.com/github/google-coral/tutorials/blob/master/retrain_classification_ptq_tf2.ipynb
Yes, it is currently a work in progress. We will update this GitHub issue once it's completed.
waiting for this issue to be resolved, thanks.
Hello, the issue is resolved now with TF 2.4 (nightly). Let us know if you face any issues. (This issue will remain open until we also fix all the documentation and resolve any issues that may arise in the next few days.)
Note: The following discussion is not related to the current issue of supporting full-integer TensorFlow Lite models, including integer input and output, in TF 2.0.
@dreamPoet No, this is not possible in TensorFlow 2. We cannot create a uint8 inference TFLite model; only int8 inference models are supported. We've moved away from uint8 quantization because with int8 we're able to use asymmetric quantization ranges without paying the same penalty. Refer to https://www.tensorflow.org/lite/performance/quantization_spec for more information.
For more details, see the YouTube video: https://www.youtube.com/watch?v=-jBmqY_aFwE and the slides used in it: https://docs.google.com/presentation/d/1zGm5bqGrkAepwJZ5PABiYjrIKq1pDnzafa8ZYeaFhXY/edit?usp=sharing
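As a concrete illustration of the affine mapping defined in the quantization spec linked above (real_value = (int8_value - zero_point) * scale), here is a small self-contained sketch; the scale and zero-point values are made up for illustration.
import numpy as np

# Affine quantization per the TFLite 8-bit spec:
#   real_value = (quantized_value - zero_point) * scale
scale, zero_point = 0.05, -10  # illustrative parameters, not from a real model
real = np.array([-1.0, 0.0, 0.5, 2.0], dtype=np.float32)

# Quantize: round to nearest and clamp to the int8 range.
quantized = np.clip(np.round(real / scale) + zero_point, -128, 127).astype(np.int8)

# Dequantize back to float; any difference is the quantization error.
recovered = (quantized.astype(np.float32) - zero_point) * scale
print(quantized)   # [-30 -10   0  30]
print(recovered)   # [-1.   0.   0.5  2. ]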
Thank you, that worked. Here's my Colab gist for anyone else who might run into a similar problem: https://colab.research.google.com/gist/sayakpaul/d209ac353d3bcea06287be5e91628f18/scratchpad.ipynb
Remove the line
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
as for QAT you are assuming that the model can be fully quantized.

@sayakpaul Yes, this is perfectly right.
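For context, here is a minimal sketch of the QAT conversion path being discussed, assuming a model wrapped with the TensorFlow Model Optimization Toolkit (tfmot); keras_model, train_images, and train_labels are placeholders, and this is an illustration of the suggestion above rather than an official recipe.
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Wrap an existing float Keras model with fake-quantization ops and fine-tune it.
qat_model = tfmot.quantization.keras.quantize_model(keras_model)
qat_model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
qat_model.fit(train_images, train_labels, epochs=1)

# Convert the QAT model: no representative dataset and no
# target_spec.supported_ops line, following the suggestion above.
converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_qat_model = converter.convert()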
By the way, can we use the converter to create a uint8 inference TFLite model rather than an int8 inference model?