edgetpu: Quantize node not being converted

Hi,

I am trying to convert a simple Keras model from TF 2.0.

I followed the “Full integer quantization of weights and activations” tutorial from the docs:

import tensorflow as tf

# x_train holds the (float32) training data used to calibrate the quantization
def representative_dataset_gen():
    for i in range(100):
        yield [x_train[i, None]]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()

The conversion goes well and results in the following graph in Netron (model.tflite attached).

The Edge TPU compilation succeeds, but the compiler reports that the quantize/dequantize nodes stay on the CPU:

Edge TPU Compiler version 2.0.267685300
Input: /mnt/localtmp/model.tflite
Output: /mnt/localtmp/model_edgetpu.tflite

Operator                       Count      Status

SOFTMAX                        1          Mapped to Edge TPU
FULLY_CONNECTED                1          Mapped to Edge TPU
QUANTIZE                       1          Operation is otherwise supported, but not mapped due to some unspecified limitation
CONV_2D                        4          Mapped to Edge TPU
DEQUANTIZE                     1          Operation is working on an unsupported data type

However, if I try to run this model in Python it either throws "Cannot cast array data from dtype('float32') to dtype('uint8') according to the rule 'safe'" for float input, or segfaults for uint8 input.
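For reference, this is roughly how I run it (a minimal sketch rather than my exact script; "model.tflite" is the converted file from above). Inspecting input_details shows which dtype the interpreter actually expects:

import numpy as np
import tensorflow as tf

# Minimal sketch: load the converted model and check the expected input dtype
# before setting the tensor.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]
print("expected input dtype:", input_details["dtype"])

# Feeding an array of the "wrong" dtype is what triggers the cast error
# (float32 input) or the segfault (uint8 input) described above.
sample = np.zeros(input_details["shape"], dtype=input_details["dtype"])
interpreter.set_tensor(input_details["index"], sample)
interpreter.invoke()
print(interpreter.get_tensor(output_details["index"]))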

I see that none of the test models have a quantize node; instead they carry quantization information on their input tensors (visible in Netron), e.g. quantization: -1 ≤ 0.0078125 * (q - 128) ≤ 0.9921875.
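That annotation is just the affine quantization relation real = scale * (q - zero_point), here with scale = 0.0078125 (= 1/128) and zero_point = 128, so the uint8 range 0..255 covers roughly [-1, 1). A quick check of that mapping (values taken from the annotation above, nothing model-specific):

scale, zero_point = 0.0078125, 128

def dequantize(q):
    # real value represented by the uint8 code q
    return scale * (q - zero_point)

def quantize(real):
    # nearest uint8 code for a real value, clamped to [0, 255]
    return int(min(255, max(0, round(real / scale + zero_point))))

print(dequantize(0))    # -1.0       (lower bound in the annotation)
print(dequantize(255))  #  0.9921875 (upper bound in the annotation)
print(quantize(0.5))    #  192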

Am I doing something wrong? Is the edgetpu_compiler compatible with post-training quantization from TF 2.0?

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Reactions: 5
  • Comments: 18

Most upvoted comments

@mrharicot, for now a possible workaround for this issue is to use tf.compat.v1.lite.TFLiteConverter.from_keras_model_file().

I had success compiling models built with the TF 2.0 Keras API to run fully on the Edge TPU with INT8 input and output.
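A sketch of that workaround, assuming the Keras model is first saved to an .h5 file (the file names are placeholders; the quantization attributes are the same v2-style ones from the snippet in the issue, which this thread reports also work on the v1 converter):

import tensorflow as tf

model.save("model.h5")  # Keras model built with the TF 2.0 API

converter = tf.compat.v1.lite.TFLiteConverter.from_keras_model_file("model.h5")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model = converter.convert()
with open("model_quant.tflite", "wb") as f:
    f.write(tflite_model)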

Nice, it’s because of *.compat.v1.* Thanks!

@Namburger

Gave it a quick try; the model compiles without a problem with float I/O (Edge TPU Compiler version 2.1.302470888), which makes the compat.v1 workaround obsolete. I also managed to run this model with tflite_runtime 2.1.0 and it works perfectly. Try it, I’ve prepared a demo.

@xadrianzetx Thanks! I can’t believe that simply using the v1 compat function but keeping the v2 semantics worked. I had tried one or the other, not mixing the two 😄

@DLMasterCat

Don’t mind the nightly version of TF; I had to use it at the time, since the ResizeNearestNeighbor op (used by tf.keras.layers.UpSampling2D) was missing from the 2.0 release. The supported ops set has since been updated.

As for your question, the Edge TPU docs state that unsigned int I/O is required, but I tested it with int8 and, to my surprise, MobileNetV2 compiled without a problem. I did not test it on an Edge TPU though, so I would not be surprised if the model outputs did not make sense.

@mrharicot With the release of TF 2.0, the attributes inference_input_type and inference_output_type have been removed (https://www.tensorflow.org/lite/convert/python_api). It seems TFLite models can only take floats and convert them to uint8 internally.

@Namburger Maybe you can provide further details. Personally, I run into similar problems when using TF 2.0. It seems that your functions (especially RunInference in model_utils.cc) expect uint8 data, but as soon as they try to set the input tensors, an error is thrown because the model expects float data types.

If this is a known problem, I could create a pull request that addresses this issue.
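For what it’s worth, until the converter emits true uint8 I/O, the mismatch can be bridged on the caller’s side by checking the input tensor’s dtype and rescaling the uint8 data into the float range the model was calibrated on. This is a hedged sketch, not how model_utils.cc actually works, and the [0, 1] input range is an assumption:

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

def run_inference(uint8_image):
    # uint8_image: a single image (H x W x C) without the batch dimension
    if inp["dtype"] == np.uint8:
        data = uint8_image
    else:
        # Model kept float I/O: rescale uint8 pixels to the float range used
        # during calibration (assumed [0, 1] here).
        data = uint8_image.astype(np.float32) / 255.0
    interpreter.set_tensor(inp["index"], data[np.newaxis, ...])
    interpreter.invoke()
    return interpreter.get_tensor(out["index"])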