edgetpu: Quantize node not being converted
Hi,
I am trying to convert a simple Keras model from TF 2.0.
I followed the “Full integer quantization of weights and activations” tutorial from the docs:
def representative_dataset_gen():
    for i in range(100):
        yield [x_train[i, None]]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()
The conversion goes well and results in the following graph in Netron:

The edgetpu compilation is successful but states that the quantize nodes will be mapped to the CPU.
Edge TPU Compiler version 2.0.267685300
Input: /mnt/localtmp/model.tflite
Output: /mnt/localtmp/model_edgetpu.tflite
Operator Count Status
SOFTMAX 1 Mapped to Edge TPU
FULLY_CONNECTED 1 Mapped to Edge TPU
QUANTIZE 1 Operation is otherwise supported, but not mapped due to some unspecified limitation
CONV_2D 4 Mapped to Edge TPU
DEQUANTIZE 1 Operation is working on an unsupported data type
However, if I try to run this model in Python, it either throws "Cannot cast array data from dtype('float32') to dtype('uint8') according to the rule 'safe'" for float input, or segfaults for uint8 input.
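For reference, here is a minimal sketch of how I would expect to drive the fully quantized model with the TFLite interpreter, manually quantizing the float input with the scale and zero point reported for the input tensor (the model path is a placeholder, and x_train is the same array used for the representative dataset):

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

x = x_train[0, None].astype(np.float32)

# If the input tensor is uint8, quantize the float data first: q = x / scale + zero_point
if input_details["dtype"] == np.uint8:
    scale, zero_point = input_details["quantization"]
    x = np.round(x / scale + zero_point).astype(np.uint8)

interpreter.set_tensor(input_details["index"], x)
interpreter.invoke()
y = interpreter.get_tensor(output_details["index"])

# Dequantize the output if it is uint8 as well: y = scale * (q - zero_point)
if output_details["dtype"] == np.uint8:
    scale, zero_point = output_details["quantization"]
    y = scale * (y.astype(np.float32) - zero_point)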
I see that none of the test models have a quantize node; instead they have quantization information on their input nodes (as seen in Netron): quantization: -1 ≤ 0.0078125 * (q - 128) ≤ 0.9921875.
Am I doing something wrong?
Is the edgetpu_compiler compatible with post-training quantization from TF 2.0?
@mrharicot, for now a possible workaround for this issue is to use tf.compat.v1.lite.TFLiteConverter.from_keras_model_file(). I had success compiling models built with the TF 2.0 Keras API to run fully on the Edge TPU with INT8 input and output.
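For anyone else hitting this, a rough sketch of that workaround could look as follows (it assumes the Keras model is first saved to an .h5 file; the file names are placeholders, and the rest mirrors the converter settings from the original post):

model.save("model.h5")

converter = tf.compat.v1.lite.TFLiteConverter.from_keras_model_file("model.h5")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model = converter.convert()
with open("model_quant.tflite", "wb") as f:
    f.write(tflite_model)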
Nice, it’s because of *.compat.v1.*. Thanks!

@Namburger Gave it a quick try, the model compiles without a problem with float i/o (Edge TPU Compiler version 2.1.302470888), which makes the compat.v1 workaround obsolete. Also managed to run this model with tflite_runtime-2.1.0 and it works perfectly. Try it, I’ve prepared a demo.

@xadrianzetx Thanks! I can’t believe that simply using the v1 compat function but keeping the v2 semantics worked. I had tried one or the other, not mixing the two 😄
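In case it helps anyone reproduce that, a minimal sketch of running the compiled model through tflite_runtime with the Edge TPU delegate (this assumes the Edge TPU runtime library libedgetpu.so.1 is installed; the model path is a placeholder):

import numpy as np
import tflite_runtime.interpreter as tflite

# The delegate routes the supported ops to the Edge TPU.
interpreter = tflite.Interpreter(
    model_path="model_edgetpu.tflite",
    experimental_delegates=[tflite.load_delegate("libedgetpu.so.1")])
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# Dummy input of the expected shape and dtype, just to exercise the pipeline.
x = np.zeros(input_details["shape"], dtype=input_details["dtype"])
interpreter.set_tensor(input_details["index"], x)
interpreter.invoke()
print(interpreter.get_tensor(output_details["index"]))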
@DLMasterCat
Don’t mind the nightly version of tf, I had to use it at the time, since the ResizeNearestNeighbor op (used in tf.keras.layers.UpSampling2D) was missing from the 2.0 release. Since then, the ops set has been updated. As for your question, the Edge TPU docs state that unsigned int I/O is required, but I tested it out with int8 and, to my surprise, MobileNetV2 compiles without a problem. I did not test it on the Edge TPU though, so I would not be surprised if the model outputs did not make any sense.
@mrharicot With the release of TF 2.0, the attributes inference_input_type and inference_output_type have been removed (https://www.tensorflow.org/lite/convert/python_api). It seems tflite models can only take floats and convert them to uint8s internally.

@Namburger Maybe you can provide further details. Personally, I run into similar problems when using TF 2.0. It seems that your functions (especially RunInference in model_utils.cc) expect uint8 data, but as soon as they try to set the input tensors, an error is thrown because the model expects float data types.
If this is a known problem, I could create a pull request that addresses this issue.
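A quick way to check what a given converted model actually expects at its interface (float32 vs. uint8) is to inspect the input tensor details, e.g. (the model path is a placeholder):

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")
details = interpreter.get_input_details()[0]
# dtype shows whether the interface is float32 or uint8;
# quantization holds the (scale, zero_point) pair for that tensor.
print(details["dtype"], details["quantization"])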