model-optimization: Layer up_sampling2d_36 is not supported. You can quantize this layer by passing a `tfmot.quantization.keras.QuantizeConfig` instance to the `quantize_annotate_layer` API.

Describe the bug I am trying to use quantize_model() to optimize a UNet model that contains an UpSampling2D layer, and quantizing this layer is not supported by tensorflow_model_optimization right now.

System information macOS Catalina, version 10.15.2

TensorFlow installed from (source or binary): binary

TensorFlow version: 2.1.0

TensorFlow Model Optimization version: 0.3.0

Python version: 3.7.4

Describe the expected behavior Successfully quantize the UpSampling2D layer.

Describe the current behavior There is currently no support for quantizing the UpSampling2D layer.

Code to reproduce the issue The following is the bare minimum necessary to reproduce the problem.

from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Dropout, UpSampling2D, concatenate
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
import tensorflow_model_optimization as tfmot

# dice_coef_loss and dice_coef are user-defined loss/metric functions (definitions omitted).

def unet(pretrained_weights = None, input_size = (256,256,1)):

    inputs = Input(shape=input_size)
    conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(inputs)
    conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv1)

    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool1)
    conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv2)

    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool2)
    conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv3)

    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
    conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool3)
    conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv4)
    drop4 = Dropout(0.5)(conv4)

    pool4 = MaxPooling2D(pool_size=(2, 2))(drop4)
    conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool4)
    conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv5)
    drop5 = Dropout(0.5)(conv5)

    up6 = Conv2D(512, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(drop5))
    merge6 = concatenate([drop4,up6], axis = 3)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge6)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv6)

    up7 = Conv2D(256, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv6))
    merge7 = concatenate([conv3,up7], axis = 3)
    conv7 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge7)
    conv7 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv7)

    up8 = Conv2D(128, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv7))
    merge8 = concatenate([conv2,up8], axis = 3)
    conv8 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge8)
    conv8 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv8)

    up9 = Conv2D(64, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv8))
    merge9 = concatenate([conv1,up9], axis = 3)
    conv9 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge9)
    conv9 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv9)
    conv9 = Conv2D(2, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv9)
    conv10 = Conv2D(1, 1, activation = 'sigmoid')(conv9)

    model = Model(inputs = inputs, outputs = conv10)
    model.compile(optimizer = Adam(lr = 1e-4), loss = dice_coef_loss,metrics = [dice_coef])
    
    #model.summary()

    if(pretrained_weights):
        model.load_weights(pretrained_weights)

    return model

model = unet()
quantize_model = tfmot.quantization.keras.quantize_model
q_aware_model = quantize_model(model)

Additional context

The error is produced by this file at line 372.

RuntimeError: Layer up_sampling2d_40:<class 'tensorflow.python.keras.layers.convolutional.UpSampling2D'> is not supported. You can quantize this layer by passing a `tfmot.quantization.keras.QuantizeConfig` instance to the `quantize_annotate_layer` API.

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Reactions: 8
  • Comments: 38 (6 by maintainers)

Most upvoted comments

I ran into the same problem.

I would also like to quantize the UpSampling2D layer. @nutsiepully Any updates on this?

I am facing exactly the same issue today while trying to implement UNet for the Coral Edge TPU.

Following are the findings:

Any help in this regard will be greatly appreciated.

I’ve run into this problem as well. I believe this is the only layer stopping QAT for my detection model. Any updates?

Did anyone find a solution for this? I have the same problem.

I ran into the same problem.

@nutsiepully @ianholing I found another way to work around the quantized up-sampling problem. In my case, I switched from UpSampling2D() to tf.image.resize(), because the former had a runtime issue with the resulting FP32 TFLite model using the TFLite GPU Delegate. Regardless, since none of these up-sampling ops (UpSampling2D(), tf.image.resize(), Conv2DTranspose()) seem to be supported by QAT, this workaround may work for all of the above:

import tensorflow as tf
import tensorflow_model_optimization as tfmot

def annotate(layer):
    if layer._name.startswith('tf_op_layer_ResizeBilinear'):
        return layer   # pass thru; don't quantize tf.image.resize()
    # quantize everything else
    return tfmot.quantization.keras.quantize_annotate_layer(layer)

annotated_model = tf.keras.models.clone_model(model, clone_function=annotate)

quantized_model = tfmot.quantization.keras.quantize_apply(annotated_model)
quantized_model.summary()

However, this workaround is sub-optimal. Yes, this allows me to create a QAT INT8 TFLite model. But because the up-sampling operation is not quantized, if I run this TFLite model on a Qualcomm SOC DSP using the Hexagon DSP Delegate, everything after the up-sampling operation (“RESIZE_BILINEAR” in my case below, inside a BiFPN) falls back to running on CPU. Excerpt from a benchmark:

msm8996:/data/local/tmp # taskset 0f  ./benchmark_model \
     --use_hexagon=true \
     --input_layer=images \
     --input_layer_shape=1,224,224,1 \
     --graph=test-bifpn-qat-int8.tflite
...
...
Number of nodes executed: 46
============================== Summary by node type ==============================
	             [Node type]	  [count]	  [avg ms]	    [avg %]	    [cdf %]	  [mem KB]	[times called]
	   TfLiteHexagonDelegate	        4	    15.281	    91.811%	    91.811%	     0.000	        4
	         RESIZE_BILINEAR	        6	     0.530	     3.184%	    94.995%	     0.000	        6
	       DEPTHWISE_CONV_2D	        7	     0.337	     2.025%	    97.020%	     0.000	        7
	                     ADD	       12	     0.308	     1.851%	    98.870%	     0.000	       12
	                QUANTIZE	        4	     0.064	     0.385%	    99.255%	     0.000	        4
	              DEQUANTIZE	       10	     0.064	     0.385%	    99.640%	     0.000	       10
	             MAX_POOL_2D	        3	     0.060	     0.360%	   100.000%	     0.000	        3

Timings (microseconds): count=59 first=16822 curr=16583 min=16495 max=16833 avg=16662.9 std=78
Memory (bytes): count=0
46 nodes observed

So my question is: What is the current recommendation for a quantization friendly up-sampling operation (e.g. in a BiFPN, UNet, etc) for TF2 Keras?

Thanks!

Finally I got it working with a workaround:

import tensorflow as tf
import tensorflow_model_optimization as tfmot
from tensorflow.keras.layers import UpSampling2D
from tensorflow_model_optimization.python.core.quantization.keras import quantize_config

###   Clone NoOpQuantizeConfig locally   ###
class NoOpQuantizeConfig(quantize_config.QuantizeConfig):
    """QuantizeConfig which does not quantize any part of the layer."""
    def get_weights_and_quantizers(self, layer):
        return []
    def get_activations_and_quantizers(self, layer):
        return []
    def set_quantize_weights(self, layer, quantize_weights):
        pass
    def set_quantize_activations(self, layer, quantize_activations):
        pass
    def get_output_quantizers(self, layer):
        return []
    def get_config(self):
        return {}

quantize_config = NoOpQuantizeConfig()

...

###   Wrap your upsampling layer with this   ###
u = tfmot.quantization.keras.quantize_annotate_layer(
                UpSampling2D(size=size_divider), quantize_config=quantize_config) (layer_input)

...

###     And then just compile and use   ###
with tf.keras.utils.custom_object_scope({'NoOpQuantizeConfig': NoOpQuantizeConfig}):
            quantized_model = tfmot.quantization.keras.quantize_model(generate_model())
            quantized_model.summary()

In my case it is a functional Model, but I'm pretty sure it also works with Sequential models. If everything goes right, you will see every layer in your summary with the "quant_" prefix.

Hi @Mohit-Ak @Craftsman381,

Thanks for the great feedback. We haven't added support for UpSampling2D and Conv2DTranspose yet. We are ramping up support for layers based on feedback from users, so thank you for that. Most of these layers are quite simple and haven't been added only because conversion support is missing.

UpSampling2D doesn't actually need any QAT handling. You can try passing a NoOpQuantizeConfig to it as the QuantizeConfig, and training should just work. Let us know if the conversion has issues.

As for Conv2DTranspose, we still have to add support. But we can now prioritize it with this feedback. Thanks!

@Xhark I am still having issues with the ResizeBilinear layer produced by tf.image.resize or tf.keras.layers.Resizing. Are these layers going to be addressed too? I checked the PR and the code and did not find anything that handles them.

Hi @holokai-ai , @EscVM,

Thanks for the detailed feedback and report.

We pulled out UpSampling2D support prior to launch since we hadn't tested it end to end (including conversion) and wanted to ensure numerical accuracy before releasing it. In the meantime, the best way to work around it is the approach both of you have used so far: providing a custom QuantizeConfig for the layer.

If you provide a config which is correct, and conversion is supported, then the code should work and the TFLite model should just run.

For nearest neighbor interpolation, NoOpQuantizeConfig is the correct solution. And the patch above which uncomments the line is enough. For resize bilinear, there needs to be additional work to match the TFL kernel implementation correctly.
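
Concretely, a minimal sketch of the nearest-neighbor case (reusing the NoOpQuantizeConfig class from the workaround earlier in this thread; x is a placeholder for the incoming tensor):

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# UpSampling2D defaults to nearest-neighbor interpolation, which needs no extra quantization handling.
x = tfmot.quantization.keras.quantize_annotate_layer(
        tf.keras.layers.UpSampling2D(size=(2, 2), interpolation='nearest'),
        quantize_config=NoOpQuantizeConfig())(x)

The annotated model is then passed through quantize_model() (or quantize_annotate_model() plus quantize_apply()) inside a custom_object_scope, exactly as in the workaround above.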

I’ll be taking some time out to work on it. If you are interested in contributing a patch, I’m happy to help 😃

@holokai-ai, it's not clear to me why you are having trouble on your hardware. I was able to run a QAT model with UpSampling2D locally with the TFLite interpreter running on CPU. Perhaps the delegate code isn't handling ResizeBilinear correctly.

@limbtio We now support TransposeConv: https://github.com/tensorflow/model-optimization/commit/4f4e5d02db5424d5cfcb822a8965023bf8d2011e

But it also requires an up-to-date TFLite kernel: https://github.com/tensorflow/tensorflow/commit/f98a42d0a09cb71a9d2e6f36a4cbc29389ee5483

Please report a bug if you have any problem with it.

@Craftsman381: FYI, if your goal is to eventually run UpSampling2D in TFLite, the TFLite converter may not be able to convert it yet. You'll need to see if someone has filed an issue on github.com/tensorflow/tensorflow and request it there.

In general, the comments from testQuantizeSingleLayer_ProducesFullIntegerModel_TF2 in this test file suggest why these layers have not been added yet in the Keras QAT tool. UpSampling2D is under the comment "# Not done since TFLite converter doesn't support in TF2 yet", though things may have changed since then, so it's worth doing a quick test first.
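
For reference, a quick conversion test can be as small as the following sketch (the tiny model and shapes are just placeholders):

import tensorflow as tf

# Hypothetical minimal model containing only an UpSampling2D layer.
inp = tf.keras.Input(shape=(32, 32, 3))
out = tf.keras.layers.UpSampling2D(size=(2, 2))(inp)
tiny_model = tf.keras.Model(inp, out)

converter = tf.lite.TFLiteConverter.from_keras_model(tiny_model)
try:
    tflite_model = converter.convert()
    print('UpSampling2D converted, %d bytes' % len(tflite_model))
except Exception as e:  # the converter raises if the op is unsupported
    print('Conversion failed:', e)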

Thanks for the great discussion, everyone! This issue has been helpful for my debugging!

As for Conv2DTranspose, we still have to add support. But we can now prioritize it with this feedback. Thanks!

@nutsiepully , Conv2DTranspose would be super useful to me as well (for the same reasons as stated above by @Mohit-Ak). Is there an ETA for adding QAT support? I believe it is already possible to quantize with TFLite and it is supported on EdgeTPU. I would be happy to help with this, but I would need some guidance. Would it be helpful to make a separate feature request issue for Conv2DTranspose QAT support?

The conversion works, but the Edge TPU compiler says that the model is not quantized:

@EscVM, this issue discusses setting the input/output type when converting to an integer-quantized TFLite model. While it appears that it doesn't currently support QAT, this comment on converting the input/output to int8 or uint8 might help solve that: https://github.com/tensorflow/model-optimization/issues/431#issuecomment-646109974.
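
For reference, the pattern discussed in that comment looks roughly like the sketch below (q_aware_model is the quantization-aware model from the repro above, the representative dataset generator is a placeholder to be replaced with real input samples, and exact behavior depends on the TF version):

import numpy as np
import tensorflow as tf

def representative_dataset():
    # Placeholder: yield a handful of real preprocessed inputs here.
    for _ in range(100):
        yield [np.random.rand(1, 256, 256, 1).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# The Edge TPU compiler also wants integer input/output tensors.
converter.inference_input_type = tf.uint8   # or tf.int8
converter.inference_output_type = tf.uint8  # or tf.int8
tflite_model = converter.convert()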

@holokai-ai

The problem is that if you need to use the Edge TPU compiler, tf.image.resize() is not compatible 😃

That’s precisely why I posed the question in this thread to @alanchiao and @nutsiepully

For those of us who need to do edge inference on hardware that requires integer-only ops like EdgeTPU and Hexagon DSP, how can we map to quantization-friendly TFLite up-sampling ops?

Clearly Google has a way to do this (e.g. with the quantized ResizeBilinear in DeepLab v3 segmentation model running on EdgeTPU).

@holokai-ai

How did you wrap tf.image.resize() in the Keras model? Thanks

@EscVM – I’m not wrapping tf.image.resize() at the moment; I’m simply calling it intermixed with Keras layers, in a functional style.
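
In other words, something along these lines (layer sizes are illustrative only):

import tensorflow as tf

inp = tf.keras.Input(shape=(64, 64, 32))
x = tf.keras.layers.Conv2D(32, 3, padding='same', activation='relu')(inp)
# Raw TF op mixed into the functional graph; TF 2.1 auto-wraps it as a
# TensorFlowOpLayer named 'tf_op_layer_ResizeBilinear...', which is what the
# annotate() filter earlier in this thread skips. Newer TF versions may need
# an explicit Lambda or Resizing layer instead.
x = tf.image.resize(x, size=(128, 128), method='bilinear')
x = tf.keras.layers.Conv2D(32, 3, padding='same', activation='relu')(x)
model = tf.keras.Model(inp, x)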

Inspecting Google's quantized EdgeTPU DeepLab v3 segmentation model that @Mohit-Ak mentioned in this thread, it looks like the model has a custom MobileNet v2 backbone plus a few ResizeBilinear ops to up-scale the tensors, among other things. Diagram below.

How does one use TFMOT to map to quantization-friendly up-scaling ops like the ones used in this EdgeTPU model?

Thanks

[Diagram: quantized EdgeTPU DeepLab v3 model with a MobileNet v2 backbone and ResizeBilinear up-scaling ops]