tensorflow: Missing TRANSPOSE Op Kernel

System information

OS Platform and Distribution: Linux Ubuntu 20.04
TensorFlow installed from (source or binary): binary
TensorFlow version (or github SHA if from source): tf-nightly==2.4.0.dev20200917

I’m attempting to use a TFLite-converted model, which was created and trained using TF2 + Keras. The converter successfully created the TFLite file, and I’ve loaded it into a microcontroller app as a FlatBuffer .cpp + .h file pair.

I’m unable to share the model at this time due to confidentiality. However, the model contains Conv2D, BatchNormalization, ReLU, MaxPooling2D, Permute, Dropout, Flatten, Dense and Softmax layers.

After conversion, the model is loaded into an Arduino sketch, but upon loading the model, an error is reported.

8 bytes lost due to alignment. To avoid this loss, please make sure the tensor_arena is 16 bytes aligned.
Didn't find op for builtin opcode 'TRANSPOSE' version '2'
Failed to get registration from op code TRANSPOSE
Failed starting model allocation.

Given that this operation was chosen from the Builtin operation set, I believe this is a fault/bug. Can you please advise?

type registration

const tflite::Model* model = nullptr;
tflite::MicroInterpreter* interpreter = nullptr;
TfLiteTensor* input = nullptr;
TfLiteTensor* output = nullptr;
alignas(16) uint8_t tensor_arena[kTensorArenaSize];
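
The first line of the log above concerns arena alignment rather than the missing op. The following is a minimal sketch of the arithmetic behind such a message, assuming the allocator rounds the start address of the arena up to the next 16-byte boundary and reports the bytes it skipped. `AlignUp` and `BytesLostToAlignment` are hypothetical helper names used for illustration; they are not TFLite Micro API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: round an address up to the next multiple of
// `alignment`. The alignment must be a non-zero power of two.
inline std::uintptr_t AlignUp(std::uintptr_t address, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
  return (address + mask) & ~mask;
}

// Hypothetical helper: bytes skipped at the start of a buffer so that the
// usable memory begins on an aligned address.
inline std::size_t BytesLostToAlignment(std::uintptr_t address,
                                        std::size_t alignment) {
  return static_cast<std::size_t>(AlignUp(address, alignment) - address);
}
```

Under that assumption, an arena whose first byte sits 8 bytes past a 16-byte boundary (for example at address 0x20000008) would lose 8 bytes, while an arena that already starts on a 16-byte boundary would lose none.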

AllOps missing TRANSPOSE

static tflite::AllOpsResolver resolver;  // NO TRANSPOSE kernel registration

MicroMutableOpResolver missing TRANSPOSE registration

tflite::MicroMutableOpResolver<6> resolver;
resolver.AddConv2D();    
resolver.AddDepthwiseConv2D();
resolver.AddFullyConnected();
resolver.AddReshape();
resolver.AddSoftmax();
resolver.AddBuiltin(tflite::BuiltinOperator_MAX_POOL_2D,
                    tflite::ops::micro::Register_MAX_POOL_2D());
// BuiltinOperator_TRANSPOSE exists, but no Register_TRANSPOSE exists to pair it with:
// resolver.AddBuiltin(tflite::BuiltinOperator_TRANSPOSE,
//                     tflite::ops::micro::Register_TRANSPOSE());
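
The "Didn't find op" error is what one would expect if the interpreter looks up every operator the model uses in the resolver and one lookup fails. The following toy sketch models that lookup with a fixed-capacity table, as the template argument above suggests. `ToyOp`, `ToyOpResolver` and `FirstUnresolved` are hypothetical names invented for illustration; this is not the TFLite Micro implementation or its API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical opcode enum for illustration only; it is not the generated
// tflite::BuiltinOperator enum.
enum class ToyOp { kConv2D, kMaxPool2D, kFullyConnected, kSoftmax, kTranspose };

// Toy fixed-capacity resolver: only ops added here can be resolved later.
template <std::size_t kCapacity>
class ToyOpResolver {
 public:
  // Registers an op. Returns false when the table is already full.
  bool Add(ToyOp op) {
    if (count_ == kCapacity) return false;
    ops_[count_++] = op;
    return true;
  }

  // Returns true only if `op` was registered beforehand.
  bool Has(ToyOp op) const {
    for (std::size_t i = 0; i < count_; ++i) {
      if (ops_[i] == op) return true;
    }
    return false;
  }

 private:
  std::array<ToyOp, kCapacity> ops_{};
  std::size_t count_ = 0;
};

// Returns the index of the first op in the model that the resolver lacks,
// or `n` when every op resolves.
template <std::size_t kCapacity>
std::size_t FirstUnresolved(const ToyOpResolver<kCapacity>& resolver,
                            const ToyOp* model_ops, std::size_t n) {
  assert(model_ops != nullptr || n == 0);
  for (std::size_t i = 0; i < n; ++i) {
    if (!resolver.Has(model_ops[i])) return i;
  }
  return n;
}
```

In this toy model, a graph containing `kTranspose` fails at that op whenever the resolver was built without it, regardless of whether the opcode exists in the enum. That mirrors the situation described here, where the builtin opcode is defined but no micro kernel can be registered for it.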

Standalone code to reproduce the issue

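import logging
import os

import tensorflow as tf


# Method excerpt: belongs to a class that holds self.model, self.x_data and self.y_data.
# features_path is accepted but not used by this method.
def model_to_tflite(self, features_path=None, tflite_path=None):
    '''Converts a Keras model into a TFLite model'''
    assert self.model is not None, 'TFLite conversion requires the model be loaded'
    assert self.x_data is not None and self.y_data is not None, 'Sample data must be loaded'
    assert tflite_path is not None, 'An output path for the TFLite file is required'

    if os.path.exists(tflite_path):
        logging.warning(f'TFLite file already exists and will be overwritten: {tflite_path}')

    logging.info(f'Found {len(self.x_data)} features')

    # Representative dataset: one sample per step, used to calibrate quantization ranges
    def representative_dataset():
        for i in range(len(self.x_data)):
            yield [self.x_data[i:i+1, :, :, :]]

    # Construction of a TFLite converter from the in-memory Keras model
    converter = tf.lite.TFLiteConverter.from_keras_model(self.model)
    converter.representative_dataset = representative_dataset

    # OPTIMIZE_FOR_LATENCY is deprecated and behaves like tf.lite.Optimize.DEFAULT
    converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_LATENCY]
    # Restrict the converter to int8 builtin ops
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]

    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8

    tflite_model = converter.convert()
    # Write the flatbuffer and close the file handle deterministically
    with open(tflite_path, 'wb') as f:
        bytes_written = f.write(tflite_model)

    return bytes_written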

Any other info / logs

/home/ian/Documents/source/acdnet_pipeline/venv/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py:2289: UserWarning: `Model.state_updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
  warnings.warn('`Model.state_updates` will be removed in a future version. '
2020-09-23 14:34:59.654363: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
/home/ian/Documents/source/acdnet_pipeline/venv/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py:1376: UserWarning: `layer.updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
  warnings.warn('`layer.updates` will be removed in a future version. '
2020-09-23 14:35:02.599832: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-23 14:35:02.600152: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2020-09-23 14:35:02.600313: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2020-09-23 14:35:02.600650: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2020-09-23 14:35:02.600775: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-23 14:35:02.601273: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 10 deviceMemorySize: 5.94GiB deviceMemoryBandwidth: 178.99GiB/s
2020-09-23 14:35:02.601398: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2020-09-23 14:35:02.601464: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory
2020-09-23 14:35:02.601493: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-23 14:35:02.601521: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-23 14:35:02.601531: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-23 14:35:02.601622: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory
2020-09-23 14:35:02.601715: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
2020-09-23 14:35:02.601745: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-09-23 14:35:02.894277: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-23 14:35:02.894345: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]      0 
2020-09-23 14:35:02.894368: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0:   N 
2020-09-23 14:35:02.919186: I tensorflow/core/platform/profile_utils/cpu_utils.cc:108] CPU Frequency: 2599990000 Hz
2020-09-23 14:35:02.982735: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:872] Optimization results for grappler item: graph_to_optimize
  function_optimizer: function_optimizer did nothing. time = 5.462ms.
  function_optimizer: function_optimizer did nothing. time = 0.003ms.

2020-09-23 14:35:03.203503: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:315] Ignored output_format.
2020-09-23 14:35:03.203549: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:318] Ignored drop_control_dependency.
2020-09-23 14:35:03.469971: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2020-09-23 14:35:03.470367: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-23 14:35:03.471405: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 10 deviceMemorySize: 5.94GiB deviceMemoryBandwidth: 178.99GiB/s
2020-09-23 14:35:03.471730: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2020-09-23 14:35:03.471928: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory
2020-09-23 14:35:03.471981: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-23 14:35:03.472028: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-23 14:35:03.472067: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-23 14:35:03.472238: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory
2020-09-23 14:35:03.472414: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
2020-09-23 14:35:03.472450: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-09-23 14:35:03.472494: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-23 14:35:03.472519: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]      0 
2020-09-23 14:35:03.472540: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0:   N

About this issue

State: closed
Created 4 years ago
Reactions: 2
Comments: 17 (6 by maintainers)

Most upvoted comments

The most stable implementation of Transpose is at https://github.com/victorromeo/tensorflow/tree/v2.3.1_transpose

I’m going to rebase this onto master, then create a pull request for the team to consider.

victorromeo on Nov 18, 2020

This is an issue of a missing C++ micro kernel operation, not a core kernel operation, and as such it is not appropriate for a Python Colab gist. As mentioned earlier, I’m working on this as a custom C++ operation. Does Colab support C++11 with Bazel compilation?

A Keras TF2 model, when converted for TFLite Micro, contains a TRANSPOSE operation. TRANSPOSE already exists as a builtin TFLite operation, but TFLite Micro does not yet provide a kernel for it, so it is simply not available on an ARM microcontroller.

victorromeo on Oct 5, 2020
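
For readers unfamiliar with the op, the following is a minimal reference sketch of what a TRANSPOSE kernel computes on a flat buffer. It is not the implementation from the branch linked above, nor TFLite Micro code. `TransposeReference` and its parameters are hypothetical names, and the sketch assumes a row-major layout and a `perm` that is a valid permutation of the dimension indices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference transpose over a flat row-major buffer (illustrative only).
// Output dimension i takes its extent from input dimension perm[i].
template <typename T>
std::vector<T> TransposeReference(const std::vector<T>& input,
                                  const std::vector<std::size_t>& shape,
                                  const std::vector<std::size_t>& perm,
                                  std::vector<std::size_t>* out_shape) {
  const std::size_t rank = shape.size();
  assert(perm.size() == rank);

  // Row-major strides of the input tensor.
  std::vector<std::size_t> in_stride(rank, 1);
  for (std::size_t d = rank; d-- > 1;) {
    in_stride[d - 1] = in_stride[d] * shape[d];
  }

  // Output shape and total element count.
  out_shape->resize(rank);
  std::size_t total = 1;
  for (std::size_t i = 0; i < rank; ++i) {
    assert(perm[i] < rank);
    (*out_shape)[i] = shape[perm[i]];
    total *= shape[i];
  }
  assert(input.size() == total);

  std::vector<T> output(total);
  std::vector<std::size_t> index(rank, 0);  // current output coordinate
  for (std::size_t o = 0; o < total; ++o) {
    // Map the output coordinate back to an offset in the input buffer.
    std::size_t in_offset = 0;
    for (std::size_t i = 0; i < rank; ++i) {
      in_offset += index[i] * in_stride[perm[i]];
    }
    output[o] = input[in_offset];

    // Advance the output coordinate in row-major order, odometer style.
    for (std::size_t i = rank; i-- > 0;) {
      if (++index[i] < (*out_shape)[i]) break;
      index[i] = 0;
    }
  }
  return output;
}
```

With a 2x3 `int8_t` input `{1, 2, 3, 4, 5, 6}` and `perm = {1, 0}`, the call yields shape `{3, 2}` and data `{1, 4, 2, 5, 3, 6}`. The op only moves elements, so a sketch like this uses the same loop for float and int8 data.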