TensorRT: ONNX to TRT Error: Myelin Error in addNodeToMyelinGraph: 0 ... operation not supported within a loop body.
Description
I wrote a custom model in Keras that takes an RGB video (i.e. a 4D tensor) as input and classifies it:
import tensorflow as tf
from tensorflow.keras.layers import Lambda, Concatenate, LSTM, Flatten, Dense

# num_frames, input_shape_frame, dropout_rate, model_name and do_something()
# are defined elsewhere in my code.
list_convolved_frames = []
inputs = tf.keras.Input(shape=(num_frames, *input_shape_frame))
for i in range(num_frames):
    out = inputs[:, i, :, :, :]        # slice out frame i
    out = do_something(out)            # per-frame processing (defined elsewhere)
    out = Lambda(lambda x: tf.keras.backend.expand_dims(x, 1))(out)
    list_convolved_frames.append(out)
convolved_frames = Concatenate(axis=1)(list_convolved_frames)
out = LSTM(64, return_sequences=False, dropout=dropout_rate)(convolved_frames)
out = Flatten()(out)
out = Dense(2, activation='softmax')(out)
model = tf.keras.Model(inputs=inputs, outputs=out, name=model_name)
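The export commands are not included in the post; for context, a typical path for a model like this would be to save it as a SavedModel and run it through tf2onnx and then trtexec, roughly as sketched below (directory names and opset are placeholders, not the exact commands used here):

# Hypothetical export path for context; "saved_model_dir", "model.onnx" and the opset
# are placeholders, not the exact commands from the original post.
model.save("saved_model_dir")  # export the Keras model as a TF SavedModel
# python -m tf2onnx.convert --saved-model saved_model_dir --output model.onnx --opset 11
# trtexec --onnx=model.onnx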
I can successfully convert it to ONNX; however, the conversion to TensorRT fails with:
[02/23/2021-19:13:53] [E] [TRT] ../builder/myelin/codeGenerator.cpp (114) - Myelin Error in addNodeToMyelinGraph: 0 (while/TensorArrayV2Read/TensorListGetItem{StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lstm/zeros_1/Const:0,const_fold_opt__733,__inference_while_cond_45765_532_while/Less,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lstm/PartitionedCall/while/maximum_iterations:0,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lstm/PartitionedCall/time:0,Func/StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lstm/PartitionedCall/input/_43:0,Func/StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lstm/PartitionedCall/input/_44:0,Func/StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lstm/PartitionedCall/input/_45:0,while/add_2/y:0,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lambda_9/ExpandDims,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lambda_8/ExpandDims,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lambda_7/ExpandDims,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lambda_6/ExpandDims,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lambda_5/ExpandDims,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lambda_4/ExpandDims,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lambda_3/ExpandDims,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lambda_2/ExpandDims,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lambda_1/ExpandDims,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lambda/ExpandDims,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/concatenate/concat,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lstm/PartitionedCall/transpose,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lstm/zeros_1/Const:0_0 + (Unnamed Layer* 531) [Shuffle],StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lstm/zeros_1,(Unnamed Layer* 541) [TripLimit],StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lstm/PartitionedCall/while_loop,(Unnamed Layer* 554) [Recurrence],(Unnamed Layer* 556) [Recurrence],(Unnamed Layer* 558) [Recurrence],(Unnamed Layer* 546) [TripLimit],while/add_2,(Unnamed Layer* 565) [Shuffle],while/TensorArrayV2Read/TensorListGetItem,while/MatMul,(Unnamed Layer* 549) [Recurrence],(Unnamed Layer* 550) [Recurrence],while/MatMul_1,while/add,while/BiasAdd,while/split,while/split_1,while/split_2,while/split_3,while/Sigmoid,while/Sigmoid_1,while/Tanh,while/Sigmoid_2,while/mul,while/mul_1,while/add_1,while/Tanh_1,while/mul_2,while/TensorArrayV2Write/TensorListSetItem,(Unnamed Layer* 596) [LoopOutput],StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lstm/PartitionedCall/strided_slice_2,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/lstm/PartitionedCall/strided_slice_2__676 + StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/flatten/Reshape + (Unnamed Layer* 633) [Shuffle],StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/dense/MatMul,(Unnamed Layer* 638) [Shuffle],StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/dense/BiasAdd/ReadVariableOp:0 + (Unnamed Layer* 640) [Shuffle],StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/dense/BiasAdd,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/re_lu_6/Relu,(Unnamed Layer* 649) [Shuffle],StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/dense_1/MatMul,(Unnamed Layer* 654) [Shuffle],StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/dense_1/BiasAdd/ReadVariableOp:0 + (Unnamed Layer* 656) 
[Shuffle],StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/dense_1/BiasAdd,StatefulPartitionedCall/CUSTOM_TD_CNN_MODEL_CLASS_02/dense_1/Softmax} operation not supported within a loop body.)
[02/23/2021-19:13:53] [E] [TRT] ../builder/myelin/codeGenerator.cpp (114) - Myelin Error in addNodeToMyelinGraph: 0 ()
[02/23/2021-19:13:53] [E] Engine creation failed
[02/23/2021-19:13:53] [E] Engine set up failed
Any idea what's going on?
Btw, I highly suspect that my problem is related to: https://github.com/NVIDIA/TensorRT/issues/411
Environment
TensorRT Version: 7.2.1.4
NVIDIA GPU: RTX 2080
NVIDIA Driver Version: 455.23.05
CUDA Version: 11.1
CUDNN Version:
Operating System: Ubuntu 18.04
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version): nvcr.io/nvidia/tensorrt:20.10-py3
Relevant Files
Steps To Reproduce
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 23
@joan126 No, still waiting for the bug resolution from NVIDIA/TensorRT.
Hello @ttyio. I upgraded to TensorRT 8.2.0.6 and still have the same issue with this simple model conversion.
With the following dependencies:
Python 3.6.9
TensorRT 8.2.0.6
TensorFlow 2.4.0
tf2onnx 1.9.1
When I try converting from ONNX to TRT:
trtexec --optShapes='args_0':1x60x8 --onnx=model_dummy.onnx
The log shows
I get the same logs whether I use opset 9 or opset 10.
Here is the ONNX model: model_dummy.onnx.tar.gz
Hi @ttyio, the steps to reproduce the issue are simple. First, here are my dependencies:
JetPack 4.5.1
Python 3.6.9
TensorRT 7.1.3
TensorFlow 2.4.0
tf2onnx 1.8.4
Then the code
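The snippet itself is not reproduced in this export; the following is only a minimal sketch, assuming a simple LSTM+Dense Keras model that matches the 'args_0':1x60x8 shape passed to trtexec, not the original attachment:

import tensorflow as tf

# Minimal sketch only: an assumed reconstruction of the kind of model discussed in
# this thread (LSTM followed by Dense); the unit count is a placeholder.
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(60, 8)),     # (timesteps, features)
    tf.keras.layers.Dense(2, activation='softmax'),
])
model.save("model_dummy")  # SavedModel, converted with tf2onnx afterwards:
# python -m tf2onnx.convert --saved-model model_dummy --output model_dummy.onnx --opset 10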
I use trtexec from TensorRT OSS 7.1.3 to parse/verify the onnx:
trtexec --optShapes='args_0':1x60x8 --onnx=model_dummy.onnx
Which results in
Finally here is the ONNX model:
model_dummy.onnx.gz
@ttyio I have the same issue: converting an LSTM+Dense TF model to TRT triggers a “Myelin Error in addNodeToMyelinGraph: operation not supported within a loop body”. Does the “triaged” status mean that a fix is planned? Thanks.
I’ve also tried to convert my model via TF-TRT, and that does work even though I’m using an LSTM layer. However, TF-TRT has some significant drawbacks, as I’m targeting a Jetson Xavier as the deployment platform.
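For reference, the TF-TRT route is roughly the standard TrtGraphConverterV2 flow; the sketch below uses placeholder paths and is not my exact script:

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Rough sketch of the TF-TRT path; "saved_model_dir" and "saved_model_trt" are placeholders.
converter = trt.TrtGraphConverterV2(input_saved_model_dir="saved_model_dir")
converter.convert()                 # replaces supported subgraphs with TRT engines
converter.save("saved_model_trt")   # ops TF-TRT cannot convert fall back to TensorFlow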
Does anyone know of a working example where a Keras/TensorFlow model with LSTM layers is converted to TensorRT?
I’ve found out that removing the LSTM block from the network allowed me to convert to TensorRT! The question now is: what goes wrong with the LSTM during TensorRT conversion? To my understanding it is supported, right?
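To illustrate what “removing the LSTM block” means here, one option is to swap the recurrent part for a non-recurrent aggregation; the layer choice below is only an assumption, not my exact edit:

from tensorflow.keras.layers import Dense, GlobalAveragePooling1D

# Illustrative sketch: replace the LSTM with a pooling layer over the frame axis.
# convolved_frames comes from the model definition in the original post.
out = GlobalAveragePooling1D()(convolved_frames)
out = Dense(2, activation='softmax')(out)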
This issue may have to do with a warning I got while saving my model during training in Keras: