onnxruntime: [ONNXRuntimeError] TensorRT EP could not build Engine for fused node

Describe the bug After converting the model to ONNX format and running the symbolic_shape_infer.py script (with the fix from #3353), the TRT engine build starts but fails with these errors:

2020-04-01 13:27:24.033733574 [W:onnxruntime:Default, tensorrt_execution_provider.h:35 log] [2020-04-01 12:27:24 ERROR] Layer: (Unnamed Layer* 24)[Select]'s output can not be used as shape tensor.

2020-04-01 13:27:24.033755356 [W:onnxruntime:Default, tensorrt_execution_provider.h:35 log] [2020-04-01 12:27:24 ERROR] Network validation failed.

The errors above appear in the jupyter-notebook terminal, while the notebook itself shows this error: EPFail: [ONNXRuntimeError] : 11 : EP_FAIL : TensorRT EP could not build Engine for fused node: TensorrtExecutionProvider_TRTKernel_6_6.

Can someone help us with resolving this error?

Urgency Urgent

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04
  • ONNX Runtime installed from (source or binary): binary
  • ONNX Runtime version: 1.2.0
  • Python version: 3.6
  • CUDA/cuDNN version: 10.0 /
  • GPU model and memory: GeForce 940MX / 4GB

To Reproduce Model that is optimized and shape inferred can be found here: https://drive.google.com/open?id=1Rc4nXmLGMDmWlx-X_KtIN07FkMuNYyJ_

Expected behavior After successful conversion and shape inference, the TRT engine should build successfully.

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 25 (17 by maintainers)

Most upvoted comments

I implemented a converter for the ‘overpad and slice’ idea from above. It works a lot faster and gets rid of the asymmetric padding problem:

Times:

  • Batchsize 1: 6ms (speedup of 4.5)
  • Batchsize 4: 17ms (speedup of 5.6)

Still losing on bigger batches compared to TF, but the results are reasonable.

This method could be implemented in the CUDA executor on loading, instead of the warning and the CPU switch.
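For readers unfamiliar with the 'overpad and slice' idea, here is a minimal sketch in pure Python. It is not the converter mentioned in this comment (that code is not shown in the thread); the function names are hypothetical, and it assumes a 1-D, stride-1 convolution with zero padding. The idea: pad both sides with the larger of the two pad amounts so the padding is symmetric, then slice off the extra outputs.

```python
def conv1d(x, w, pad_left, pad_right):
    """Stride-1 1-D cross-correlation with explicit zero padding."""
    padded = [0.0] * pad_left + list(x) + [0.0] * pad_right
    k = len(w)
    return [
        sum(padded[i + j] * w[j] for j in range(k))
        for i in range(len(padded) - k + 1)
    ]


def conv1d_overpad_and_slice(x, w, pad_left, pad_right):
    """Emulate asymmetric padding using only symmetric padding.

    Hypothetical helper for illustration: pad both sides by
    max(pad_left, pad_right), run the symmetric convolution, then drop
    the outputs that exist only because of the extra padding.
    """
    p = max(pad_left, pad_right)
    full = conv1d(x, w, p, p)              # symmetric (over-)padded conv
    start = p - pad_left                   # surplus outputs at the front
    stop = len(full) - (p - pad_right)     # surplus outputs at the back
    return full[start:stop]


if __name__ == "__main__":
    x, w = [1, 2, 3, 4], [1, 0, -1]
    # Asymmetric padding (0 left, 1 right) versus overpad-and-slice.
    print(conv1d(x, w, 0, 1))
    print(conv1d_overpad_and_slice(x, w, 0, 1))
```

Both calls produce the same values. With a stride greater than 1 the surplus on the left must also line up with the stride, so this sketch would need extra care there.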

Regarding TRT, I ran your model and saw the same issue you posted. Thanks for raising it. I will dig further into it as soon as I can.

Hi @stevenlix, @jywu-msft,

do you have any update with the TRT conversion?

Sorry for the delay. We're in the midst of a release, so we have been very busy. I will sync with @stevenlix to see if we can find some spare cycles to take a closer look. The asymmetric padding/fallback to CPU with onnxruntime CUDA has been a longstanding problem. My understanding was that it was a limitation of cuDNN. It would be good if we can address it with TRT (or take a look at how Tensorflow-GPU is handling it and do something similar).
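To make 'asymmetric padding' concrete: the ONNX Conv `pads` attribute lists all begin values first and then all end values ([x1_begin, x2_begin, ..., x1_end, x2_end, ...]). The sketch below checks whether such a list is asymmetric. The helper names are hypothetical and not part of onnxruntime, and the sketch says nothing about how the CUDA provider actually decides to fall back to CPU.

```python
def split_pads(pads):
    """Split an ONNX-style pads list into (begins, ends)."""
    if len(pads) % 2 != 0:
        raise ValueError("pads must hold one begin and one end per axis")
    half = len(pads) // 2
    return list(pads[:half]), list(pads[half:])


def is_asymmetric(pads):
    """True if any axis has a different begin pad than end pad."""
    begins, ends = split_pads(pads)
    return begins != ends


if __name__ == "__main__":
    # 2-D conv padded only at the bottom/right: asymmetric.
    print(is_asymmetric([0, 0, 1, 1]))
    # 2-D conv padded by 1 on every side: symmetric.
    print(is_asymmetric([1, 1, 1, 1]))
```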

I got the same error on (https://github.com/tensorflow/models/blob/v1.13.0/research/object_detection/models/ssd_mobilenet_v2_feature_extractor.py)

onnx-runtime-trt was built from master.

Looking forward to any update on that.

Furthermore, I got a lot of warnings on startup, which was not the case with version 1.2, e.g.:

2020-04-22 12:12:28.078811862 [W:onnxruntime:, graph.cc:2422 CleanUnusedInitializers] Removing initializer 'const_slice__618'. It is not used by any node and should be removed from the model.
2020-04-22 12:12:28.078820662 [W:onnxruntime:, graph.cc:2422 CleanUnusedInitializers] Removing initializer 'Preprocessor/mul/x:0'. It is not used by any node and should be removed from the model.
2020-04-22 12:12:28.078831417 [W:onnxruntime:, graph.cc:2422 CleanUnusedInitializers] Removing initializer 'ConvBnFusion_BN_B_BoxPredictor_5/BoxEncodingPredictor_depthwise/BatchNorm/beta/read/_72__cf__72:0_139'. It is not used by any node and should be removed from the model.
2020-04-22 12:12:28.078841055 [W:onnxruntime:, graph.cc:2422 CleanUnusedInitializers] Removing initializer 'ConvBnFusion_W_const_fold_opt__947_148'. It is not used by any node and should be removed from the model.
2020-04-22 12:12:28.081366735 [W:onnxruntime:Default, tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:28 WARNING] Tensor DataType is determined at build time for tensors not marked as input or output.
2020-04-22 12:12:28.081385383 [W:onnxruntime:Default, tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:28 WARNING] Calling isShapeTensor before the entire network is constructed may result in an inaccurate result.
2020-04-22 12:12:28.081399211 [W:onnxruntime:Default, tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:28 WARNING] Calling isShapeTensor before the entire network is constructed may result in an inaccurate result.
2020-04-22 12:12:28.081573111 [W:onnxruntime:, graph.cc:2422 CleanUnusedInitializers] Removing initializer 'const_fold_opt__971'. It is not used by any node and should be removed from the model.
2020-04-22 12:12:28.081585333 [W:onnxruntime:, graph.cc:2422 CleanUnusedInitializers] Removing initializer 'Postprocessor/BatchMultiClassNonMaxSuppression/PadOrClipBoxList/sub_5/x:0'. It is not used by any node and should be removed from the model.
2020-04-22 12:12:28.081593714 [W:onnxruntime:, graph.cc:2422 CleanUnusedInitializers] Removing initializer 'const_fold_opt__928'. It is not used by any node and should be removed from the model.
2020-04-22 12:12:28.081602793 [W:onnxruntime:, graph.cc:2422 CleanUnusedInitializers] Removing initializer 'largest_int_val__809'. It is not used by any node and should be removed from the model.
2020-04-22 12:12:28.081610546 [W:onnxruntime:, graph.cc:2422 CleanUnusedInitializers] Removing initializer 'const__737'. It is not used by any node and should be removed from the model.
2020-04-22 12:12:28.081619974 [W:onnxruntime:, graph.cc:2422 CleanUnusedInitializers] Removing initializer 'const_slice__785'. It is not used by any node and should be removed from the model.
2020-04-22 12:12:28.081630101 [W:onnxruntime:, graph.cc:2422 CleanUnusedInitializers] Removing initializer 'Postprocessor/BatchMultiClassNonMaxSuppression/PadOrClipBoxList/Select_1/e:0'. It is not used by any node and should be removed from the model.
2020-04-22 12:12:28.081639948 [W:onnxruntime:, graph.cc:2422 CleanUnusedInitializers] Removing initializer 'Postprocessor/BatchMultiClassNonMaxSuppression/PadOrClipBoxList/zeros_6/_423__cf__423:0'. It is not used by any node and should be removed from the model.
2020-04-22 12:12:28.081646234 [W:onnxruntime:, graph.cc:2422 CleanUnusedInitializers] Removing initializer 'const_slice__697'. It is not used by any node and should be removed from the model.
2020-04-22 12:12:28.081656081 [W:onnxruntime:, graph.cc:2422 CleanUnusedInitializers] Removing initializer 'FeatureExtractor/MobilenetV2/expanded_conv_2/depthwise/Relu6_min__79'. It is not used by any node and should be removed from the model.
2020-04-22 12:12:47.796120281 [W:onnxruntime:Default, tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:47 WARNING] Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
2020-04-22 12:12:48.192992803 [W:onnxruntime:Default, tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:48 WARNING] Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
2020-04-22 12:12:48.277763158 [W:onnxruntime:Default, tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:48 WARNING] Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
2020-04-22 12:12:48.695730598 [W:onnxruntime:Default, tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:48 WARNING] Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
2020-04-22 12:12:49.084085549 [W:onnxruntime:Default, tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:49 WARNING] Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
2020-04-22 12:12:50.605779798 [W:onnxruntime:Default, tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:50 WARNING] Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
2020-04-22 12:12:50.668077020 [W:onnxruntime:Default, tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:50 WARNING] Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
2020-04-22 12:12:50.774508204 [W:onnxruntime:Default, tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:50 WARNING] Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
2020-04-22 12:12:51.121557838 [W:onnxruntime:Default, tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:51 WARNING] Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
2020-04-22 12:12:51.122854830 [W:onnxruntime:Default, tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:51 WARNING] Tensor DataType is determined at build time for tensors not marked as input or output.
2020-04-22 12:12:51.122938498 [W:onnxruntime:Default, tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:51 ERROR] Layer: (Unnamed Layer* 24)[Select]'s output can not be used as shape tensor.
2020-04-22 12:12:51.122960637 [W:onnxruntime:Default, tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:51 ERROR] Network validation failed.
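With this much noise, the two TensorRT ERROR entries are easy to miss. A small stdlib-only sketch for pulling them out of a captured log might look like the following. The regular expressions are assumptions based only on the line format visible in this thread, and the function name is hypothetical; nothing here is an onnxruntime API.

```python
import re

# Each entry starts with an onnxruntime timestamp such as
# "2020-04-22 12:12:51.122938498 [". Splitting on a lookahead also works
# when several entries were pasted onto one line.
ENTRY_START = re.compile(r"(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ \[)")

# Entries forwarded from TensorRT carry a second bracket with TensorRT's
# own timestamp and severity, e.g. "[2020-04-22 12:12:51 ERROR]".
TRT_ENTRY = re.compile(
    r"tensorrt_execution_provider\.h:\d+ log\] "
    r"\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\s+(?P<level>[A-Z]+)\] "
    r"(?P<message>.*)",
    re.DOTALL,
)


def tensorrt_messages(log_text, level="ERROR"):
    """Return TensorRT messages of the given severity, in log order."""
    messages = []
    for entry in ENTRY_START.split(log_text):
        match = TRT_ENTRY.search(entry)
        if match and match.group("level") == level:
            # Collapse whitespace so entries wrapped across lines read cleanly.
            messages.append(" ".join(match.group("message").split()))
    return messages


if __name__ == "__main__":
    sample = (
        "2020-04-22 12:12:51.122854830 [W:onnxruntime:Default, "
        "tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:51 WARNING] "
        "Tensor DataType is determined at build time for tensors not marked "
        "as input or output. "
        "2020-04-22 12:12:51.122960637 [W:onnxruntime:Default, "
        "tensorrt_execution_provider.h:36 log] [2020-04-22 12:12:51 ERROR] "
        "Network validation failed."
    )
    for message in tensorrt_messages(sample):
        print(message)
```

Passing level="WARNING" returns the TensorRT warnings instead, while the graph.cc CleanUnusedInitializers entries are skipped either way because they do not come from the TensorRT logger.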