TensorRT: [REFERENCE] KeyError: 'mrcnn_mask_bn4/batchnorm/mul_1' in running sampleUffMaskRCNN demo
While trying to run the Mask R-CNN demo following this page, I hit the error below.
Environment: Ubuntu 16.04.6, CUDA 10.1.168, TensorRT 5.1.5.0, UFF 0.6.3
Traceback (most recent call last):
File "mrcnn_to_trt_single.py", line 165, in <module>
main()
File "mrcnn_to_trt_single.py", line 123, in main
text=True, list_nodes=list_nodes)
File "mrcnn_to_trt_single.py", line 158, in convert_model
debug_mode = False
File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/conversion_helpers.py", line 233, in from_tensorflow_frozen_model
return from_tensorflow(graphdef, output_nodes, preprocessor, **kwargs)
File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/conversion_helpers.py", line 108, in from_tensorflow
pre.preprocess(dynamic_graph)
File "./config.py", line 123, in preprocess
connect(dynamic_graph, timedistributed_connect_pairs)
File "./config.py", line 113, in connect
if node_a_name not in dynamic_graph.node_map[node_b_name].input:
KeyError: 'mrcnn_mask_bn4/batchnorm/mul_1'
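A quick way to see why this KeyError fires is to check whether the node named in the error exists in the frozen graph at all. A minimal debugging sketch, assuming the frozen .pb that mrcnn_to_trt_single.py feeds to the UFF converter has been saved to disk (the path below is hypothetical):

```bash
# Debugging sketch (not from the original report): list the nodes in the frozen
# graph whose names contain "mrcnn_mask_bn4". If the list is empty, the node that
# config.py tries to connect simply does not exist in this graph.
# "frozen_model.pb" is a hypothetical path to the frozen graph.
python3 -c "
import graphsurgeon as gs
graph = gs.DynamicGraph('frozen_model.pb')
print([name for name in graph.node_map if 'mrcnn_mask_bn4' in name])
"
```

A likely explanation, given the Keras comment further down in this thread, is that newer Keras versions export the batch-norm ops under different node names, so the name config.py expects is not in the graph.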
About this issue
- State: closed
- Created 5 years ago
- Comments: 36
Okay @doomb007, I was able to run the sample.
Possible solutions when using CUDA 10.1:
Solution 1
Use the `nvcr.io/nvidia/tensorflow:19.10-py3` container; it has TensorFlow 1.14 built for CUDA 10.1, unlike the current pip packages, which aren't working as mentioned in #132. Then, inside the container:
sampleUffMaskRCNN-specific steps (a hedged sketch of the container step follows this list):
- Copying the *.ppm data to the host
- Successful UFF parsing output
- Successful sample run output
- Running the sample
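A rough sketch of how the container step for Solution 1 might look; the mount path, working directory, and script arguments here are assumptions for illustration, not taken verbatim from the original comment:

```bash
# Start the NGC TensorFlow container, which ships TensorFlow 1.14 built for CUDA 10.1.
# The host directory being mounted is an arbitrary example.
docker run --runtime=nvidia -it --rm \
    -v $HOME/maskrcnn:/workspace/maskrcnn \
    nvcr.io/nvidia/tensorflow:19.10-py3

# Inside the container, the conversion would then be run roughly as in the
# sampleUffMaskRCNN README (paths to the weights and output are hypothetical):
#   python3 mrcnn_to_trt_single.py -w mask_rcnn_coco.h5 -o mrcnn_nchw.uff -p ./config.py
```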
Solution 2
Build TensorFlow from source for CUDA 10.1: https://github.com/tensorflow/tensorflow/issues/26150#issuecomment-506807444
(I didn’t test this)
Solution 3
If using CUDA 10.1, I think downgrading to CUDA 10.0 first and then using the pip package:
pip install tensorflow-gpu==1.14
should work, until Google releases pip package binaries for CUDA 10.1. See the note in the README: https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/sampleUffMaskRCNN#known-issues
(I didn’t test this)
Maybe your Keras version is not 2.1.3. I downloaded the latest Keras 2.3.x and got the same error as you. After changing the Keras version to 2.1.3, everything is OK.
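For reference, pinning Keras with pip is straightforward; a minimal sketch, assuming a pip-managed Python environment:

```bash
# Install the Keras version the sample's config.py expects.
pip3 install keras==2.1.3

# Confirm the installed version; this should print 2.1.3.
python3 -c "import keras; print(keras.__version__)"
```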
@rmccorm4 Sorry for the late reply; I just finished my weekend. I'm testing the Mask R-CNN sample now and will let you know if there is any progress.
Hi @doomb007,
All of the samples are already built and placed into `$TRT_RELEASE/bin` from the start. I'll see if I can fix the README for that sample, because the part you referenced seems misleading.
I've updated my comment above (https://github.com/NVIDIA/TensorRT/issues/123#issuecomment-551269792) to cover the entire pipeline.
Note that we're using the TensorFlow container only because of the CUDA 10.1 restriction. Using CUDA 10.0 shouldn't be as cumbersome, and once there is a pip package for TensorFlow 1.14 compatible with CUDA 10.1, it should also be easier.
But since we're using the TensorFlow container above, it doesn't come with the `$TRT_RELEASE/data/faster-rcnn` data that's mentioned in the sample, so I also added an expandable section on how to grab that data from the TensorRT container and copy it over (see the sketch below).
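A minimal sketch of what that copy step could look like; the TensorRT image tag and the path inside the container are assumptions, not taken from the original expandable section:

```bash
# Hedged sketch: pull the *.ppm test images out of a TensorRT container without
# starting it, then remove the temporary container. Image tag and path are assumed.
docker create --name trt_tmp nvcr.io/nvidia/tensorrt:19.10-py3
docker cp trt_tmp:/workspace/tensorrt/data/faster-rcnn ./faster-rcnn
docker rm trt_tmp
```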
@rmccorm4 Yes, it works. The Mask R-CNN model was converted to UFF format.
A tip: after installing the TensorRT package following the tutorial, don't arbitrarily move internal folders and files out of the main folder; doing so can also cause an error like the one above.
Hi @doomb007, @guods
I was able to reproduce your error and hit the same KeyError as above.
However, this happens when you don't apply the patch to the Mask_RCNN repo as mentioned in the sample's instructions.
In case applying the patch fails with an error from git about your config, you can set your actual config or some dummy config; it doesn't matter.
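A minimal sketch of such a dummy config; the name and email below are placeholders, not values from the original comment:

```bash
# Set a (dummy) git identity so that applying the patch (e.g. via git am) succeeds.
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
```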
However, now I’m hitting the same error as mentioned here: https://github.com/NVIDIA/TensorRT/issues/132#issuecomment-549188744
I think it’s because I have CUDA 10.1 on my host and there is still some incompatibility with TensorFlow at the moment. Looking into it now.
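To confirm that kind of mismatch, comparing the host CUDA version with what the installed TensorFlow wheel was built against is usually enough; a minimal sketch:

```bash
# Host CUDA toolkit and driver versions.
nvcc --version
nvidia-smi

# Installed TensorFlow wheel; the official tensorflow-gpu 1.14 wheel targets CUDA 10.0.
pip3 show tensorflow-gpu
```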