mmdeploy: [Bug] the performance of Mask_rcnn tensorrt-fp16_dynamic-320x320-1344x1344 is bad

Checklist

  • I have searched related issues but cannot get the expected help.
  • I have read the FAQ documentation but cannot get the expected help.
  • The bug has not been fixed in the latest version.

Describe the bug

Average Precision is low. In https://github.com/onnx/models/tree/main/vision/object_detection_segmentation/mask-rcnn the reported mAP is 0.33x, but our test result is the following:

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.196
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.394
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.173
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.009
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.160
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.466
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.278
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.278
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.278
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.027
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.253
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.586

Reproduction

python ./tools/test.py configs/mmdet/instance-seg/instance-seg_tensorrt-fp16_dynamic-320x320-1344x1344.py /data/azuryl/mmdetection_2.27.0/configs/mask_rcnn/mask_rcnn_r50_fpn_2x_coco.py --model /data/azuryl/mmdeploy_model/maskrcnn_f16_d320_1344/end2end.engine --metrics segm --device cuda:0

Environment

2023-07-26 02:09:00,097 - mmdeploy - INFO - 

2023-07-26 02:09:00,098 - mmdeploy - INFO - **********Environmental information**********
2023-07-26 02:09:07,526 - mmdeploy - INFO - sys.platform: linux
2023-07-26 02:09:07,528 - mmdeploy - INFO - Python: 3.6.9 | packaged by conda-forge | (default, Mar  6 2020, 18:58:41) [GCC 7.3.0]
2023-07-26 02:09:07,528 - mmdeploy - INFO - CUDA available: True
2023-07-26 02:09:07,528 - mmdeploy - INFO - GPU 0: Xavier
2023-07-26 02:09:07,529 - mmdeploy - INFO - CUDA_HOME: /usr/local/cuda-10.2
2023-07-26 02:09:07,529 - mmdeploy - INFO - NVCC: Cuda compilation tools, release 10.2, V10.2.89
2023-07-26 02:09:07,529 - mmdeploy - INFO - GCC: gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
2023-07-26 02:09:07,529 - mmdeploy - INFO - PyTorch: 1.10.0
2023-07-26 02:09:07,530 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 7.5
  - C++ Version: 201402
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: NO AVX
  - CUDA Runtime 10.2
  - NVCC architecture flags: -gencode;arch=compute_53,code=sm_53;-gencode;arch=compute_62,code=sm_62;-gencode;arch=compute_72,code=sm_72
  - CuDNN 8.0
  - Build settings: BLAS_INFO=open, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=8.0.0, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -DMISSING_ARM_VST1 -DMISSING_ARM_VLD1 -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=open, TORCH_VERSION=1.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=ON, USE_NCCL=0, USE_NNPACK=ON, USE_OPENMP=ON, 

2023-07-26 02:09:07,530 - mmdeploy - INFO - TorchVision: 0.11.1
2023-07-26 02:09:07,530 - mmdeploy - INFO - OpenCV: 4.8.0
2023-07-26 02:09:07,530 - mmdeploy - INFO - MMCV: 1.3.17
2023-07-26 02:09:07,531 - mmdeploy - INFO - MMCV Compiler: GCC 7.5
2023-07-26 02:09:07,531 - mmdeploy - INFO - MMCV CUDA Compiler: 10.2
2023-07-26 02:09:07,531 - mmdeploy - INFO - MMDeploy: 0.7.0+83b11bc
2023-07-26 02:09:07,531 - mmdeploy - INFO - 

2023-07-26 02:09:07,531 - mmdeploy - INFO - **********Backend information**********
2023-07-26 02:09:10,734 - mmdeploy - INFO - onnxruntime: None	ops_is_avaliable : False
2023-07-26 02:09:10,896 - mmdeploy - INFO - tensorrt: 7.1.3.0	ops_is_avaliable : True
2023-07-26 02:09:10,979 - mmdeploy - INFO - ncnn: None	ops_is_avaliable : False
2023-07-26 02:09:10,984 - mmdeploy - INFO - pplnn_is_avaliable: False
2023-07-26 02:09:10,989 - mmdeploy - INFO - openvino_is_avaliable: False
2023-07-26 02:09:11,078 - mmdeploy - INFO - snpe_is_available: False
2023-07-26 02:09:11,079 - mmdeploy - INFO - 

2023-07-26 02:09:11,079 - mmdeploy - INFO - **********Codebase information**********
2023-07-26 02:09:11,087 - mmdeploy - INFO - mmdet:	2.27.0
2023-07-26 02:09:11,088 - mmdeploy - INFO - mmseg:	None
2023-07-26 02:09:11,088 - mmdeploy - INFO - mmcls:	None
2023-07-26 02:09:11,088 - mmdeploy - INFO - mmocr:	None
2023-07-26 02:09:11,088 - mmdeploy - INFO - mmedit:	None
2023-07-26 02:09:11,089 - mmdeploy - INFO - mmdet3d:	None
2023-07-26 02:09:11,089 - mmdeploy - INFO - mmpose:	None
2023-07-26 02:09:11,089 - mmdeploy - INFO - mmrotate:	None

Error traceback

python ./tools/test.py configs/mmdet/instance-seg/instance-seg_tensorrt-fp16_dynamic-320x320-1344x1344.py /data/azuryl/mmdetection_2.27.0/configs/mask_rcnn/mask_rcnn_r50_fpn_2x_coco.py --model /data/azuryl/mmdeploy_model/maskrcnn_f16_d320_1344/end2end.engine --metrics segm --device cuda:0
/data/azuryl/mmdetection_2.27.0/mmdet/datasets/utils.py:70: UserWarning: "ImageToTensor" pipeline is replaced by "DefaultFormatBundle" for batch inference. It is recommended to manually replace it in the test data pipeline in your config file.
'data pipeline in your config file.', UserWarning)
loading annotations into memory...
Done (t=2.09s)
creating index...
index created!
2021-07-23 02:54:25,221 - mmdeploy - INFO - Successfully loaded tensorrt plugins from /data/azuryl/mmdeploy_0.7.0/mmdeploy/lib/libmmdeploy_tensorrt_ops.so
2021-07-23 02:54:25,222 - mmdeploy - INFO - Successfully loaded tensorrt plugins from /data/azuryl/mmdeploy_0.7.0/mmdeploy/lib/libmmdeploy_tensorrt_ops.so
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 4952/4952, 4.0 task/s, elapsed: 1242s, ETA: 0s
Evaluating segm...
/data/azuryl/mmdetection_2.27.0/mmdet/datasets/coco.py:474: UserWarning: The key "bbox" is deleted for more accurate mask AP of small/medium/large instances since v2.12.0. This does not change the overall mAP calculation.
UserWarning)
Loading and preparing results...
DONE (t=8.15s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type segm
DONE (t=154.80s).
Accumulating evaluation results...
DONE (t=21.92s).

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.196
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.394
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.173
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.009
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.160
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.466
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.278
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.278
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.278
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.027
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.253
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.586

2021-07-23 03:19:01,735 - test - INFO - OrderedDict([('segm_mAP', 0.196), ('segm_mAP_50', 0.394), ('segm_mAP_75', 0.173), ('segm_mAP_s', 0.009), ('segm_mAP_m', 0.16), ('segm_mAP_l', 0.466), ('segm_mAP_copypaste', '0.196 0.394 0.173 0.009 0.160 0.466')])
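To put the numbers above in perspective, here is a minimal sketch that parses the segm_mAP_copypaste string and computes the relative drop against a reference value. The reference of 0.354 is taken from the segm_mAP-0.354 part of the mask_rcnn_r50_fpn_2x_coco checkpoint filename quoted later in this thread; treating it as the right baseline assumes the engine was built from that same checkpoint. The helper names are made up for illustration.

```python
# Order of the values in mmdet's "segm_mAP_copypaste" string, as seen in the log above.
KEYS = ("mAP", "mAP_50", "mAP_75", "mAP_s", "mAP_m", "mAP_l")


def parse_copypaste(line):
    """Turn '0.196 0.394 ...' into a dict keyed by metric name."""
    values = [float(v) for v in line.split()]
    if len(values) != len(KEYS):
        raise ValueError(f"expected {len(KEYS)} values, got {len(values)}")
    return dict(zip(KEYS, values))


def relative_drop(measured, reference):
    """Fraction of the reference score that was lost."""
    return (reference - measured) / reference


metrics = parse_copypaste("0.196 0.394 0.173 0.009 0.160 0.466")
reference_segm_map = 0.354  # assumption: value encoded in the checkpoint filename
drop = relative_drop(metrics["mAP"], reference_segm_map)
print(f"segm mAP {metrics['mAP']:.3f} vs reference {reference_segm_map:.3f}: "
      f"{drop:.1%} relative drop")
```

A loss of this size (roughly 45% of the reference score) is far larger than what FP16 rounding alone would normally explain, which is consistent with the replies below pointing at the conversion process.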

About this issue

  • State: closed
  • Created a year ago
  • Comments: 27

Most upvoted comments

It seems the tensorrt version is old.

Please try commenting out these lines in /data/azuryl/mmdeploy_1.2.0/mmdeploy/backend/tensorrt/wrapper.py:

https://github.com/open-mmlab/mmdeploy/blob/v1.2.0/mmdeploy/backend/tensorrt/wrapper.py#L85
https://github.com/open-mmlab/mmdeploy/blob/v1.2.0/mmdeploy/backend/tensorrt/wrapper.py#L96-L97

I currently don’t have a JetPack 4.5.1 environment to test with. 😂

Since your TensorRT version is different from mine, I can’t use your model for testing. I will convert your ONNX model to TensorRT and check the metrics.

I tried sudo apt-get install libgeos-dev first, but it did not help. I have now run conda install -c conda-forge libstdcxx-ng, which installed libstdcxx-ng-12.2.0, and conda install geos, which successfully installed geos-3.11.1.

I will try the next step.

Thank you very much

I suggest you use sudo apt-get install libgeos-dev. If the error still exists when you convert the model, then you can try to install geos with conda.

The error is due to mmdetection:

  File "/data/azuryl/mmdetection_3.1.0/mmdet/models/backbones/__init__.py", line 2, in <module>
    from .csp_darknet import CSPDarknet
  File "/data/azuryl/mmdetection_3.1.0/mmdet/models/backbones/csp_darknet.py", line 11, in <module>
    from ..layers import CSPLayer
  File "/data/azuryl/mmdetection_3.1.0/mmdet/models/layers/__init__.py", line 3, in <module>
    from .bbox_nms import fast_nms, multiclass_nms
  File "/data/azuryl/mmdetection_3.1.0/mmdet/models/layers/bbox_nms.py", line 8, in <module>
    from mmdet.structures.bbox import bbox_overlaps
  File "/data/azuryl/mmdetection_3.1.0/mmdet/structures/bbox/__init__.py", line 2, in <module>
    from .base_boxes import BaseBoxes
  File "/data/azuryl/mmdetection_3.1.0/mmdet/structures/bbox/base_boxes.py", line 9, in <module>
    from mmdet.structures.mask.structures import BitmapMasks, PolygonMasks
  File "/data/azuryl/mmdetection_3.1.0/mmdet/structures/mask/__init__.py", line 3, in <module>
    from .structures import (BaseInstanceMasks, BitmapMasks, PolygonMasks,
  File "/data/azuryl/mmdetection_3.1.0/mmdet/structures/mask/structures.py", line 10, in <module>
    import shapely.geometry as geometry
  File "/data/azuryl/archiconda3/envs/mmdeploy_1.0.0rc1/lib/python3.6/site-packages/shapely/geometry/__init__.py", line 4, in <module>
    from .base import CAP_STYLE, JOIN_STYLE
  File "/data/azuryl/archiconda3/envs/mmdeploy_1.0.0rc1/lib/python3.6/site-packages/shapely/geometry/base.py", line 20, in <module>
    from shapely.coords import CoordinateSequence
  File "/data/azuryl/archiconda3/envs/mmdeploy_1.0.0rc1/lib/python3.6/site-packages/shapely/coords.py", line 10, in <module>
    from shapely.geos import lgeos
  File "/data/azuryl/archiconda3/envs/mmdeploy_1.0.0rc1/lib/python3.6/site-packages/shapely/geos.py", line 94, in <module>
    _lgeos = CDLL(os.path.join(sys.prefix, 'lib', 'libgeos_c.so'))
  File "/data/azuryl/archiconda3/envs/mmdeploy_1.0.0rc1/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /data/azuryl/archiconda3/envs/mmdeploy_1.0.0rc1/lib/libgeos_c.so: cannot open shared object file: No such file or directory
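The last frames of the traceback show that this shapely build loads libgeos_c.so from the lib directory under sys.prefix, i.e. from inside the conda environment. A possible way to check where the library actually is, as a sketch using only the standard library (the function name locate_geos_c is made up for illustration):

```python
import ctypes.util
import os
import sys


def locate_geos_c(prefix=sys.prefix):
    """Report where libgeos_c can be found.

    env_path mirrors the path built in the traceback above:
    os.path.join(sys.prefix, 'lib', 'libgeos_c.so').
    """
    env_path = os.path.join(prefix, "lib", "libgeos_c.so")
    return {
        "env_path": env_path,
        "env_path_exists": os.path.exists(env_path),
        # Result of the system-wide library search; None if nothing is found.
        "system_library": ctypes.util.find_library("geos_c"),
    }


info = locate_geos_c()
for key, value in info.items():
    print(f"{key}: {value}")
```

If env_path_exists is False while system_library is set, the library is installed system-wide (for example via apt) but not inside the environment. That would match the report in this thread that the apt package did not help while conda install geos did.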

This issue may help you:

https://github.com/open-mmlab/mmdeploy/issues/2295

My bad, I forgot you are using Jetson, not Ubuntu x86_64.

The error occurs because your nvcc doesn’t support nested namespace definitions, a C++17 feature.

// replace  
namespace mmdeploy::mmaction::cuda {
}
// with
namespace mmdeploy {
namespace mmaction {
namespace cuda {
}
}
}
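If many files need this rewrite, it could be scripted. The following is only a sketch, not part of mmdeploy: the function name is made up, and it counts braces naively, so it assumes there are no unbalanced braces inside string literals or comments within the namespace body.

```python
import re

# Matches the opening of a nested namespace definition such as "namespace a::b::c {".
NESTED_NS = re.compile(
    r"namespace\s+([A-Za-z_]\w*(?:\s*::\s*[A-Za-z_]\w*)+)\s*\{")


def expand_nested_namespaces(source):
    """Rewrite 'namespace a::b {...}' as 'namespace a { namespace b {...}}'."""
    out = source
    while True:
        match = NESTED_NS.search(out)
        if match is None:
            return out
        parts = [p.strip() for p in match.group(1).split("::")]
        opener = " ".join(f"namespace {p} {{" for p in parts)
        # Find the brace that closes the namespace by counting nesting depth,
        # starting at the opening brace (the last character of the match).
        depth = 0
        close = None
        for i in range(match.end() - 1, len(out)):
            if out[i] == "{":
                depth += 1
            elif out[i] == "}":
                depth -= 1
                if depth == 0:
                    close = i
                    break
        if close is None:
            raise ValueError("unbalanced braces after nested namespace")
        # One closing brace per namespace level.
        closers = "}" * len(parts)
        out = (out[:match.start()] + opener + out[match.end():close]
               + closers + out[close + 1:])


example = "namespace mmdeploy::mmaction::cuda {\nint x;\n}\n"
print(expand_nested_namespaces(example))
```

Using-directives such as "using namespace a::b;" are left untouched because the pattern requires an opening brace after the name.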

As there have been many fixes since 1.0.0rc1, you should use the latest code instead of 1.0.0rc1. You shouldn’t run into the nested namespace problem with the latest (v1.2.0) code.

I spent a long time on https://github.com/open-mmlab/mmdeploy/issues/2260. My project is developed on Ubuntu 18.04 on a Jetson AGX Xavier, so my device cannot use the main branch; it can only use JetPack 4.5.1.

Are you using a Jetson AGX Xavier?

I used Ubuntu, not Jetson. However, I was able to install mmdeploy 1.2.0 and related packages on Jetson (JetPack 4.6). Here are my installed packages: env.txt

What difficulties did you run into? To install mmcv >= 2.0.0 on Jetson with Python 3.6, you can:

# 1 clone the source and enter the checkout
git clone https://github.com/open-mmlab/mmcv.git
cd mmcv
# 2 edit setup.py, remove python_requires='>=3.7'
# 3 build from source, you may need to install build tools with `conda install -c conda-forge gxx_linux-aarch64`
pip install .
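Step 2 can also be done without opening an editor. This is a rough sketch: the helper name is made up, and it assumes python_requires sits on its own line in setup.py, which should be verified against the actual file before running it.

```python
import re

# Matches a whole line such as "    python_requires='>=3.7',".
PYTHON_REQUIRES_LINE = re.compile(
    r"^[ \t]*python_requires[ \t]*=[ \t]*['\"][^'\"]*['\"][ \t]*,?[ \t]*\n",
    re.MULTILINE)


def strip_python_requires(setup_py_text):
    """Return the setup.py text with the python_requires line removed."""
    return PYTHON_REQUIRES_LINE.sub("", setup_py_text)


# Stand-in for the real file content, for illustration only.
sample = (
    "setup(\n"
    "    name='mmcv',\n"
    "    python_requires='>=3.7',\n"
    "    zip_safe=False)\n"
)
print(strip_python_requires(sample))
```

To apply it to the real checkout, read setup.py, pass its text through strip_python_requires, and write the result back before running pip install.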

Which command line did you use? Why does your result show "index created!" twice, and why does it contain " mmengine - INFO - Epoch(test) [5000/5000] "?

My conversion command is the same as yours, except that I used the mmdeploy main branch and mmdet 3.x:

python ./tools/deploy.py \
  configs/mmdet/instance-seg/instance-seg_tensorrt-fp16_dynamic-320x320-1344x1344.py \
  ../mmdetection/configs/mask_rcnn/mask-rcnn_r50_fpn_2x_coco.py \
  https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_2x_coco/mask_rcnn_r50_fpn_2x_coco_bbox_mAP-0.392__segm_mAP-0.354_20200505_003907-3e542a40.pth \
  ../mmdetection/demo/demo.jpg \
  --work-dir maskrcnn_f16_d320_1344 \
  --device cuda:0 \
  --dump-info

Test command:

python ./tools/test.py \
  configs/mmdet/instance-seg/instance-seg_tensorrt-fp16_dynamic-320x320-1344x1344.py \
  ../mmdetection/configs/mask_rcnn/mask-rcnn_r50_fpn_2x_coco.py \
  --model maskrcnn_f16_d320_1344/end2end.engine \
  --device cuda:0

"Why does your result contain " mmengine - INFO - Epoch(test) [5000/5000] "?"

MMDetection 3.x with mmengine and mmcv 2.x is the new generation of open-mmlab algorithm repos; MMDetection 2.x with mmcv 1.x is the old generation. My log contains mmengine lines because I used the new one.
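The pairing described above can be written down as a small check. This is an illustrative sketch only: the function names are made up, and it looks at major version numbers alone, not at the exact compatibility ranges each release declares.

```python
def major(version):
    """Major component of a dotted version string, e.g. '2.27.0' -> 2."""
    return int(version.split(".")[0])


def stack_generation(mmdet_version, mmcv_version):
    """Classify an mmdet/mmcv pair as the old or the new open-mmlab stack."""
    pair = (major(mmdet_version), major(mmcv_version))
    if pair == (2, 1):
        return "old (mmdet 2.x + mmcv 1.x)"
    if pair == (3, 2):
        return "new (mmdet 3.x + mmcv 2.x + mmengine)"
    return "mismatched"


# Versions from the environment dump at the top of this issue.
print(stack_generation("2.27.0", "1.3.17"))
```

For the reporter's environment (mmdet 2.27.0, MMCV 1.3.17) this classifies the setup as the old stack, which is why no mmengine lines appear in that log.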

I cannot understand the reason for the bad result, since Mask R-CNN was released many years ago.

The problem is not Mask R-CNN itself but the conversion process. There have been many fixes since mmdeploy 0.7. If you still can’t use the main branch and the new open-mmlab repos, you can try mmdeploy 0.14.0.