opencv: GPU not working with DNN_BACKEND_OPENCV
System information (version)
- OpenCV => 4.1.2
- Operating System / Platform => Windows 64 Bit
- Compiler => Visual Studio 2017
- Cuda => 10.2
Hello!
I use Darknet YOLO for object detection and it works very well, but on the CPU it is very slow. I can run Darknet.exe on the GPU, but I cannot get the GPU to work from Python.
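import cv2
import numpy as np

# Load the YOLOv3 network from the Darknet weights and config files.
net = cv2.dnn.readNet("dark/yolov3.weights", "dark/yolov3.cfg")
# Read the COCO class labels, one per line.
with open("dark/coco.names", "r") as f:
    classes = [line.strip() for line in f.readlines()]
layer_names = net.getLayerNames()
# getUnconnectedOutLayers() returns 1-based indices; flatten() copes with both the Nx1 and the flat array layouts.
output_layers = [layer_names[i - 1] for i in np.array(net.getUnconnectedOutLayers()).flatten()]
colors = np.random.uniform(0, 255, size=(len(classes), 3))
# Request the built-in OpenCV backend with the half-precision OpenCL target.
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_OPENCL_FP16)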
Output:
OpenCV(ocl4dnn): consider to specify kernel configuration cache directory via OPENCV_OCL4DNN_CONFIG_PATH parameter.
OpenCL program build log: dnn/dummy
Status -11: CL_BUILD_PROGRAM_FAILURE
-cl-no-subgroup-ifp
Error in processing command line: Don't understand command line argument "-cl-no-subgroup-ifp"!
The execution doesn't crash, but the calculations still run on the CPU.
Can you help? Thanks!
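The first line of the log above suggests setting a kernel configuration cache directory through OPENCV_OCL4DNN_CONFIG_PATH. A minimal sketch of doing that from Python follows. The directory name is a hypothetical choice, and setting the variable before `import cv2` is only the cautious assumption here. This addresses the cache hint only; it is not expected to fix the CL_BUILD_PROGRAM_FAILURE itself.

```python
import os
import tempfile

# Hypothetical cache location: any writable directory should work.
cache_dir = os.path.join(tempfile.gettempdir(), "opencv_ocl4dnn_cache")
os.makedirs(cache_dir, exist_ok=True)

# Assumption: set the variable before cv2 is imported so the OpenCL
# DNN code can see it when it first looks for the cache directory.
os.environ["OPENCV_OCL4DNN_CONFIG_PATH"] = cache_dir

print("OPENCV_OCL4DNN_CONFIG_PATH =", os.environ["OPENCV_OCL4DNN_CONFIG_PATH"])
```

After this, `import cv2` and the network setup can follow as in the snippet above.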
About this issue
- State: closed
- Created 4 years ago
- Comments: 34 (12 by maintainers)
This was fixed in OpenCV 4.4.0. It’s required for OpenCV 4.3 and below.
@QBarbeAusy
There is a performance regression in cuDNN 8 that affects OpenCV and Darknet. I am working on an update that might get around the regression.
Related discussion at NVIDIA’s developer forums: https://forums.developer.nvidia.com/t/cudnn8-regression-in-algorithm-selection-heuristics/153667/3
There is no reason why all the GPU memory needs to be consumed. More memory consumption does not imply it’s faster.
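Since memory consumption says little about speed, timing the forward pass directly is a more reliable comparison. The sketch below is one way to do that; `time_forward` is a hypothetical helper, not part of OpenCV. It discards warm-up calls because the first inference on a GPU target usually includes one-time initialization.

```python
import statistics
import time


def time_forward(forward, warmup=1, runs=10):
    """Return the median wall-clock time (seconds) of calling forward()."""
    # Warm-up calls are excluded from the measurement.
    for _ in range(warmup):
        forward()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        forward()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```

With a network that already has its input set, it could be called as `time_forward(lambda: net.forward(output_layers))`, once per backend/target combination being compared.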
If downgrading to an older version of cuDNN does not fix the problem, check whether the comments in https://github.com/opencv/opencv/issues/17422 answer your question.
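Before and after changing cuDNN, it can help to confirm which version the OpenCV build actually reports. The sketch below parses the text returned by `cv2.getBuildInformation()`. `cudnn_status` is a hypothetical helper, and the sample text and the assumed `cuDNN: YES (ver X.Y.Z)` line format are illustrative, so check them against the output of your own build.

```python
import re


def cudnn_status(build_info):
    """Return (enabled, version) parsed from an OpenCV build-information string."""
    # Assumed line format: "cuDNN:   YES (ver 7.6.5)" or "cuDNN:   NO".
    match = re.search(
        r"^\s*cuDNN:\s*(YES|NO)(?:\s*\(ver\s*([\d.]+)\))?",
        build_info,
        re.MULTILINE,
    )
    if match is None:
        return (False, None)
    return (match.group(1) == "YES", match.group(2))


# Illustrative sample only; not taken from the reporter's machine.
SAMPLE = """\
  NVIDIA CUDA:                   YES (ver 10.2, CUFFT CUBLAS)
    NVIDIA GPU arch:             75

  cuDNN:                         YES (ver 7.6.5)
"""

print(cudnn_status(SAMPLE))
```

With OpenCV installed, the real check would be `cudnn_status(cv2.getBuildInformation())`.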