opencv: OpenCV + Python multiprocessing breaks on OSX
I’m trying to use OpenCV with Python’s multiprocessing module, but it breaks on me even with very simple code. Here is an example:
```python
import cv2
import multiprocessing
import glob
import numpy

def job(path):
    image = cv2.imread(path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return path

if __name__ == "__main__":
    main_image = cv2.imread("./image.png")
    main_image = cv2.cvtColor(main_image, cv2.COLOR_BGR2GRAY)

    paths = glob.glob("./data/*")

    pool = multiprocessing.Pool()
    result = pool.map(job, paths)

    print 'Finished'
    for value in result:
        print value
```
If I remove `main_image = cv2.cvtColor(main_image, cv2.COLOR_BGR2GRAY)` the script works, but with it in there it doesn’t, even though that line shouldn’t affect the jobs processed by the pool.
All image paths lead to simple images, about ~20 of them in total.
Funnily enough, the code works fine if I create the images in memory with numpy instead of reading them with `imread()`.
I’m guessing OpenCV uses some shared state behind the scenes that isn’t protected from race conditions.
My environment: Mac OS X 10.10, OpenCV 3.0.0, Python 2.7. The last few lines of stack trace are:
Application Specific Information:
crashed on child side of fork pre-exec
Thread 0 Crashed:: Dispatch queue: com.apple.root.default-qos
0 libdispatch.dylib 0x00007fff8668913f dispatch_apply_f + 769
1 libopencv_core.3.0.dylib 0x000000010ccebd14 cv::parallel_for_(cv::Range const&, cv::ParallelLoopBody const&, double) + 152
2 libopencv_imgproc.3.0.dylib 0x000000010c8b782e void cv::CvtColorLoop<cv::RGB2Gray<unsigned char> >(cv::Mat const&, cv::Mat&, cv::RGB2Gray<unsigned char> const&) + 134
3 libopencv_imgproc.3.0.dylib 0x000000010c8b1fd4 cv::cvtColor(cv::_InputArray const&, cv::_OutputArray const&, int, int) + 23756
4 cv2.so 0x000000010c0ed439 pyopencv_cv_cvtColor(_object*, _object*, _object*) + 687
5 org.python.python 0x000000010bc64968 PyEval_EvalFrameEx + 19480
6 org.python.python 0x000000010bc5fa42 PyEval_EvalCodeEx + 1538
BTW - I got other OpenCV functions to crash when used with Python multiprocessing too; the above is just the smallest example I could produce that reflects the problem.
Also, I got the above algorithm (and much more complicated ones) to work in multithreaded C++ programs, using the same OpenCV build on the same machine, so I guess the issue lies on the Python bindings side.
About this issue
- Original URL
- State: open
- Created 9 years ago
- Reactions: 41
- Comments: 60 (7 by maintainers)
Commits related to this issue
- opencv: depend on tbb tbb is required when one wants to use opencv in a multiprocessing application. See this opencv issue thread for details: https://github.com/opencv/opencv/issues/5150#issuecommen... — committed to Homebrew/homebrew-core by papr 7 years ago
- opencv: depend on tbb tbb is required when one wants to use opencv in a multiprocessing application. See this opencv issue thread for details: https://github.com/opencv/opencv/issues/5150#issuecommen... — committed to robohack/homebrew-core by papr 7 years ago
- workaround for r5g2 machine... opencv stuff seems related to https://github.com/opencv/opencv/issues/5150 and https://github.com/pytorch/pytorch/issues/1355 multiprocessing.set_start_method... — committed to invett/nn.based.intersection.classficator by trigal 4 years ago
Setting the start method to ‘spawn’ avoids the hang:
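A minimal, stdlib-only sketch of that workaround (my own reduction: the `job` body is a placeholder for the `cv2.imread`/`cv2.cvtColor` calls, since the crash happens inside OpenCV rather than in the surrounding logic):

```python
import multiprocessing

def job(path):
    # placeholder for the cv2.imread + cv2.cvtColor work in the
    # original example; the hang occurs inside OpenCV's thread pool
    return path.upper()

def run(paths):
    # 'spawn' starts each worker as a fresh interpreter instead of
    # fork()ing, so no half-cloned OpenCV thread pool is inherited
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(2) as pool:
        return pool.map(job, paths)

if __name__ == "__main__":
    print(run(["a.png", "b.png"]))
```

Using `get_context("spawn")` scopes the start method to this pool; `multiprocessing.set_start_method("spawn")` has the same effect process-wide. Note that with `spawn`, the worker function must be importable from the main module, hence the `if __name__ == "__main__"` guard.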
As a workaround one can disable multithreading before forking and reenable it in each child process:
It works on Ubuntu (pthreads backend), but I haven’t tested it on Mac (GCD backend).
I’d suggest using concurrent.futures.ThreadPoolExecutor from Python 3.
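A sketch of that suggestion, with the OpenCV calls replaced by a placeholder. The reason threads can work here is that OpenCV releases the GIL inside its C++ kernels, so a thread pool can still deliver real parallelism without any fork() at all:

```python
from concurrent.futures import ThreadPoolExecutor

def job(path):
    # placeholder for cv2.imread + cv2.cvtColor; OpenCV releases the
    # GIL during the heavy C++ work, so threads run it in parallel
    return path.upper()

paths = ["a.png", "b.png", "c.png"]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(job, paths))
print(results)
```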
Ok, I think I nailed down the problem to the implementation of the pthreads thread pool in https://github.com/Itseez/opencv/blob/master/modules/core/src/parallel_pthreads.cpp …
I have recompiled OpenCV with `-D BUILD_TBB=ON -D WITH_TBB=ON` and suddenly it works! With this cmake option I use the Intel threading library (TBB) instead of the standard pthreads. The Intel library states it has “better fork support”.
@berak The problem described there is something else as this problem with python only relates to multiple processes, not multi-threading per se.
I am having the same issue on Windows 7, OpenCV 3.3.1, Python 3.6.4, from Anaconda. `cv2.setNumThreads(0)` and `multiprocessing.set_start_method('spawn', force=True)` do not fix the issue of processes hanging. I am reading a video file in the main thread, sending frames to a `Queue` where those frames are then processed by `multiprocessing.Process` instances. Any ideas?
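For reference, the queue/worker pattern described in that comment can be sketched like this, with the per-frame OpenCV processing replaced by a placeholder (`run_pipeline`, `worker`, and the doubling step are mine, for illustration; on Windows the default `spawn` start method requires the `if __name__ == '__main__'` guard shown):

```python
import multiprocessing

STOP = None  # sentinel telling workers to shut down

def worker(frames, results):
    while True:
        frame = frames.get()
        if frame is STOP:
            break
        # placeholder for the per-frame OpenCV processing
        results.put(frame * 2)

def run_pipeline(n_frames=10, n_workers=2):
    frames = multiprocessing.Queue()
    results = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(frames, results))
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    for frame in range(n_frames):  # stands in for frames read from the video
        frames.put(frame)
    for _ in procs:
        frames.put(STOP)           # one sentinel per worker
    out = sorted(results.get() for _ in range(n_frames))
    for p in procs:
        p.join()
    return out

if __name__ == "__main__":
    print(run_pipeline())
```

Draining the results queue before `join()` matters: joining a process that is still blocked writing to a full queue can itself deadlock, which is easy to confuse with the OpenCV hang discussed here.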
also got hit by this (in the scenario of https://github.com/pytorch/pytorch/issues/1838)
This also explains why setting the start method to “spawn” solves the problem as described by @ekolve. Spawning starts the child with fresh memory instead of a forked copy, therefore forcing a re-initialization of all data structures (such as the thread pool) within OpenCV.
I got a hint from https://stackoverflow.com/questions/54013846/pytorch-dataloader-stucked-if-using-opencv-resize-method When I put `cv2.setNumThreads(0)` first, it works fine for me.
@alalek Actually I got stung by this problem today (6 years after I opened this issue, that must be an achievement ^^) on OSX 10.15.5 + Python 3.8 when using opencv inside celery tasks. 😿
Crashed on the same `cv2.cvtColor` with the same stack trace pointing to `cv::parallel_for_` and `libdispatch::dispatch_apply_f`.
A bit of info that I think is important: same code, same OSX version, but with Python 3.5 everything runs smoothly. As far as I can see, both the Python 3.5 and Python 3.8 environments use the same opencv, celery, etc. library versions.
Note - Python 3.5 fails if only the `opencv-python` (4.2.0.34) package is installed, but with `opencv-python-headless` (4.2.0.34) no crashes occur. Installing `opencv-python-headless` on Python 3.8 doesn’t solve the issue.
Highlighting this since it would suggest that the issue with forking isn’t something that can’t be solved on OSX.
A little bit more info: the above happens when celery runs using its default `prefork` method for the worker pool: https://docs.celeryproject.org/en/stable/reference/celery.bin.worker.html#cmdoption-celery-worker-p
When using the `threads` and (obviously) `solo` worker pool options, opencv doesn’t crash.
There are several related updates: `'spawn'` is now the default start method on macOS: https://github.com/python/cpython/pull/13603
It worked for me with python3.
Idea to solve the problem within the C++ code: register a fork handler via http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_atfork.html and reset the state of the thread pool on forking the process. This has the effect that the next call to cvtColor (or any other method using the thread pool) re-initializes the thread pool and thus restarts the threads.
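Python exposes the same hook as `os.register_at_fork` (3.7+, POSIX only). A hypothetical sketch of the idea at the Python level follows; the handler body and names are mine, and a real fix would live in OpenCV's C++ core, not in user code:

```python
import os

_needs_reinit = {"flag": False}

def _reset_thread_pool():
    # runs in the child right after fork(); a real fix would tear down
    # the half-cloned worker-thread state here (e.g. via something like
    # cv2.setNumThreads(-1)) so the next parallel call rebuilds the pool
    _needs_reinit["flag"] = True

# equivalent of pthread_atfork(NULL, NULL, _reset_thread_pool)
os.register_at_fork(after_in_child=_reset_thread_pool)

if __name__ == "__main__":
    pid = os.fork()
    if pid == 0:
        # child: the after_in_child handler has already run
        os._exit(0 if _needs_reinit["flag"] else 1)
    _, status = os.waitpid(pid, 0)
    print("child saw handler:", os.WEXITSTATUS(status) == 0)
```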
At the moment it seems that the code is waiting infinitely for a condition that is never signaled, because the number of threads is not as expected after forking: https://github.com/Itseez/opencv/blob/master/modules/core/src/parallel_pthreads.cpp#L440 The `notify_complete` method only signals the condition if all of the threads have completed their work …
I’m currently testing whether this maybe originates from the thread pool used in functions like cvtColor and detectAndCompute. In my understanding, forking (without an “exec” call afterwards, as done in Python under Linux) could be a problem, but that’s only a guess. I’ve read that forking only clones the current thread. In that case the thread pool may wrongly assume that it has more threads than it actually has …
I ended up switching celery to use eventlet to work around the problem.
I found that installing `opencv-python-headless` fixed the problem for me with Python 3.8.
I am using multiprocessing to speed up ArUco target detection on a Raspberry Pi. (Keep in mind I have gotten this working without multiprocessing.)
Using this code:

```python
workers = []
for i in range(num_workers):
    workers.append(Worker_Bee(tasks, results, i))

for worker in workers:
    worker.start()

for i in range(20):
    tasks.put(Image_Processor(vs.read()))
    time.sleep(.5)
```

The `Image_Processor` looks at the frame `vs.read()`, which is an image, and tries to find targets in it. The worker process hangs on the following line:

```python
corners, ids, rejectedImgPoints = aruco.detectMarkers(gray, self.aruco_dict, parameters=self.parameters)
```

where `gray` is the image from `vs.read()`. I’ve checked the arguments to `aruco.detectMarkers` and they seem to be correct (and as I’ve said before, I’ve run this without multiprocessing just fine). There is no error message, and looking at `top`, I see the python processes spawning but then quickly going away. The parent process stays.
I’ve seen similar reports of OpenCV hanging on cvtColor or resize, but the fixes that worked for those (compiling with TBB options and such) are already in place for me.
Any advice on how to get this working?
I’m actually getting a similar bug using the resize function in a subprocess. I’m using python 2.7 and opencv 2.4.11
But it looks like the problem is a bit different, because I can run the resize function inside a subprocess while importing cv2 globally. The following code just resizes an image and works fine.
However if I change my main function to also resize an image, my subprocess crash.
The only solution that works so far is to create a new subprocess every time I need to use opencv, which is pretty annoying. I also tried the cv2.setNumThreads solution and it doesn’t work.
@mshabunin Can you change the category of this issue? This is not related to the python bindings, but to the core.
Here is another library with the same problem: https://github.com/flame/blis/issues/30
I’ll try to prepare a pull request for this.
Can’t reproduce it on Windows.