opencv: `OpenCL error CL_INVALID_CONTEXT` when `~UMat()` and default OpenCL context has changed
When a UMat
is created in an opencv opencl context (e.g. Intel gpu), then the opencv opencl context is changed to another device (e.g. NVidia gpu), and then the UMat is destructed… it leads to errors of CL_INVALID_CONTEXT
.
I think this is due to OpenCLAllocator::unmap()
wrongly using the command queue of the current “default” opencv opencl context for the unmap. It should instead use the command queue for the opencl context that OpenCLAllocator::map()
itself.
This issue probably also exists in OpenCLAllocator::deallocate_()
I would appreciate some feedback on this from the core OpenCV team. I’ll investigate myself in parallel.
System information (version)
- OpenCV 4.5.3
- Microsoft Windows [Version 10.0.19043.1110]
- VS2019 v16.10.4
- Laptop with Intel integrated GPU, and NVidia discrete GPU
Reproduction
- Create opencv app that uses opencl and enables the user to change opencl contexts via
cv::ocl::Context::create(config)
,cv::ocl::OpenCLExecutionContext::create
, andexec_context.bind()
- Run app
- Set to Intel GPU
- Create UMats
- Change context to NVidia GPU, but don’t proactively
~UMat()
. Instead let some UMats exists from both contexts. - Now destroy the older Intel gpu UMats.
Result
For each old intel UMat that is destructed…
OpenCV(4.5.3) Error: Unknown error code -220 (OpenCL error CL_INVALID_CONTEXT (-34) during call: clEnqueueUnmapMemObject(handle=00000000FDD6BE30, data=00000000E84A0000, [sz=99532800])) in cv::ocl::OpenCLAllocator::unmap, file ..\modules\core\src\ocl.cpp, line 5926
Exception thrown at 0x00007FFDE3EA4ED9 in Max.exe: Microsoft C++ exception: cv::Exception at memory location 0x00000000007DECF0.
exception in outputmatrix() OpenCV(4.5.3) ..\modules\core\src\ocl.cpp:5926: error: (-220:Unknown error code -220) OpenCL error CL_INVALID_CONTEXT (-34) during call: clEnqueueUnmapMemObject(handle=00000000FDD6BE30, data=00000000E84A0000, [sz=99532800]) in function 'cv::ocl::OpenCLAllocator::unmap'
Do the reverse (old nvidia and new intel), then same function fails but different line because of a different map/copy approach
OpenCV(4.5.3) Error: Unknown error code -220 (OpenCL error CL_INVALID_MEM_OBJECT (-38) during call: clEnqueueWriteBuffer(q, handle=000000006EFD34C0, CL_TRUE, 0, sz=99532800, data=000000011071B080, 0, 0, 0)) in cv::ocl::OpenCLAllocator::unmap, file ..\modules\core\src\ocl.cpp, line 5947
Exception thrown at 0x00007FFDE3EA4ED9 in Max.exe: Microsoft C++ exception: cv::Exception at memory location 0x00000000007DECF0.
exception in outputmatrix() OpenCV(4.5.3) ..\modules\core\src\ocl.cpp:5947: error: (-220:Unknown error code -220) OpenCL error CL_INVALID_MEM_OBJECT (-38) during call: clEnqueueWriteBuffer(q, handle=000000006EFD34C0, CL_TRUE, 0, sz=99532800, data=000000011071B080, 0, 0, 0) in function 'cv::ocl::OpenCLAllocator::unmap'
Expected
No errors. And the old Intel UMats have their resources released.
Notes
In unmap()
, all three methods (SVM, map, copy) use a command queue gotten via
This is an errant approach. When releasing opencl resources for a UMat, it needs to use the opencl context that allocated those resources for that specific Umat. Otherwise, it is unsafe and will lead to errors like above when contexts are changed.
That means the UMatData
needs to know its own opencl context or command queue.
Does this knowledge already exist on the UMatData
?
Is it available at UMatData::allocatorContext
, prevAllocator
, or currAllocator
?
If the knowledge is already on UMatData (or can be easily added), then this function’s code can be updated to use that command queue.
OpenCLAllocator::deallocate_()
I think has a similar errant approach but its code is different. For example
- SVM approach it uses
Context::getDefault()
andQueue::getDefault().ptr()
. I think it needs to use the context/queues from original allocation context. - non-SVM uses
cl_command_queue q = (cl_command_queue)Queue::getDefault().ptr()
. I think it needs to use the context/queues from original allocation context. - etc.
Related issues
There are a number of related issues about “switching opencl devices” but none are so exact as to isolate this single line of code. If we can address this, then it might contribute to those other issues. I’m subscribed to several of them myself.
Resoving this issue will resolve issue #18919 Related to issue #9593, issue #6926, issue #8035
Issue submission checklist
- I report the issue, it’s not a question
- I checked the problem with documentation, FAQ, open issues, forum.opencv.org, Stack Overflow, etc and have not found solution
- I updated to latest OpenCV version and the issue is still there
About this issue
- Original URL
- State: open
- Created 3 years ago
- Reactions: 4
- Comments: 18 (18 by maintainers)
I’m unsubscribing as I’ve fixed this years ago in my fork
😝 Agreed. I would have to manually destruct anything that needs the cl_context handle in
Context::~Impl
because the code there manually destructs the raw handle pointer. Rather than writing yet more raw pointer management code, I made a commit that stores thecl_context
handle as a std::unique_ptr with a custom deleter clReleaseContext(). C++ RAII. I have confirmed that the handle is listed above everything that could need it (like the BufferPools) that need it. Therefore, C++ will destruct them in reverse order listed in code.I’ve also found many thread unsafe functions that are called once for initialization. These are the same style bug that found in the core fastMalloc() function that I fixed. I’ve made all the changes in a single commit – the involved code is simplier and easier to read.
I’m aware the CI is failing tests on this PR. My local tests with my app works well, but those DNN tests use things differently. I’ll get to those bugs this week.
While all the changes are within
ocl.cpp
, they are throughout that code. It could be helpful to review the changes in units of commits, rather than all of it all at once. I’ll continue to push separate commits. And after it all works and approved, I’ll forcepush a squash.When today’s PR push, behavior of the global Context list and BufferPools has changed. I don’t see these as a bug. Instead, this is new behavior and I’m now aware OpenCV has defined any rule on this topic. I’m open to discussion on it. And I don’t have a strong preference at this time.
OpenCL(SVM)BufferPoolImpl’s now know the Context for which they are managing buffers. This faciliates correct allocation/deallocation of a buffer (for a UMat) to the cl_context for that UMat. There would be leaks or errors otherwise.
There is a sideaffect. The behavior of 3 Aug (above post) having a big list with nulls and only two useful entries has changed. Now I am seeing only size()=2. Why? Because when all the UMats have released their ownership on the Context, there is still 2-3 owners remaining…the (default, hostptr, svm)BufferPoolImpl’s. There isn’t anything that proactively tells those Bufferpools to delete/deallocate/release. Instead they are usually released when a Context is released/deallocated. But since the Bufferpools hold ownership of their parent (Context), no one initiates the chain of release.
I am imagine some new behavior rules
cv::ocl::getOpenCLAllocator()->getBufferPoolController()->freeAllReservedBuffers()
before a new bind()…because those only work on the current bind() ExecContext. Naturally, OpenCL runtime choses when its actually processes those commands and completes the frees. However, that just frees the entries. The BufferPoolImpls and its parent Context would remain.Comments?