openvino: [Bug] compile_model() Runtime crash in opencl
System information (version)
- OpenVINO => openvino_2022.1.0.643
- Operating System / Platform => Ubuntu 20.04 64 Bit
- Compiler => precompiled apt package for Ubuntu 20.04
- Problem classification: Runtime Crash
- Framework: TensorFlow (if applicable)
Detailed description
We would like to switch from our old implementation to the new OpenVINO™ API 2.0. We have an issue with compile_model() for the GPU target. It looks like a timing issue, since it does not fail every time we load the model: it crashes roughly 40% of the time.
Steps to reproduce
#include <openvino/openvino.hpp>

ov::Core core;
std::shared_ptr<ov::Model> model;
std::string target = "GPU";
// ... (read model) ...
ov::set_batch(model, ov::Dimension(1, 16));  // dynamic batch in the range [1, 16]
ov::CompiledModel compiledModel = core.compile_model(model, target);  // <-- crash
Based on the attached image, the crash appears to originate in libopencl-clang.so.11.
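One way to test the timing hypothesis is to make sure only one compilation runs at a time and see whether the crash rate changes. The helper below is a minimal sketch under that assumption. `run_serialized` and `compile_mutex` are hypothetical names, not part of OpenVINO, and the report does not say whether compile_model() is actually called from several threads.

```cpp
#include <mutex>
#include <utility>

// Hypothetical helper, not an OpenVINO API.
// Returns one process-wide mutex. It lives in a non-template function so that
// every caller shares the same lock regardless of the callable type.
inline std::mutex& compile_mutex() {
    static std::mutex m;
    return m;
}

// Runs fn while holding the shared mutex and forwards its result.
// The lock is released when the call returns or throws.
template <typename Fn>
auto run_serialized(Fn&& fn) -> decltype(std::forward<Fn>(fn)()) {
    std::lock_guard<std::mutex> lock(compile_mutex());
    return std::forward<Fn>(fn)();
}

// Possible use with the reproduction steps (requires OpenVINO, shown as a comment):
//   ov::CompiledModel compiledModel =
//       run_serialized([&] { return core.compile_model(model, target); });
```

If the crash no longer reproduces with serialized compilation, that would support the timing theory. It would not fix the underlying fault in the OpenCL compiler library.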
Operating system: Linux 5.15.0-52-generic #58~20.04.1-Ubuntu SMP Thu Oct 13 13:09:46 UTC 2022 x86_64
CPU: amd64 family 6 model 151 stepping 2, 20 CPUs
Crash reason: SIGSEGV / SEGV_MAPERR
Crash address: 0x48
Process uptime: 35 seconds
Thread 176 (crashed)
0 libopencl-clang.so.11!llvm::BasicBlock::dropAllReferences() + 0x0
rax = 0x00007fd07c5f9720 rdx = 0x00007fd07d6bf440
rcx = 0x00007fd3a39f2bf6 rbx = 0x0000000000000030
rsi = 0x0000000000000081 rdi = 0x0000000000000018
rbp = 0x00007fd07c5f9560 rsp = 0x00007fd05fffc788
r8 = 0x00007fd07c8479d8 r9 = 0x0000000000001fff
r10 = 0x0000000000000000 r11 = 0x0000000000000246
r12 = 0x00007fd07c5f95a8 r13 = 0x00007fd07c9be890
r14 = 0x00007fd07c5f9560 r15 = 0x00007fd05fffc830
rip = 0x00007fd2dded6770
Found by: given as instruction pointer in context
1 libopencl-clang.so.11!llvm::Function::dropAllReferences() + 0x2d
rbx = 0x0000000000000030 rbp = 0x00007fd07c5f9560
rsp = 0x00007fd05fffc790 r12 = 0x00007fd07c5f95a8
r13 = 0x00007fd07c9be890 r14 = 0x00007fd07c5f9560
r15 = 0x00007fd05fffc830 rip = 0x00007fd2ddf4667e
Found by: call frame info
2 libopencl-clang.so.11!llvm::Function::~Function() + 0xf
rbx = 0x00007fd07c5f9560 rbp = 0x00007fd07c745150
rsp = 0x00007fd05fffc7b0 r12 = 0x00007fd07c8479b0
r13 = 0x00007fd07c9be890 r14 = 0x00007fd07c5f9560
r15 = 0x00007fd05fffc830 rip = 0x00007fd2ddf4ded0
Found by: call frame info
3 libopencl-clang.so.11!llvm::Function::eraseFromParent() + 0x46
rbx = 0x00007fd07c5f9560 rbp = 0x00007fd07c745150
rsp = 0x00007fd05fffc7e0 r12 = 0x00007fd07c8479b0
r13 = 0x00007fd07c9be890 r14 = 0x00007fd07d9f3e50
r15 = 0x00007fd05fffc830 rip = 0x00007fd2ddf4dfd7
Found by: call frame info
4 libopencl-clang.so.11!SPIRV::OCLToSPIRVBase::runOCLToSPIRV(llvm::Module&) + 0x206
rbx = 0x00007fd07c8479d8 rbp = 0x00007fd07c745150
rsp = 0x00007fd05fffc800 r12 = 0x00007fd07c8479b0
r13 = 0x00007fd07c9be890 r14 = 0x00007fd07d9f3e50
r15 = 0x00007fd05fffc830 rip = 0x00007fd2dbf14547
Found by: call frame info
5 libopencl-clang.so.11!llvm::legacy::PassManagerImpl::run(llvm::Module&) + 0x406
rbx = 0x00007fd05fffc8d0 rbp = 0x00007fd05fffc950
rsp = 0x00007fd05fffc8a0 r12 = 0x00007fd07cb06f00
r13 = 0x00007fd07c8937f0 r14 = 0x00007fd07c847a08
r15 = 0x00007fd07c893810 rip = 0x00007fd2ddf99737
Found by: call frame info
6 libopencl-clang.so.11!llvm::writeSpirv(llvm::Module*, SPIRV::TranslatorOpts const&, std::ostream&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) + 0xd1
rbx = 0x00007fd07d698d30 rbp = 0x0000000000000001
rsp = 0x00007fd05fffc960 r12 = 0x00007fd05fffc970
r13 = 0x00007fd07cb06f00 r14 = 0x00007fd05fffcbe0
r15 = 0x00007fd05fffcbc0 rip = 0x00007fd2dbe5b682
Found by: call frame info
7 libopencl-clang.so.11!Compile + 0x24b2
rbx = 0x00007fd05fffccd8 rbp = 0x00007fd05fffcf90
rsp = 0x00007fd05fffc9d0 r12 = 0x00007fd07cb06f00
r13 = 0x00007fd05fffcb60 r14 = 0x00007fd05fffcbb0
r15 = 0x00007fd07cddc7e0 rip = 0x00007fd2dbdf76c3
Found by: call frame info
8 libigdfcl.so.1 + 0x46fad
rbx = 0x00007fd05fffd380 rbp = 0x00007fd05fffd290
rsp = 0x00007fd05fffcfa0 r12 = 0x00007fd05fffd070
r13 = 0x00007fd05fffd0b0 r14 = 0x00007fd05fffd010
r15 = 0x00007fd05fffd0d0 rip = 0x00007fd394405fae
Found by: call frame info
9 libigdfcl.so.1 + 0x48805
rbp = 0x00007fd05fffd5f0 rsp = 0x00007fd05fffd2a0
rip = 0x00007fd394407806
Found by: previous frame's frame pointer
10 libigdfcl.so.1 + 0x56dcf
rbp = 0x00007fd05fffd760 rsp = 0x00007fd05fffd600
rip = 0x00007fd394415dd0
Found by: previous frame's frame pointer
11 libigdrcl.so + 0x5cd8c0
rbp = 0x00007fd05fffd8c0 rsp = 0x00007fd05fffd770
rip = 0x00007fd2e3ae88c1
Found by: previous frame's frame pointer
12 libigdrcl.so + 0x10ca67
rbp = 0x00007fd05fffda30 rsp = 0x00007fd05fffd8d0
rip = 0x00007fd2e3627a68
Found by: previous frame's frame pointer
About this issue
- State: closed
- Created 2 years ago
- Comments: 15 (2 by maintainers)
@zabomate First of all, I'd suggest trying another OCL runtime version, as the crash happens somewhere in the driver libs. If switching the OCL runtime version doesn't help, please provide some info on your device (clinfo output) and the OCL runtime version(s) that you've tried, in addition to the things mentioned by @Iffa-Meah