CLBlast: clblast_sample_cache_c segfaults

I’ve built CLBlast as described in the README. After getting the segfault, I rebuilt with -DVERBOSE=ON to get more details. I get:

hendrik@i7-8700K:~/Downloads/CLBlast/build$ ./clblast_sample_cache_c
Starting caching sample with an empty cache
[DEBUG] Searching database for kernel 'Xdot'
[DEBUG] Device type 'GPU'; vendor 'NVIDIA'
[DEBUG] Device name 'GeForce GTX 1070 Ti'; architecture 'SM6.1'
[DEBUG] Found architectures of vendor 'NVIDIA' and type 'GPU'
[DEBUG] Found devices of architecture type 'SM6.1'
[DEBUG] Found parameters for device type 'default'
[DEBUG] ASUM_Xdot_1024_32
[DEBUG] Compiling routine 'ASUM-32 (single)'
[DEBUG] Completed compilation in 0.87 ms
[DEBUG] Running kernel 'Xasum'
[DEBUG] Completed kernel in 0.06 ms
[DEBUG] Running kernel 'XasumEpilogue'
[DEBUG] Completed kernel in 0.01 ms
Completed routine with status 0 in 2.438 ms
[DEBUG] ASUM_Xdot_1024_32
Segmentation fault (core dumped)
hendrik@i7-8700K:~/Downloads/CLBlast/build$ 

Valgrind gives

hendrik@i7-8700K:~/Downloads/CLBlast/build$ valgrind ./clblast_sample_cache_c 
==20409== Memcheck, a memory error detector
==20409== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==20409== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==20409== Command: ./clblast_sample_cache_c
==20409== 
==20409== Warning: noted but unhandled ioctl 0x30000001 with no size/direction hints.
==20409==    This could cause spurious value errors to appear.
==20409==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==20409== Warning: noted but unhandled ioctl 0x27 with no size/direction hints.
==20409==    This could cause spurious value errors to appear.
==20409==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==20409== Warning: noted but unhandled ioctl 0x7ff with no size/direction hints.
==20409==    This could cause spurious value errors to appear.
==20409==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==20409== Warning: noted but unhandled ioctl 0x25 with no size/direction hints.
==20409==    This could cause spurious value errors to appear.
==20409==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==20409== Warning: noted but unhandled ioctl 0x37 with no size/direction hints.
==20409==    This could cause spurious value errors to appear.
==20409==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==20409== Warning: noted but unhandled ioctl 0x17 with no size/direction hints.
==20409==    This could cause spurious value errors to appear.
==20409==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==20409== Warning: set address range perms: large range [0x200000000, 0x300200000) (noaccess)
==20409== Warning: set address range perms: large range [0x10006000000, 0x10106000000) (noaccess)
==20409== Warning: set address range perms: large range [0x59e43000, 0x159e42000) (defined)
==20409== Syscall param ioctl(generic) points to uninitialised byte(s)
==20409==    at 0x57165D7: ioctl (syscall-template.S:78)
==20409==    by 0x81535B8: OCLRT::Drm::ioctl(unsigned long, void*) (in /opt/intel/opencl/libigdrcl.so)
==20409==    by 0x8155659: OCLRT::OSTimeLinux::timestampTypeDetect() (in /opt/intel/opencl/libigdrcl.so)
==20409==    by 0x8155781: OCLRT::OSTime::create(OCLRT::OSInterface*) (in /opt/intel/opencl/libigdrcl.so)
==20409==    by 0x8125B3E: OCLRT::Device::createDeviceImpl(OCLRT::HardwareInfo const*, bool, OCLRT::Device&) (in /opt/intel/opencl/libigdrcl.so)
==20409==    by 0x8157919: OCLRT::Platform::initialize() (in /opt/intel/opencl/libigdrcl.so)
==20409==    by 0x80F8877: clGetPlatformIDs (in /opt/intel/opencl/libigdrcl.so)
==20409==    by 0x53F9B97: ??? (in /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0)
==20409==    by 0x53FA0D2: clGetPlatformIDs (in /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0)
==20409==    by 0x108DC5: main (in /home/hendrik/Downloads/CLBlast/build/clblast_sample_cache_c)
==20409==  Address 0x1ffefff828 is on thread 1's stack
==20409==  in frame #2, created by OCLRT::OSTimeLinux::timestampTypeDetect() (???:)
==20409== 
Starting caching sample with an empty cache
==20409== Warning: noted but unhandled ioctl 0x19 with no size/direction hints.
==20409==    This could cause spurious value errors to appear.
==20409==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==20409== Warning: set address range perms: large range [0x10106000000, 0x10206000000) (noaccess)
==20409== Warning: noted but unhandled ioctl 0x21 with no size/direction hints.
==20409==    This could cause spurious value errors to appear.
==20409==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==20409== Warning: noted but unhandled ioctl 0x1b with no size/direction hints.
==20409==    This could cause spurious value errors to appear.
==20409==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
[DEBUG] Searching database for kernel 'Xdot'
[DEBUG] Device type 'GPU'; vendor 'NVIDIA'
[DEBUG] Device name 'GeForce GTX 1070 Ti'; architecture 'SM6.1'
[DEBUG] Found architectures of vendor 'NVIDIA' and type 'GPU'
[DEBUG] Found devices of architecture type 'SM6.1'
[DEBUG] Found parameters for device type 'default'
[DEBUG] ASUM_Xdot_1024_32
[DEBUG] Compiling routine 'ASUM-32 (single)'
[DEBUG] Completed compilation in 302.08 ms
[DEBUG] Running kernel 'Xasum'
[DEBUG] Completed kernel in 27.46 ms
[DEBUG] Running kernel 'XasumEpilogue'
[DEBUG] Completed kernel in 1.64 ms
Completed routine with status 0 in 432.701 ms
==20409== Warning: noted but unhandled ioctl 0x42 with no size/direction hints.
==20409==    This could cause spurious value errors to appear.
==20409==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==20409== Warning: set address range perms: large range [0x10206200000, 0x10306200000) (noaccess)
[DEBUG] ASUM_Xdot_1024_32
==20409== Invalid read of size 8
==20409==    at 0x53FA785: clBuildProgram (in /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0)
==20409==    by 0x4EE6F2C: Build (clpp11.hpp:474)
==20409==    by 0x4EE6F2C: clblast::Routine::InitProgram(std::initializer_list<char const*>) (routine.cpp:100)
==20409==    by 0x4EE8B56: clblast::Routine::Routine(clblast::Queue&, _cl_event**, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, clblast::Precision, std::vector<clblast::database::DatabaseEntry, std::allocator<clblast::database::DatabaseEntry> > const&, std::initializer_list<char const*>) (routine.cpp:66)
==20409==    by 0x4FBF4C8: clblast::Xasum<float>::Xasum(clblast::Queue&, _cl_event**, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (xasum.cpp:27)
==20409==    by 0x4EFA309: clblast::StatusCode clblast::Asum<float>(unsigned long, _cl_mem*, unsigned long, _cl_mem*, unsigned long, unsigned long, _cl_command_queue**, _cl_event**) (clblast.cpp:397)
==20409==    by 0x4F210CB: CLBlastSasum (clblast_c.cpp:633)
==20409==    by 0x109161: run_example_routine (in /home/hendrik/Downloads/CLBlast/build/clblast_sample_cache_c)
==20409==    by 0x108E4D: main (in /home/hendrik/Downloads/CLBlast/build/clblast_sample_cache_c)
==20409==  Address 0x24cb56b0 is 0 bytes inside a block of size 104 free'd
==20409==    at 0x4C30D3B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==20409==    by 0x6A10C31: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.390.48)
==20409==    by 0x6A110E6: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.390.48)
==20409==    by 0x6A1ACFA: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.390.48)
==20409==    by 0x4EE6875: ~Program (clpp11.hpp:466)
==20409==    by 0x4EE6875: clblast::Routine::InitProgram(std::initializer_list<char const*>) (routine.cpp:99)
==20409==    by 0x4EE8B56: clblast::Routine::Routine(clblast::Queue&, _cl_event**, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, clblast::Precision, std::vector<clblast::database::DatabaseEntry, std::allocator<clblast::database::DatabaseEntry> > const&, std::initializer_list<char const*>) (routine.cpp:66)
==20409==    by 0x4FBF4C8: clblast::Xasum<float>::Xasum(clblast::Queue&, _cl_event**, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (xasum.cpp:27)
==20409==    by 0x4EFA309: clblast::StatusCode clblast::Asum<float>(unsigned long, _cl_mem*, unsigned long, _cl_mem*, unsigned long, unsigned long, _cl_command_queue**, _cl_event**) (clblast.cpp:397)
==20409==    by 0x4F210CB: CLBlastSasum (clblast_c.cpp:633)
==20409==    by 0x109161: run_example_routine (in /home/hendrik/Downloads/CLBlast/build/clblast_sample_cache_c)
==20409==    by 0x108E4D: main (in /home/hendrik/Downloads/CLBlast/build/clblast_sample_cache_c)
==20409==  Block was alloc'd at
==20409==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==20409==    by 0x6A10A6A: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.390.48)
==20409==    by 0x6A1A7AE: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.390.48)
==20409==    by 0x6A1AA04: ??? (in /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.390.48)
==20409==    by 0x4EE6629: Program (clpp11.hpp:457)
==20409==    by 0x4EE6629: clblast::Routine::InitProgram(std::initializer_list<char const*>) (routine.cpp:99)
==20409==    by 0x4EE8B56: clblast::Routine::Routine(clblast::Queue&, _cl_event**, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, clblast::Precision, std::vector<clblast::database::DatabaseEntry, std::allocator<clblast::database::DatabaseEntry> > const&, std::initializer_list<char const*>) (routine.cpp:66)
==20409==    by 0x4FBF4C8: clblast::Xasum<float>::Xasum(clblast::Queue&, _cl_event**, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (xasum.cpp:27)
==20409==    by 0x4EFA309: clblast::StatusCode clblast::Asum<float>(unsigned long, _cl_mem*, unsigned long, _cl_mem*, unsigned long, unsigned long, _cl_command_queue**, _cl_event**) (clblast.cpp:397)
==20409==    by 0x4F210CB: CLBlastSasum (clblast_c.cpp:633)
==20409==    by 0x109161: run_example_routine (in /home/hendrik/Downloads/CLBlast/build/clblast_sample_cache_c)
==20409==    by 0x108E4D: main (in /home/hendrik/Downloads/CLBlast/build/clblast_sample_cache_c)
==20409== 
==20409== Invalid read of size 8
==20409==    at 0x53FA788: clBuildProgram (in /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0)
==20409==    by 0x4EE6F2C: Build (clpp11.hpp:474)
==20409==    by 0x4EE6F2C: clblast::Routine::InitProgram(std::initializer_list<char const*>) (routine.cpp:100)
==20409==    by 0x4EE8B56: clblast::Routine::Routine(clblast::Queue&, _cl_event**, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, clblast::Precision, std::vector<clblast::database::DatabaseEntry, std::allocator<clblast::database::DatabaseEntry> > const&, std::initializer_list<char const*>) (routine.cpp:66)
==20409==    by 0x4FBF4C8: clblast::Xasum<float>::Xasum(clblast::Queue&, _cl_event**, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (xasum.cpp:27)
==20409==    by 0x4EFA309: clblast::StatusCode clblast::Asum<float>(unsigned long, _cl_mem*, unsigned long, _cl_mem*, unsigned long, unsigned long, _cl_command_queue**, _cl_event**) (clblast.cpp:397)
==20409==    by 0x4F210CB: CLBlastSasum (clblast_c.cpp:633)
==20409==    by 0x109161: run_example_routine (in /home/hendrik/Downloads/CLBlast/build/clblast_sample_cache_c)
==20409==    by 0x108E4D: main (in /home/hendrik/Downloads/CLBlast/build/clblast_sample_cache_c)
==20409==  Address 0xf0 is not stack'd, malloc'd or (recently) free'd
==20409== 
==20409== 
==20409== Process terminating with default action of signal 11 (SIGSEGV)
==20409==  Access not within mapped region at address 0xF0
==20409==    at 0x53FA788: clBuildProgram (in /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0)
==20409==    by 0x4EE6F2C: Build (clpp11.hpp:474)
==20409==    by 0x4EE6F2C: clblast::Routine::InitProgram(std::initializer_list<char const*>) (routine.cpp:100)
==20409==    by 0x4EE8B56: clblast::Routine::Routine(clblast::Queue&, _cl_event**, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, clblast::Precision, std::vector<clblast::database::DatabaseEntry, std::allocator<clblast::database::DatabaseEntry> > const&, std::initializer_list<char const*>) (routine.cpp:66)
==20409==    by 0x4FBF4C8: clblast::Xasum<float>::Xasum(clblast::Queue&, _cl_event**, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (xasum.cpp:27)
==20409==    by 0x4EFA309: clblast::StatusCode clblast::Asum<float>(unsigned long, _cl_mem*, unsigned long, _cl_mem*, unsigned long, unsigned long, _cl_command_queue**, _cl_event**) (clblast.cpp:397)
==20409==    by 0x4F210CB: CLBlastSasum (clblast_c.cpp:633)
==20409==    by 0x109161: run_example_routine (in /home/hendrik/Downloads/CLBlast/build/clblast_sample_cache_c)
==20409==    by 0x108E4D: main (in /home/hendrik/Downloads/CLBlast/build/clblast_sample_cache_c)
==20409==  If you believe this happened as a result of a stack
==20409==  overflow in your program's main thread (unlikely but
==20409==  possible), you can try to increase the size of the
==20409==  main thread stack using the --main-stacksize= flag.
==20409==  The main thread stack size used in this run was 8388608.
==20409== 
==20409== HEAP SUMMARY:
==20409==     in use at exit: 38,801,753 bytes in 44,633 blocks
==20409==   total heap usage: 59,172 allocs, 14,539 frees, 82,720,941 bytes allocated
==20409== 
==20409== LEAK SUMMARY:
==20409==    definitely lost: 0 bytes in 0 blocks
==20409==    indirectly lost: 0 bytes in 0 blocks
==20409==      possibly lost: 376,376 bytes in 3,049 blocks
==20409==    still reachable: 38,425,377 bytes in 41,584 blocks
==20409==                       of which reachable via heuristic:
==20409==                         stdstring          : 196 bytes in 6 blocks
==20409==         suppressed: 0 bytes in 0 blocks
==20409== Rerun with --leak-check=full to see details of leaked memory
==20409== 
==20409== For counts of detected and suppressed errors, rerun with: -v
==20409== Use --track-origins=yes to see where uninitialised values come from
==20409== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
hendrik@i7-8700K:~/Downloads/CLBlast/build$ 
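
Reading the Valgrind output: the 104-byte block is allocated by Program (clpp11.hpp:457) and freed by ~Program (clpp11.hpp:466), both reached from routine.cpp:99, and is then read by clBuildProgram via Build (clpp11.hpp:474) from routine.cpp:100. One pattern that could produce a trace like this is a wrapper object that is copied while each copy's destructor releases the same raw handle. Below is a minimal sketch of that pattern only. It is not CLBlast's code, and whether this is what actually happens in routine.cpp is an assumption on my part. All names (fake_create, NaiveProgram, SharedProgram, ...) are hypothetical, and a std::set stands in for the OpenCL runtime so that nothing is really freed.

```cpp
#include <cassert>
#include <memory>
#include <set>

// Hypothetical stand-in for an OpenCL runtime: tracks which handles are live.
static std::set<int> g_live;
static int g_next = 1;

int fake_create() { g_live.insert(g_next); return g_next++; }
void fake_release(int h) { g_live.erase(h); }  // erasing twice is a no-op here
bool fake_is_live(int h) { return g_live.count(h) != 0; }

// Naive wrapper: copies share the raw handle, and every destructor releases it.
struct NaiveProgram {
  int handle;
  NaiveProgram() : handle(fake_create()) {}
  ~NaiveProgram() { fake_release(handle); }
};

// Shared-ownership wrapper: the release runs once, when the last copy goes away.
struct SharedProgram {
  std::shared_ptr<int> handle;
  SharedProgram()
      : handle(new int(fake_create()),
               [](int* h) { fake_release(*h); delete h; }) {}
};

// Copy the wrapper, let the copy die, then check the original's handle.
bool naive_handle_survives() {
  NaiveProgram original;
  {
    NaiveProgram copy = original;  // implicit copy: same raw handle
    (void)copy;
  }                                // copy's destructor releases the handle
  return fake_is_live(original.handle);
}

bool shared_handle_survives() {
  SharedProgram original;
  {
    SharedProgram copy = original;  // reference count goes to 2
    (void)copy;
  }                                 // count drops back to 1, no release yet
  return fake_is_live(*original.handle);
}
```

With the naive wrapper the original is left holding a released handle, which matches the "0 bytes inside a block of size 104 free'd" report in spirit. With shared ownership the handle stays valid until the last copy is destroyed.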

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 17 (17 by maintainers)

Most upvoted comments

@ranocha And about:

The test failures I’ve posted above seem to be caused by CLBLAS

Is that #293 you are referring to? If so, could you close that? Or is that still an open issue?

No, I was referring to the test results with the NVIDIA driver that I posted earlier in this issue (https://github.com/CNugteren/CLBlast/issues/291#issuecomment-400533935). #293 is still open.

Yes, clblast_sample_cache_c runs without problems on both systems. Thank you!

The OS is Kubuntu 18.04. I will run the regular tests when I’m back at my computer.