OpenBLAS: Segmentation fault in generic/laswp_k_4

I was testing rocSOLVER on Gentoo with OpenBLAS as the CPU BLAS reference, and one test failed with a segmentation fault:

$ OMP_NUM_THREADS=1 LD_LIBRARY_PATH=/fast/portage/sci-libs/rocSOLVER-4.3.0/work/rocSOLVER-4.3.0_build/library/src ./rocsolver-test --gtest_filter="daily_lapack/LASWP.__float/3"
rocSOLVER version 3.13.0. (with rocBLAS 2.39.0.)
Query device success: there are 1 devices
-------------------------------------------------------------------------------
Device ID 0 : Vega 20 [Radeon VII]
with 17.2 GB memory, max. SCLK 1801 MHz, max. MCLK 1000 MHz, compute capability 9.0
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 64
-------------------------------------------------------------------------------
Note: Google Test filter = daily_lapack/LASWP.__float/3
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from daily_lapack/LASWP
[ RUN      ] daily_lapack/LASWP.__float/3
zsh: segmentation fault  OMP_NUM_THREADS=1 LD_LIBRARY_PATH= ./rocsolver-test

With the reference LAPACK, no error occurs:

$ eselect lapack set reference
>>> Regenerating /etc/ld.so.cache...
$ OMP_NUM_THREADS=1 LD_LIBRARY_PATH=/fast/portage/sci-libs/rocSOLVER-4.3.0/work/rocSOLVER-4.3.0_build/library/src ./rocsolver-test --gtest_filter="daily_lapack/LASWP.__float/3"
rocSOLVER version 3.13.0. (with rocBLAS 2.39.0.)
Query device success: there are 1 devices
-------------------------------------------------------------------------------
Device ID 0 : Vega 20 [Radeon VII]
with 17.2 GB memory, max. SCLK 1801 MHz, max. MCLK 1000 MHz, compute capability 9.0
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 64
-------------------------------------------------------------------------------
Note: Google Test filter = daily_lapack/LASWP.__float/3
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from daily_lapack/LASWP
[ RUN      ] daily_lapack/LASWP.__float/3
[       OK ] daily_lapack/LASWP.__float/3 (195 ms)
[----------] 1 test from daily_lapack/LASWP (195 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (196 ms total)
[  PASSED  ] 1 test.
rocSOLVER version 3.13.0. (with rocBLAS 2.39.0.)

Then I built OpenBLAS and rocSOLVER with debug flags and ran the test under gdb:

$ eselect lapack set openblas
>>> Regenerating /etc/ld.so.cache...
$ OMP_NUM_THREADS=1 LD_LIBRARY_PATH=/fast/portage/sci-libs/rocSOLVER-4.3.0/work/rocSOLVER-4.3.0_build/library/src gdb --args ./rocsolver-test --gtest_filter="daily_lapack/LASWP.__float/3"
GNU gdb (Gentoo 11.1 vanilla) 11.1
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./rocsolver-test...
(gdb) r
Starting program: /fast/portage/sci-libs/rocSOLVER-4.3.0/work/rocSOLVER-4.3.0_build/clients/staging/rocsolver-test --gtest_filter=daily_lapack/LASWP.__float/3
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/opt/gentoo/lib64/libthread_db.so.1".
[New Thread 0x7fffedc75640 (LWP 4570)]
rocSOLVER version 3.13.0. (with rocBLAS 2.39.0.)
[New Thread 0x7fffed474640 (LWP 4571)]
Query device success: there are 1 devices
-------------------------------------------------------------------------------
Device ID 0 : Vega 20 [Radeon VII]
with 17.2 GB memory, max. SCLK 1801 MHz, max. MCLK 1000 MHz, compute capability 9.0
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 64
-------------------------------------------------------------------------------
Note: Google Test filter = daily_lapack/LASWP.__float/3
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from daily_lapack/LASWP
[ RUN      ] daily_lapack/LASWP.__float/3
[New Thread 0x7fffd9fff640 (LWP 4784)]
[Thread 0x7fffd9fff640 (LWP 4784) exited]
[New Thread 0x7fffdd37f640 (LWP 4785)]
[New Thread 0x7fffdd15f640 (LWP 4792)]

Thread 1 "rocsolver-test" received signal SIGSEGV, Segmentation fault.
0x00007fffee86a47f in slaswp_minus (n=192, k1=19, k2=100, dummy1=0, a=0x90a2f2c, lda=100, dummy2=0x0, dumy3=0, ipiv=0x4dc119d8, incx=-2) at ../generic/laswp_k_4.c:143
143               B2 = *b2;
(gdb) bt
#0  0x00007fffee86a47f in slaswp_minus (n=192, k1=19, k2=100, dummy1=0, a=0x90a2f2c, lda=100, dummy2=0x0, dumy3=0, ipiv=0x4dc119d8, incx=-2) at ../generic/laswp_k_4.c:143
#1  0x00007ffff755817d in slaswp_ (N=0x7fffffffbedc, a=0x90a2f30, LDA=0x7fffffffbecc, K1=0x7fffffffbec8, K2=0x7fffffffbec4, ipiv=0x4dc116c0, INCX=0x7fffffffbef0) at lapack/laswp.c:85
#2  0x0000000000ed3d55 in cblas_laswp<float> (n=192, A=0x90a2f30, lda=100, k1=20, k2=100, ipiv=0x4dc116c0, inc=-2) at /fast/portage/sci-libs/rocSOLVER-4.3.0/work/rocSOLVER-rocm-4.3.0/clients/common/lapack_host_reference.cpp:1680
#3  0x0000000000bf8874 in laswp_getError<float, device_strided_batch_vector<float, 0ul, float>, device_strided_batch_vector<int, 0ul, int>, host_strided_batch_vector<float>, host_strided_batch_vector<int> > (handle=0x4c08cd60, n=192,
    dA=..., lda=100, k1=20, k2=100, dIpiv=..., inc=-2, hA=..., hAr=..., hIpiv=..., max_err=0x7fffffffc420) at /fast/portage/sci-libs/rocSOLVER-4.3.0/work/rocSOLVER-rocm-4.3.0/clients/include/testing_laswp.hpp:118
#4  0x0000000000bf7862 in testing_laswp<float> (argus=...) at /fast/portage/sci-libs/rocSOLVER-4.3.0/work/rocSOLVER-rocm-4.3.0/clients/include/testing_laswp.hpp:259
#5  0x0000000000bf1171 in LASWP::run_tests<float> (this=0x4c0809c0) at /fast/portage/sci-libs/rocSOLVER-4.3.0/work/rocSOLVER-rocm-4.3.0/clients/gtest/laswp_gtest.cpp:85
#6  0x0000000000bf05b5 in LASWP___float_Test::TestBody (this=0x4c0809c0) at /fast/portage/sci-libs/rocSOLVER-4.3.0/work/rocSOLVER-rocm-4.3.0/clients/gtest/laswp_gtest.cpp:93
#7  0x00007ffff491dd86 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) () from /opt/gentoo/usr/lib64/libgtest.so.1.11.0
#8  0x00007ffff4916037 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) () from /opt/gentoo/usr/lib64/libgtest.so.1.11.0
#9  0x00007ffff48eac32 in testing::Test::Run() () from /opt/gentoo/usr/lib64/libgtest.so.1.11.0
#10 0x00007ffff48eb638 in testing::TestInfo::Run() () from /opt/gentoo/usr/lib64/libgtest.so.1.11.0
#11 0x00007ffff48ebefc in testing::TestSuite::Run() () from /opt/gentoo/usr/lib64/libgtest.so.1.11.0
#12 0x00007ffff48fb480 in testing::internal::UnitTestImpl::RunAllTests() () from /opt/gentoo/usr/lib64/libgtest.so.1.11.0
#13 0x00007ffff491ed33 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ()
   from /opt/gentoo/usr/lib64/libgtest.so.1.11.0
#14 0x00007ffff4917173 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ()
   from /opt/gentoo/usr/lib64/libgtest.so.1.11.0
#15 0x00007ffff48f9c82 in testing::UnitTest::Run() () from /opt/gentoo/usr/lib64/libgtest.so.1.11.0
#16 0x0000000000ec9821 in RUN_ALL_TESTS () at /opt/gentoo/usr/include/gtest/gtest.h:2490
#17 0x0000000000ec97ba in main (argc=1, argv=0x7fffffffccc8) at /fast/portage/sci-libs/rocSOLVER-4.3.0/work/rocSOLVER-rocm-4.3.0/clients/gtest/rocsolver_gtest_main.cpp:49

By setting a breakpoint at generic/laswp_k_4.c:143, I found that the pointer b2 is invalid (so b4, b6, and b8 are invalid as well), which causes the segmentation fault.
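
The arguments in the backtrace can be cross-checked against the LASWP indexing rules. The following is a minimal sketch, assuming the current Netlib reference semantics (for negative INCX the first pivot read is IPIV(K1 + (K1-K2)*INCX), and IPIV needs K1 + (K2-K1)*abs(INCX) entries) and assuming 4-byte pivot integers. The helper name laswp_pivot_indices is hypothetical; the numeric values are taken from frames #0 to #2 above.

```python
def laswp_pivot_indices(k1, k2, incx):
    """Return the 1-based IPIV positions read by LASWP, in order.

    Assumption: Netlib reference semantics, where a negative incx
    walks the rows from k2 down to k1 and starts at
    ix0 = k1 + (k1 - k2) * incx.
    """
    if incx == 0:
        return []
    count = k2 - k1 + 1
    ix0 = k1 if incx > 0 else k1 + (k1 - k2) * incx
    return [ix0 + j * incx for j in range(count)]


# Arguments seen in frame #2 (cblas_laswp<float>).
K1, K2, INCX = 20, 100, -2

indices = laswp_pivot_indices(K1, K2, INCX)
required_len = K1 + (K2 - K1) * abs(INCX)

# ipiv pointer in frame #0 (slaswp_minus) minus ipiv in frame #1 (slaswp_).
shift_bytes = 0x4DC119D8 - 0x4DC116C0
shift_elems = shift_bytes // 4  # assumption: 4-byte integers

print("first / last IPIV index read (1-based):", indices[0], indices[-1])
print("minimum IPIV length:", required_len)
print("ipiv shift inside the kernel (elements):", shift_elems)
print("shift lies past the minimum array:", shift_elems >= required_len)
```

Under these assumptions the reference routine reads IPIV positions 180 down to 20, while the kernel's ipiv pointer has been moved 198 elements forward, which equals (k2 - 1) * abs(incx). This suggests, but does not prove, that the kernel starts reading pivots beyond an IPIV array of the minimum length, which would be consistent with the invalid b2 pointer.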

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 27 (2 by maintainers)

Most upvoted comments

If you don't set OMP_NUM_THREADS and use all CPUs, I assume the same happens?

Yes, I used 1 thread only to keep the backtrace simple.