alpaka: `develop` branch broken on oneAPI

(initially reported by @AuroraPerego)

Hi, looks like the current HEAD of the develop branch is broken for the SYCL/oneAPI GPU target:

84% tests passed, 7 tests failed out of 43

Total Test time (real) =  70.06 sec

The following tests FAILED:
          8 - parallelLoopPatterns (Subprocess aborted)
         10 - randomCells2D (Failed)
         17 - matMulTest (Failed)
         32 - memBufTest (Not Run)
         33 - bufSlicingTest (Subprocess aborted)
         36 - memViewTest (Subprocess aborted)
         42 - warpTest (Failed)

This is on an Intel Data Center GPU Max 1100 (Ponte Vecchio) with oneAPI DPC++/C++ Compiler 2023.2.0 (or 2023.2.1, there seems to be some confusion with minor versions):

$ git log --oneline -n1
59002235d6d (HEAD -> develop, origin/develop, origin/HEAD) Fix unsigned integer conversion

$ mkdir -p build/sycl_gpu

$ cd build/sycl_gpu

$ CXXFLAGS="-g -O2" cmake \
  -G 'Unix Makefiles' \
  -DCMAKE_CXX_COMPILER=/opt/intel/oneapi/compiler/latest/linux/bin/icpx \
  -DoneDPL_ROOT=/opt/intel/oneapi/dpl/latest \
  -DoneDPL_DIR=/opt/intel/oneapi/dpl/latest/lib/cmake/oneDPL \
  -DTBB_ROOT=/opt/intel/oneapi/tbb/latest \
  -DTBB_DIR=/opt/intel/oneapi/tbb/latest/lib/cmake/tbb \
  -DBOOST_ROOT=~/local/boost/ \
  --log-level=VERBOSE \
  -Dalpaka_DEBUG=2 \
  -Dalpaka_BUILD_EXAMPLES=ON \
  -Dalpaka_CHECK_HEADERS=ON \
  -DBUILD_TESTING=ON \
  -DCMAKE_BUILD_TYPE=Debug \
  -DCMAKE_VERBOSE_MAKEFILE=ON \
  -Dalpaka_ACC_CPU_B_SEQ_T_SEQ_ENABLE=ON \
  -Dalpaka_ACC_SYCL_ENABLE=ON \
  -Dalpaka_SYCL_PLATFORM_ONEAPI=ON \
  -Dalpaka_SYCL_ONEAPI_CPU=OFF \
  -Dalpaka_SYCL_ONEAPI_GPU=ON \
  -Dalpaka_SYCL_ONEAPI_GPU_DEVICES='intel_gpu_pvc' \
  -Dalpaka_DISABLE_VENDOR_RNG=ON \
  -L \
  ../../

$ make -j8 -k

$ make test

memBufTest

zeroDimBufferTest fails to build with

$ make
...
[ 80%] Building CXX object test/integ/zeroDimBuffer/CMakeFiles/zeroDimBufferTest.dir/src/zeroDimBuffer.cpp.o
cd /home/u106132/alpaka/build/sycl_gpu/test/integ/zeroDimBuffer && /opt/intel/oneapi/compiler/latest/linux/bin/icpx -DALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLED -DALPAKA_ACC_SYCL_ENABLED -DALPAKA_BLOCK_SHARED_DYN_MEMBER_ALLOC_KIB=47 -DALPAKA_DEBUG=2 -DALPAKA_DISABLE_VENDOR_RNG -DALPAKA_OFFLOAD_MAX_BLOCK_SIZE="" -DALPAKA_SYCL_ONEAPI_GPU -DALPAKA_SYCL_TARGET_GPU -DBOOST_ATOMIC_DYN_LINK -DBOOST_ATOMIC_NO_LIB -I/home/u106132/alpaka/include -isystem /home/u106132/alpaka/thirdParty/catch2/src/catch2/.. -isystem /home/u106132/alpaka/build/sycl_gpu/thirdParty/catch2/generated-includes -isystem /home/u106132/local/boost/include -g -O2 -g -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-global-constructors -Wno-padded -Wno-extra-semi-stmt -ffp-model=precise -Wno-disabled-macro-expansion -Wno-unsafe-buffer-usage -O0 -fsycl -sycl-std=2020 -fsycl-targets=intel_gpu_pvc -fsycl-unnamed-lambda -MD -MT test/integ/zeroDimBuffer/CMakeFiles/zeroDimBufferTest.dir/src/zeroDimBuffer.cpp.o -MF CMakeFiles/zeroDimBufferTest.dir/src/zeroDimBuffer.cpp.o.d -o CMakeFiles/zeroDimBufferTest.dir/src/zeroDimBuffer.cpp.o -c /home/u106132/alpaka/test/integ/zeroDimBuffer/src/zeroDimBuffer.cpp
In file included from /home/u106132/alpaka/test/integ/zeroDimBuffer/src/zeroDimBuffer.cpp:9:
In file included from /home/u106132/alpaka/include/alpaka/alpaka.hpp:13:
In file included from /home/u106132/alpaka/include/alpaka/acc/AccCpuOmp2Blocks.hpp:24:
In file included from /home/u106132/alpaka/include/alpaka/workdiv/WorkDivMembers.hpp:8:
/home/u106132/alpaka/include/alpaka/extent/Traits.hpp:92:5: error: static assertion failed due to requirement 'integral_constant<unsigned long, 0>::value >= 1'
    static_assert(Dim<TExtent>::value >= 1);
    ^             ~~~~~~~~~~~~~~~~~~~~~~~~
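
For readers less familiar with this kind of failure, here is a minimal sketch of how a zero-dimensional extent can trip a `Dim >= 1` static assertion, and how a caller can branch before instantiating the width accessor. All names below (`DimInt`, `hasWidth`, `widthIndex`, `dimsWithWidth`) are hypothetical stand-ins and not alpaka's actual API. This only illustrates the pattern and does not claim to be the fix.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a compile-time dimension tag (not alpaka's type).
template<std::size_t N>
using DimInt = std::integral_constant<std::size_t, N>;

// Hypothetical predicate mirroring the condition checked by the static_assert.
template<typename TDim>
constexpr bool hasWidth()
{
    return TDim::value >= 1;
}

// Like the failing code path, this accessor is only valid for Dim >= 1.
// Instantiating it with DimInt<0> is a compile-time error.
template<typename TDim>
constexpr std::size_t widthIndex()
{
    static_assert(TDim::value >= 1, "width is only defined for Dim >= 1");
    return TDim::value - 1u;
}

// A generic caller has to branch at compile time so that the zero-dimensional
// case never instantiates widthIndex.
template<typename TDim>
constexpr std::size_t dimsWithWidth()
{
    if constexpr(hasWidth<TDim>())
        return widthIndex<TDim>() + 1u;
    else
        return 0u; // zero-dimensional (scalar) buffer: no width to query
}
```

With this sketch, `dimsWithWidth<DimInt<0>>()` compiles and yields 0, whereas calling `widthIndex<DimInt<0>>()` directly reproduces the same kind of error as above.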

parallelLoopPatterns

parallelLoopPatterns fails at runtime with an assertion:

$ ./example/parallelLoopPatterns/parallelLoopPatterns
...
parallelLoopPatterns: /home/u106132/alpaka/include/alpaka/mem/buf/sycl/Set.hpp:53: alpaka::detail::TaskSetSyclBase<std::integral_constant<unsigned long, 1>, alpaka::BufGenericSycl<float, std::integral_constant<unsigned long, 1>, unsigned int, alpaka::PlatformGenericSycl<alpaka::detail::IntelGpuSelector>>, alpaka::Vec<std::integral_constant<unsigned long, 1>, unsigned int>>::TaskSetSyclBase(TViewFwd &&, const std::uint8_t &, const TExtent &) [TDim = std::integral_constant<unsigned long, 1>, TView = alpaka::BufGenericSycl<float, std::integral_constant<unsigned long, 1>, unsigned int, alpaka::PlatformGenericSycl<alpaka::detail::IntelGpuSelector>>, TExtent = alpaka::Vec<std::integral_constant<unsigned long, 1>, unsigned int>, TViewFwd = alpaka::BufGenericSycl<float, std::integral_constant<unsigned long, 1>, unsigned int, alpaka::PlatformGenericSycl<alpaka::detail::IntelGpuSelector>> &]: Assertion `m_extentWidthBytes <= m_dstPitchBytes[TDim::value - 1u]' failed.
Aborted (core dumped)
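
To spell out what the assertion checks: the width of the region being set, in bytes, must not exceed the destination pitch of the innermost dimension, also in bytes. A minimal sketch of that invariant follows. The helper name `rowFitsInPitch` and the example numbers are hypothetical and only illustrate the condition; they are not taken from alpaka and do not diagnose the actual cause.

```cpp
#include <cstddef>

// Hypothetical restatement of the asserted invariant:
//   m_extentWidthBytes <= m_dstPitchBytes[TDim::value - 1u]
// i.e. one row of the extent has to fit into one row pitch of the destination.
constexpr bool rowFitsInPitch(
    std::size_t extentWidthElems,
    std::size_t elemSizeBytes,
    std::size_t rowPitchBytes)
{
    return extentWidthElems * elemSizeBytes <= rowPitchBytes;
}
```

For a 1D buffer of 1024 floats, `rowFitsInPitch(1024, sizeof(float), 4096)` holds. If the pitch entry compared against held something smaller than the row size (for illustration, just `sizeof(float)`), the same call with a pitch of 4 would fail, which is the shape of the abort seen here.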

bufSlicingTest

bufSlicingTest fails at runtime with an assertion:

$ ./test/unit/mem/copy/bufSlicingTest
...
bufSlicingTest: /home/u106132/alpaka/include/alpaka/mem/buf/sycl/Set.hpp:53: alpaka::detail::TaskSetSyclBase<std::integral_constant<unsigned long, 1>, alpaka::ViewSubView<alpaka::DevGenericSycl<alpaka::PlatformGenericSycl<alpaka::detail::IntelGpuSelector>>, int, std::integral_constant<unsigned long, 1>, long>, alpaka::Vec<std::integral_constant<unsigned long, 1>, long>>::TaskSetSyclBase(TViewFwd &&, const std::uint8_t &, const TExtent &) [TDim = std::integral_constant<unsigned long, 1>, TView = alpaka::ViewSubView<alpaka::DevGenericSycl<alpaka::PlatformGenericSycl<alpaka::detail::IntelGpuSelector>>, int, std::integral_constant<unsigned long, 1>, long>, TExtent = alpaka::Vec<std::integral_constant<unsigned long, 1>, long>, TViewFwd = alpaka::ViewSubView<alpaka::DevGenericSycl<alpaka::PlatformGenericSycl<alpaka::detail::IntelGpuSelector>>, int, std::integral_constant<unsigned long, 1>, long> &]: Assertion `m_extentWidthBytes <= m_dstPitchBytes[TDim::value - 1u]' failed.
-------------------------------------------------------------------------------
memBufSlicingMemsetTest - TestAccWithDataTypes - 1
-------------------------------------------------------------------------------
/home/u106132/alpaka/test/unit/mem/copy/src/BufSlicing.cpp:170
...............................................................................

/home/u106132/alpaka/test/unit/mem/copy/src/BufSlicing.cpp:170: FAILED:
due to a fatal error condition:
  SIGABRT - Abort (abnormal termination) signal

===============================================================================
test cases:   74 |   49 passed | 25 failed
assertions: 3105 | 3080 passed | 25 failed

Aborted (core dumped)

memViewTest

memViewTest fails at runtime with the same assertion:

$ ./test/unit/mem/view/memViewTest
...
memViewTest: /home/u106132/alpaka/include/alpaka/mem/buf/sycl/Set.hpp:53: alpaka::detail::TaskSetSyclBase<std::integral_constant<unsigned long, 1>, alpaka::ViewPlainPtr<alpaka::DevGenericSycl<alpaka::PlatformGenericSycl<alpaka::detail::IntelGpuSelector>>, float, std::integral_constant<unsigned long, 1>, long>, alpaka::Vec<std::integral_constant<unsigned long, 1>, long>>::TaskSetSyclBase(TViewFwd &&, const std::uint8_t &, const TExtent &) [TDim = std::integral_constant<unsigned long, 1>, TView = alpaka::ViewPlainPtr<alpaka::DevGenericSycl<alpaka::PlatformGenericSycl<alpaka::detail::IntelGpuSelector>>, float, std::integral_constant<unsigned long, 1>, long>, TExtent = alpaka::Vec<std::integral_constant<unsigned long, 1>, long>, TViewFwd = alpaka::ViewPlainPtr<alpaka::DevGenericSycl<alpaka::PlatformGenericSycl<alpaka::detail::IntelGpuSelector>>, float, std::integral_constant<unsigned long, 1>, long> &]: Assertion `m_extentWidthBytes <= m_dstPitchBytes[TDim::value - 1u]' failed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
memViewTest is a Catch2 v3.3.2 host application.
Run with -? for options

-------------------------------------------------------------------------------
viewPlainPtrTest - alpaka::test::TestAccs - 1
-------------------------------------------------------------------------------
/home/u106132/alpaka/test/unit/mem/view/src/ViewPlainPtrTest.cpp:97
...............................................................................

/home/u106132/alpaka/test/unit/mem/view/src/ViewPlainPtrTest.cpp:97: FAILED:
  {Unknown expression after the reported line}
due to a fatal error condition:
  SIGABRT - Abort (abnormal termination) signal

===============================================================================
test cases:   26 |   25 passed | 1 failed
assertions: 1494 | 1493 passed | 1 failed

Aborted (core dumped)

randomCells2D

randomCells2D fails at runtime:

$ ./example/randomCells2D/randomCells2D
...
Number of cells: 26797
Number of calculations per cell: 256
Total number of calculations: 6860032
Mean value A: 0.00628226 (should converge to 0.5)
Mean value B: 0.00628699 (should converge to 0.5)
Maximum error expected at 6860032 calculations should be around 0.0001909
Convergence test failed!
...

$ echo $?
1
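
To put the numbers above in perspective, here is a small sketch of the convergence check. It assumes the expected maximum error is `0.5 / sqrt(N)`, which reproduces the printed 0.0001909 for N = 6860032. That formula is inferred from the output and may not be exactly what the example uses; the function names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>

// Assumed error bound, inferred from the printed output: 0.5 / sqrt(N).
inline double expectedMaxError(std::size_t numCalculations)
{
    return 0.5 / std::sqrt(static_cast<double>(numCalculations));
}

// The mean of uniform random numbers in [0, 1) should converge to 0.5.
inline bool meanHasConverged(double mean, std::size_t numCalculations)
{
    return std::abs(mean - 0.5) <= expectedMaxError(numCalculations);
}
```

Under that assumption, `meanHasConverged(0.00628226, 6860032)` is false by a wide margin: the observed means are off by roughly 0.49, not by something on the order of 0.0002, so this does not look like a borderline statistical failure.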

matMulTest

matMulTest fails at runtime:

$ ./test/integ/matMul/matMulTest
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
matMulTest is a Catch2 v3.3.2 host application.
Run with -? for options

-------------------------------------------------------------------------------
matMul - TestAccs - 1
-------------------------------------------------------------------------------
/home/u106132/alpaka/test/integ/matMul/src/matMul.cpp:153
...............................................................................

/home/u106132/alpaka/test/integ/matMul/src/matMul.cpp:306: FAILED:
  REQUIRE( resultCorrect )
with expansion:
  false

[+] ~BufCpuImpl
[-] ~BufCpuImpl
===============================================================================
test cases: 2 | 1 passed | 1 failed
assertions: 8 | 7 passed | 1 failed

warpTest

warpTest fails at runtime:

$ ./test/unit/warp/warpTest
...
-------------------------------------------------------------------------------
shfl - alpaka::test::TestAccs - 23
-------------------------------------------------------------------------------
/home/u106132/alpaka/test/unit/warp/src/Shfl.cpp:97
...............................................................................

/home/u106132/alpaka/test/unit/warp/src/Shfl.cpp:134: FAILED:
  REQUIRE( fixture(ShflMultipleThreadWarpTestKernel<16>{}) )
with expansion:
  false

===============================================================================
test cases: 144 |  84 passed |  60 failed
assertions: 732 | 108 passed | 624 failed

About this issue

  • State: closed
  • Created 10 months ago
  • Comments: 22 (22 by maintainers)

Most upvoted comments

100% tests passed, 0 tests failed out of 29 😃

The current HEAD is better, but still has issues in debug mode (tested on a Ponte Vecchio GPU):

95% tests passed, 2 tests failed out of 43

Total Test time (real) =  78.49 sec

The following tests FAILED:
         32 - memBufTest (Not Run)
         42 - warpTest (Failed)

Coming up: #2125. The CI says the changes compile for the SYCL backend. I have not run any tests though. Please give it a try, thx!