oneDNN: `test_benchdnn...` failures on AArch64 builds.

Summary

Two of the standard test_benchdnn tests have started failing on AArch64, on both the master and release branches. The DroneCI runs did not catch these failures because they only run the gtest subset. As a result, the 2.3 release currently fails on AArch64, and I’m not yet sure why. I’ve added details below. Do you have any suggestions? Do you think this is something that we can resolve for the 2.3 release?

The failing tests are:

The following tests FAILED:
	111 - test_benchdnn_gru_ci_cpu (Failed)
	112 - test_benchdnn_lstm_ci_cpu (Failed)
Errors while running CTest

These failures were not observed when preparing #1103 as the commit which introduced them was not in our downstream branch at the time.

Version

The first failing commit appears to be: https://github.com/oneapi-src/oneDNN/commit/79be1a9330ed2e70abbdf5857c8dd0bf39dceb1e

Environment

  • CPU make and model: Arm AArch64, Neoverse-N1
  • OS version: Ubuntu 20.04 and RHEL 8.x
  • Compiler version: GCC 7, 8, 9, and 10; Clang 9

Steps to reproduce

From the cloned oneDNN dir:

mkdir build
cd build
cmake -DDNNL_CPU_RUNTIME=OMP -DDNNL_BUILD_FOR_CI=ON ../. && make -j 32
ctest -R test_benchdnn_gru_ci_cpu  --verbose
cd ..

Observed behavior

...
110: tests:10240 passed:1280 skipped:8320 mistrusted:0 unimplemented:640 failed:640 listed:0
1/1 Test #110: test_benchdnn_gru_ci_cpu .........***Failed  531.62 sec

0% tests passed, 1 tests failed out of 1

640 tests now report ‘UNIMPLEMENTED’; before the commit they were skipped (the skipped count falls from 8960 to 8320).

Expected behavior

All benchdnn tests should pass. Prior to 79be1a9330ed2e70abbdf5857c8dd0bf39dceb1e, ctest -R test_benchdnn_gru_ci_cpu --verbose gave:

110: tests:10240 passed:1280 skipped:8960 mistrusted:0 unimplemented:0 failed:0 listed:0
1/1 Test #110: test_benchdnn_gru_ci_cpu .........   Passed  536.62 sec

The following tests passed:
        test_benchdnn_gru_ci_cpu

100% tests passed, 0 tests failed out of 1
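
To make the change between the two summary lines explicit, the counters can be compared directly. The following is a minimal sketch in Python, assuming the `name:count` summary format shown in the logs in this report; `parse_summary` and `diff_counters` are hypothetical helper names and are not part of benchdnn.

```python
import re


def parse_summary(line):
    """Return the counters of a benchdnn summary line as a dict of ints."""
    # Matches pairs such as "skipped:8960"; the leading "110: " prefix is
    # ignored because its colon is followed by a space, not a digit.
    return {key: int(value) for key, value in re.findall(r"(\w+):(\d+)", line)}


def diff_counters(before, after):
    """Return {counter: after - before} for every counter that changed."""
    return {
        key: after.get(key, 0) - before[key]
        for key in before
        if after.get(key, 0) != before[key]
    }


# Summary lines copied from the logs in this report.
before = parse_summary(
    "110: tests:10240 passed:1280 skipped:8960 mistrusted:0 "
    "unimplemented:0 failed:0 listed:0"
)
after = parse_summary(
    "110: tests:10240 passed:1280 skipped:8320 mistrusted:0 "
    "unimplemented:640 failed:640 listed:0"
)

print(diff_counters(before, after))
```

Only `skipped`, `unimplemented`, and `failed` change, each by 640, while `tests` and `passed` stay the same. This is consistent with the same 640 cases moving from skipped to unimplemented.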

Workaround: remove the include of platform.hpp from test_isa_common.hpp

diff --git a/tests/test_isa_common.hpp b/tests/test_isa_common.hpp
index 9599f7d45..c6b5d506f 100644
--- a/tests/test_isa_common.hpp
+++ b/tests/test_isa_common.hpp
@@ -29,7 +29,6 @@
 #include "oneapi/dnnl/dnnl.h"
 #include "oneapi/dnnl/dnnl.hpp"
 
-#include "src/cpu/platform.hpp"
 
 #if DNNL_X64
 #include "src/cpu/x64/cpu_isa_traits.hpp"

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 15 (15 by maintainers)

Most upvoted comments

Thanks @igorsafo, @vpirogov - just to confirm, everything looks good for all our builds now!