highway: x86: HwyMathTestGroup/HwyMathTest.TestAllAtanh/EMU128 failure
I see a test failure on Debian/x86 (32-bit) in the floating-point comparison:
Truncated log:
[...]
814/894 Test #814: HwyMathTestGroup/HwyMathTest.TestAllAtanh/EMU128 # GetParam() = 2305843009213693952 ...................................Subprocess aborted***Exception: 0.03 sec
Running main() from ./googletest/src/gtest_main.cc
Note: Google Test filter = HwyMathTestGroup/HwyMathTest.TestAllAtanh/EMU128
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from HwyMathTestGroup/HwyMathTest
[ RUN ] HwyMathTestGroup/HwyMathTest.TestAllAtanh/EMU128
f32x4: Atanh max_ulp 2
f32x2: Atanh(0.000000) expected 0.000000 actual 0.000000 ulp 5.20158e+08 max ulp 4
f32x2: Atanh(0.000000) expected 0.000000 actual 0.000000 ulp 5.20424e+08 max ulp 4
f32x2: Atanh(0.000000) expected 0.000000 actual 0.000000 ulp 5.20691e+08 max ulp 4
f32x2: Atanh(0.000000) expected 0.000000 actual 0.000000 ulp 5.20957e+08 max ulp 4
f32x2: Atanh(0.000000) expected 0.000000 actual 0.000000 ulp 5.21223e+08 max ulp 4
[...]
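A note on reading the log above: "expected 0.000000 actual 0.000000" alongside an ULP distance around 5.2e+08 is not contradictory. The values are printed with six decimals, so any tiny magnitude shows up as 0.000000, while the ULP distance between 0 and a very small nonzero float is enormous. The following is a minimal sketch of one common way to measure ULP distance between two floats; `OrderedBits` and `UlpDistance` are hypothetical names for illustration, it assumes finite (non-NaN) inputs, and it is not necessarily how Highway's math_test computes its "ulp" figure.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: maps a float's bit pattern to an integer that grows
// monotonically with the float's value (assumes IEEE-754 binary32, no NaN).
inline int64_t OrderedBits(float f) {
  int32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));  // bit copy without aliasing issues
  // Floats are sign-magnitude: negate the magnitude for negative values so
  // that -0.0f and +0.0f both map to 0.
  return bits < 0 ? -static_cast<int64_t>(bits & 0x7FFFFFFF)
                  : static_cast<int64_t>(bits);
}

// Hypothetical helper: number of representable floats between a and b.
inline uint64_t UlpDistance(float a, float b) {
  const int64_t d = OrderedBits(a) - OrderedBits(b);
  return static_cast<uint64_t>(d < 0 ? -d : d);
}
```

With this definition, `UlpDistance(0.0f, 1e-20f)` is already above 500 million even though both arguments print as 0.000000 with six decimals, which is the same order of magnitude as the distances in the log.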
ref:
About this issue
- State: closed
- Created a year ago
- Comments: 19 (19 by maintainers)
Commits related to this issue
- fix math_test on 32-bit GCC (compiler bug). Refs #1488 PiperOrigin-RevId: 546857118 — committed to google/highway by jan-wassenberg a year ago
- fix math_test on 32-bit GCC (compiler bug). Refs #1488 PiperOrigin-RevId: 546870548 — committed to google/highway by jan-wassenberg a year ago
- better workaround for 32-bit excess precision. Refs #1488 PiperOrigin-RevId: 547173825 — committed to google/highway by jan-wassenberg a year ago
- better workaround for 32-bit excess precision. Fixes #1488 PiperOrigin-RevId: 547173825 — committed to google/highway by jan-wassenberg a year ago
- Allow opting in to SSE2 excess precision workaround. Refs #1488 Thanks @malaterre for the suggestion. Also disables math_test rather than fail when CFLAGS not set. PiperOrigin-RevId: 547445726 — committed to google/highway by jan-wassenberg a year ago
- Allow opting in to SSE2 excess precision workaround. Refs #1488 Thanks @malaterre for the suggestion. Also disables math_test rather than fail when CFLAGS not set. PiperOrigin-RevId: 547465224 — committed to google/highway by jan-wassenberg a year ago
- better workaround for 32-bit excess precision. Fixes #1488 PiperOrigin-RevId: 547190667 — committed to asdlei99/highway by jan-wassenberg a year ago
For reference:
You should be able to reproduce it using GCC 11/12/13 (though the issue arises at different vector sizes). Either force EMU128 or simply use the default SCALAR.
Clearly a bug in GCC, but I have not found the time to provide a minimal test case. As explained above, as soon as you extract the math logic, the compiler is able to optimize the code correctly. So the issue is somewhere in the template hierarchy and/or the for loop, which confuses the optimizer. I should be able to find some time at the end of this week (hopefully).
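The commit titles linked to this issue describe the cause as "32-bit excess precision". As a rough illustration of what that means, here is a minimal sketch, not the workaround that was actually committed. It checks whether the compiler reports that `float` expressions may be evaluated in a wider format, and forces a value to be rounded to a real 32-bit float by storing it through a `volatile`. `MayUseExcessPrecision` and `StoreRounded` are hypothetical names; `FLT_EVAL_METHOD` is the standard macro from `<cfloat>`.

```cpp
#include <cfloat>

// Hypothetical helper: true if float expressions may be evaluated with more
// range/precision than float (FLT_EVAL_METHOD is 0 when they are not; other
// values, including -1 for "indeterminable", are treated as "maybe").
inline bool MayUseExcessPrecision() {
#if defined(FLT_EVAL_METHOD)
  return FLT_EVAL_METHOD != 0;
#else
  return false;  // assumption: macro missing means nothing is reported
#endif
}

// Hypothetical helper: the volatile store forces the value out to memory as
// a 32-bit float, discarding any extra bits held in a wider register.
inline float StoreRounded(float x) {
  volatile float stored = x;
  return stored;
}
```

On a 32-bit x86 build that uses the x87 FPU, `MayUseExcessPrecision()` would typically return true, and comparing `StoreRounded(result)` against the expected value avoids comparing a float with an extended-precision intermediate. On targets that do float math in SSE registers it should return false and `StoreRounded` is effectively a no-op.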