OpenBLAS: DGEMV: illegal instruction on old x86 processor built with DYNAMIC_ARCH
This is a really old system (AMD K10, from 2007), but as of the 0.3.13 release, OpenBLAS built with DYNAMIC_ARCH=1 (on a newer CPU) now fails inside DGEMV:
Program received signal SIGILL, Illegal instruction.
0x00002aaaab42040f in dgemv_ ()
from /projects/spack/var/spack/environments/scale/.spack-env/view/lib/libopenblas.so.0
This was built with GCC 8.3 and binutils 2.35.1 on RHEL6. I confirmed that 0.3.12 works in the same setup.
About this issue
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 19 (5 by maintainers)
Commits related to this issue
- openblas: prevent microarch flags for generic targets As of OpenBLAS 0.3.13, leaving off `TARGET` by default optimizes most code for the host system -- adding flags that cause the resulting library t... — committed to sethrj/spack by sethrj 3 years ago
- openblas: prevent microarch flags for generic targets (#22270) As of OpenBLAS 0.3.13, leaving off `TARGET` by default optimizes most code for the host system -- adding flags that cause the resulting... — committed to spack/spack by sethrj 3 years ago
- openblas: prevent microarch flags for generic targets (#22270) As of OpenBLAS 0.3.13, leaving off `TARGET` by default optimizes most code for the host system -- adding flags that cause the resulting... — committed to eugeneswalker/spack by sethrj 3 years ago
- openblas: choose generic target for x86 and x86_64. The current template uses DYNAMIC_ARCH=1 to get multiple versions of some functions to support different cpus. However, it will pick up the cpu it ... — committed to gt7-void/void-packages by gt7-void 3 years ago
- openblas: choose generic target for x86 and x86_64. The current template uses DYNAMIC_ARCH=1 to get multiple versions of some functions to support different cpus. However, it will pick up the cpu it ... — committed to void-linux/void-packages by gt7-void 3 years ago
- openblas: choose generic target for x86 and x86_64. The current template uses DYNAMIC_ARCH=1 to get multiple versions of some functions to support different cpus. However, it will pick up the cpu it ... — committed to hazayan/void-packages by gt7-void 3 years ago
- Fix: OpenBLAS build-parameters previously worked by accident. Newest version of OpenBLAS exposes this problem. The solution was to change the build-parameters. See https://github.com/xianyi/OpenBLAS/i... — committed to Ultimaker/cura-build-environment by rburema 3 years ago
Yes, that should work. (Not sure why you were not seeing the output - both the compiler version info and the “minimum required cpu” were added in March 2020 for the 0.3.10 release.)
Right, this mostly used to work “by accident” but the problem has been aggravated by the recent introduction of “universal intrinsics” into cpu-specific BLAS kernels, which require the respective compiler flags to “unlock” the functionality in the gcc headers.
Your suggestion does have some merit, given how many got bitten by this (although it has been mentioned in the readme for some time, and the minimum required target for the library also gets printed at the end of the build process - btw is there a specific reason why you are cherrypicking individual subtargets of `make` instead of running a complete build?)
FWIW, I encountered this issue on an Intel(R) Core(TM) i5-6440HQ CPU @ 2.60GHz, which was released in September 2015 (details at https://github.com/daanzu/kaldi-active-grammar/issues/50). So the problem is by no means limited to “ancient” CPUs.
Switching the OpenBLAS build to `TARGET=GENERIC` fixed it, but it’s clear that this shouldn’t be happening in the first place.
Not sure I understand that comment about future glibc changes… not sure either if defaulting to the build host in dynamic_arch was ever a conscious decision, just one that people used to live with (and that did not have as grave consequences until fairly recently). In any case not really something I want to change before the overdue 0.3.14 release.
Decades ago, build systems were unpatched, nearly decommissioned hosts, so the assumption that the build host could serve as the generic one worked just fine. It should be a one-liner to clamp the shared code to generic, but with glibc hwcaps upcoming it will soon need sub-setting of the compute kernels as well; not sure whether that should be left to downstream or polished here.
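For anyone landing here, the workaround reported in this thread (keeping `DYNAMIC_ARCH=1` but setting `TARGET=GENERIC` so the shared, non-kernel code is not tuned for the build host) can be sketched roughly as below. `DYNAMIC_ARCH` and `TARGET` are the make variables discussed above; the helper name `openblas_make_args` is made up for illustration and is not part of OpenBLAS.

```shell
# Sketch only: assemble the make variables for a portable DYNAMIC_ARCH build.
# openblas_make_args is a hypothetical helper, not something OpenBLAS ships.
openblas_make_args() {
    # $1: baseline target for the shared (non-kernel) code; defaults to GENERIC
    printf 'DYNAMIC_ARCH=1 TARGET=%s\n' "${1:-GENERIC}"
}

# Assumed usage, run from the top of an OpenBLAS source tree:
#   make $(openblas_make_args)
```

With no argument the helper prints the two settings that fixed the crash for the reporter above; passing a different baseline target name is an assumption on my side and should be checked against the target list for your OpenBLAS version.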