cmssw: [SKYLAKEAVX512] Segfault running edmCheckClassVersion.py on host with skylake-avx512 microarch
On my desktop with an 11th-generation Core i7, which has the skylake-avx512 microarch, I ran a build with debug symbols enabled (CXXFLAGS+=-g -fno-omit-frame-pointer). In the DataFormats/TauReco package the edmCheckClassVersion.py script segfaults and suggests running scram b updateclassversion. Running this command also results in a segfault, but it does produce an updated class version and checksum. The path set by SCRAM_ARCH=auto picks the skylake-avx512 subdirectory. When the dictionary library is loaded along with the microarch-specific library, ROOT gets a different checksum for the class and segfaults.
I tried removing the debug symbols flag for DataFormats/TauReco with <flags REM_CXXFLAGS="-g -fno-omit-frame-pointer"/> which seemed to fix the segfault.
I then get a segfault with edmCheckClassVersion.py in a different package.
The solution I am trying is to run scram b SCRAM_TARGET=haswell in the hope that setting the lower microarch and the corresponding path will result in the correct class checksum.
It does not happen on a node with an AMD EPYC, for which scram auto-selects haswell.
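To see which microarch-specific directories end up on the library search path, a small check like the one below may help. It is only a sketch under stated assumptions: the helper name microarch_entries and the list of microarch names are hypothetical, and it assumes the microarch directory appears as the last path component, either bare or with a scram_ prefix. It only inspects the path string and does not touch scram or ROOT.

```python
import os

# Hypothetical list of micro-architecture names; adjust to the targets
# your release is actually built for.
MICROARCH_NAMES = ("haswell", "skylake-avx512")


def microarch_entries(path_value, names=MICROARCH_NAMES):
    """Return (index, directory) pairs whose last component names a microarch.

    Assumption: the microarch shows up as the final directory name, either
    bare (e.g. ".../skylake-avx512") or prefixed (e.g. ".../scram_haswell").
    """
    hits = []
    for index, entry in enumerate(path_value.split(":")):
        if not entry:
            continue  # skip empty segments such as a trailing ':'
        leaf = os.path.basename(entry.rstrip("/"))
        prefixed = leaf.startswith("scram_") and leaf[len("scram_"):] in names
        if leaf in names or prefixed:
            hits.append((index, entry))
    return hits


if __name__ == "__main__":
    # Print the position and path of every microarch directory found.
    for index, entry in microarch_entries(os.environ.get("LD_LIBRARY_PATH", "")):
        print(f"{index}: {entry}")
```

Running it inside the build environment prints the position of each matching entry, so it is easy to tell whether a skylake-avx512 directory precedes the default library directory.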
About this issue
- State: closed
- Created 2 years ago
- Comments: 44 (44 by maintainers)
Commits related to this issue
- prefix multi-arch lib directory with scram; see cms-sw/cmssw#39286 — committed to cms-sw/cmssw-config by smuzaffar 2 years ago
- [BuildRules] prefix multi-arch lib directories with scram_ see https://github.com/cms-sw/cmssw/issues/39286 issue — committed to cms-sw/cmsdist by smuzaffar 2 years ago
New version of scram V3_0_52 should fix this issue. Now at build time (scram build), the new version properly sets the env to point to the default libs instead of one of the microarch paths. @gartung, can you please test one of the SKYLAKE IBs again?