cmssw: [SKYLAKEAVX512] Segfault running edmCheckClassVersion.py on host with skylake-avx512 microarch

On my desktop with an 11th generation Core i7, which has the skylake-avx512 microarch, I ran a build with debug symbols enabled (CXXFLAGS+=-g -fno-omit-frame-pointer). In the DataFormats/TauReco package the edmCheckClassVersion.py script segfaults and suggests running scram b updateclassversion. Running this command also results in a segfault, but it does produce an updated class version and checksum. The path set by SCRAM_ARCH=auto picks the skylake-avx512 subdirectory. When the dictionary library is loaded along with the microarch-specific library, ROOT gets a different checksum for the class and segfaults.

I tried removing the debug symbols flag for DataFormats/TauReco with <flags REM_CXXFLAGS="-g -fno-omit-frame-pointer"/> which seemed to fix the segfault.

I then get a segfault with edmCheckClassVersion.py in a different package. The solution I am trying is to run scram b SCRAM_TARGET=haswell, in the hope that setting the lower microarch and corresponding path will result in the correct class checksum.

It does not happen on a node with an AMD EPYC, for which scram auto selects haswell.
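To check which kind of host you are on before trying to reproduce this, here is a minimal sketch that looks for the AVX-512 feature flags in the text of /proc/cpuinfo. It is not how scram picks the microarch; the flag set is the one GCC documents for -march=skylake-avx512, and the function names and the shortened sample flag lines are made up for illustration.

```python
# Hypothetical helper, not part of scram or CMSSW.
# Flag names as they appear on the "flags" line of /proc/cpuinfo on Linux;
# the set mirrors the AVX-512 subsets GCC enables for -march=skylake-avx512.
AVX512_FLAGS = {"avx512f", "avx512cd", "avx512bw", "avx512dq", "avx512vl"}


def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_skylake_avx512(cpuinfo_text):
    """True if every AVX-512 flag in AVX512_FLAGS is reported by the CPU."""
    return AVX512_FLAGS <= cpu_flags(cpuinfo_text)


# Shortened, made-up flag lines for illustration only.
SAMPLE_WITH_AVX512 = (
    "flags\t\t: fpu sse2 avx avx2 avx512f avx512dq avx512cd avx512bw avx512vl\n"
)
SAMPLE_WITHOUT_AVX512 = "flags\t\t: fpu sse2 avx avx2 fma bmi2\n"

print(has_skylake_avx512(SAMPLE_WITH_AVX512))
print(has_skylake_avx512(SAMPLE_WITHOUT_AVX512))
```

On a real Linux host, pass the contents of /proc/cpuinfo to has_skylake_avx512. A False result only means the AVX-512 flags are absent; it does not tell you which target scram will choose.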

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 44 (44 by maintainers)

Most upvoted comments

The new version of scram, V3_0_52, should fix this issue. At build time (scram build), the new version now properly sets the env to point to the default libs instead of one of the microarch paths.

@gartung, can you please test again with one of the SKYLAKE IBs?