cmssw: HLT crashes in run 359297 from module `EcalRecHitProducer:hltEcalRecHitWithoutTPs`
Several HLT jobs crashed during run 359297, all in module `EcalRecHitProducer:hltEcalRecHitWithoutTPs`. Before each crash, the following error message appears:
cmsRun: /.../cmssw/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_9/src/CalibCalorimetry/EcalLaserAnalyzer/src/MEEBGeom.cc:31: static int MEEBGeom::sm(MEEBGeom::EBGlobalCoord, MEEBGeom::EBGlobalCoord): Assertion `ieta > 0 && ieta <= 85' failed.
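To make the failed condition concrete, here is a minimal Python sketch of the range check quoted in the assertion message. It only mirrors the expression `ieta > 0 && ieta <= 85`; the function name `eb_ieta_in_range` and the sample values are hypothetical and are not taken from `MEEBGeom.cc`.

```python
# Hypothetical illustration of the condition in the assertion message;
# this is NOT the CMSSW implementation of MEEBGeom::sm.
MAX_IETA = 85  # upper bound quoted in the assertion message


def eb_ieta_in_range(ieta: int) -> bool:
    """Return True if ieta satisfies `ieta > 0 && ieta <= 85`."""
    return 0 < ieta <= MAX_IETA


# Sample values chosen for illustration only.
for ieta in (1, 85, 0, 86):
    print(ieta, eb_ieta_in_range(ieta))
```

Because the check is a C `assert`, a value outside this range aborts the whole `cmsRun` process instead of skipping the event, which matches the crashes described here.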
The full log output for several such cases, including the stack trace, can be found on EOS:
/eos/cms/store/user/trocino/HLT_ECAL_Debug/LogOutput/
ROOT RAW files containing all the events involved in the crashes can be found at:
/eos/cms/store/user/trocino/HLT_ECAL_Debug/EdmRawRoot/
Please note that the error does not seem to be reproducible on LXPLUS (probably because jobs there run on CPUs only), whereas it is reproducible on machines with GPUs, e.g. the Hilton machines.
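Since the crash seems to need a GPU, it can help to confirm that one is visible before trying to reproduce it. The following is a best-effort sketch, assuming the GPUs are NVIDIA devices and that the `nvidia-smi` tool is installed on the machine; neither assumption is stated in this report, and the function name `nvidia_gpu_visible` is hypothetical.

```python
import shutil
import subprocess


def nvidia_gpu_visible() -> bool:
    """Best-effort check: True if `nvidia-smi -L` lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # Tool not found on PATH: treat the machine as having no visible NVIDIA GPU.
        return False
    try:
        out = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # `nvidia-smi -L` prints one line per device, each starting with "GPU".
    return out.returncode == 0 and "GPU" in out.stdout


if __name__ == "__main__":
    print("NVIDIA GPU visible:", nvidia_gpu_visible())
```

A `False` result only means no GPU was detected this way; it does not prove that the job would take the CPU code path.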
A recipe to reproduce the errors:
cmsrel CMSSW_12_4_9
cd CMSSW_12_4_9/src
cmsenv
hltGetConfiguration run:359297 --globaltag 124X_dataRun3_HLT_v4 --process HLT --data --unprescale --input /store/user/trocino/HLT_ECAL_Debug/EdmRawRoot/run359297_ls0232_index000269_fu-c2b05-14-01_pid3023154.root --output all > hlt.py
cmsRun hlt.py
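To repeat the recipe for other files in the `EdmRawRoot/` directory, the `hltGetConfiguration` call can be assembled programmatically. This is a minimal sketch: the options are copied from the recipe above, the helper name `hlt_config_command` is hypothetical, and only the one input file named in the recipe is known here.

```python
import shlex
from typing import List

# Options copied from the recipe; only the input file varies.
RUN = 359297
GLOBAL_TAG = "124X_dataRun3_HLT_v4"


def hlt_config_command(input_file: str) -> List[str]:
    """Build the hltGetConfiguration argument list for one input file."""
    return [
        "hltGetConfiguration", f"run:{RUN}",
        "--globaltag", GLOBAL_TAG,
        "--process", "HLT",
        "--data", "--unprescale",
        "--input", input_file,
        "--output", "all",
    ]


# The single file named in the recipe; other file names are not listed in this report.
example = (
    "/store/user/trocino/HLT_ECAL_Debug/EdmRawRoot/"
    "run359297_ls0232_index000269_fu-c2b05-14-01_pid3023154.root"
)
# Print a shell-ready command line; redirect its output to hlt.py as in the recipe.
print(shlex.join(hlt_config_command(example)) + " > hlt.py")
```

The printed line can be pasted into a shell inside the `cmsenv` environment, followed by `cmsRun hlt.py`.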
About this issue
- State: closed
- Created 2 years ago
- Comments: 31 (30 by maintainers)
Hi @missirol, this last instance is likely caused by a tower in EB-01 that has data integrity problems. The problem is mostly contained in one tower, which could be masked as a short-term solution if needed. See also slide 7 of last week’s ECAL PFG shifter report: https://indico.cern.ch/event/1288622/contributions/5414918/attachments/2650937/4590074/PFG_week_20_report_Orlandi.pdf