cmssw: HLT crashes in run 359297 from module `EcalRecHitProducer:hltEcalRecHitWithoutTPs`

Several HLT jobs crashed during run 359297, all due to module `EcalRecHitProducer:hltEcalRecHitWithoutTPs`. Before crashing, the following error message appears:

cmsRun: /.../cmssw/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_9/src/CalibCalorimetry/EcalLaserAnalyzer/src/MEEBGeom.cc:31: static int MEEBGeom::sm(MEEBGeom::EBGlobalCoord, MEEBGeom::EBGlobalCoord): Assertion `ieta > 0 && ieta <= 85' failed.

The full log output for several such cases, including the stack trace, can be found on EOS:

/eos/cms/store/user/trocino/HLT_ECAL_Debug/LogOutput/

ROOT RAW files containing all affected events can be found at:

/eos/cms/store/user/trocino/HLT_ECAL_Debug/EdmRawRoot/
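For readers unfamiliar with the assertion, the condition quoted in the error message requires the value checked at `MEEBGeom.cc:31` to satisfy `0 < ieta <= 85`. The sketch below mirrors only that condition in Python. The helper names `ieta_in_range` and `checked_ieta` are hypothetical and are not part of CMSSW. The sketch does not reproduce the actual `MEEBGeom::sm` logic, which is not shown in this report.

```python
# Hypothetical helpers: they mirror only the condition quoted in the
# assertion message, not the actual MEEBGeom implementation.
IETA_MAX = 85  # upper bound taken from the assertion text


def ieta_in_range(ieta: int) -> bool:
    """Return True when ieta satisfies `ieta > 0 && ieta <= 85`."""
    return 0 < ieta <= IETA_MAX


def checked_ieta(ieta: int) -> int:
    """Raise a descriptive error for an out-of-range value.

    A C assert aborts the whole process; raising an exception here is
    only an illustration of a recoverable alternative.
    """
    if not ieta_in_range(ieta):
        raise ValueError(f"ieta={ieta} is outside the range 1..{IETA_MAX}")
    return ieta
```

A value of 0 or 86, for example one decoded from corrupted input, fails this check, whereas any value from 1 to 85 passes.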

Please note that the error does not seem to be reproducible on LXPLUS (probably because the job runs on CPUs there), while it is reproducible on machines with GPUs, e.g. Hilton machines.

A recipe to reproduce the errors:

cmsrel CMSSW_12_4_9
cd CMSSW_12_4_9/src
cmsenv
hltGetConfiguration  run:359297  --globaltag 124X_dataRun3_HLT_v4  --process HLT  --data  --unprescale  --input /store/user/trocino/HLT_ECAL_Debug/EdmRawRoot/run359297_ls0232_index000269_fu-c2b05-14-01_pid3023154.root  --output all  > hlt.py
cmsRun hlt.py

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 31 (30 by maintainers)

Most upvoted comments

Hi @missirol, this last instance is likely caused by a tower in EB-01 that has data integrity problems. The problem is mostly contained in one tower, which could be masked as a short-term solution if needed. See also slide 7 of last week’s ECAL PFG shifter report: https://indico.cern.ch/event/1288622/contributions/5414918/attachments/2650937/4590074/PFG_week_20_report_Orlandi.pdf