cmssw: Out of range exception from RPCAMCRawToDigi

Workflow 136.8561 step 3 has been failing since CMSSW_12_5_X_2022-07-28-1100 with

----- Begin Fatal Exception 02-Aug-2022 14:38:05 CEST-----------------------
An exception of category 'OutOfRange' occurred while
   [0] Processing  Event run: 314890 lumi: 591 event: 497483740 stream: 2
   [1] Running path 'dqmoffline_step'
   [2] Prefetching for module L1TdeStage2CPPF/'l1tdeStage2Cppf'
   [3] Calling method for module RPCAMCRawToDigi/'rpcCPPFRawToDigi'
Exception Message:
Out-of-range input for RPCAMCLink::bf_set, position 0: 100
----- End Fatal Exception -------------------------------------------------

https://cmssdt.cern.ch/SDT/cgi-bin/logreader/el8_amd64_gcc10/CMSSW_12_5_X_2022-08-02-1100/pyRelValMatrixLogs/run/136.8561_RunZeroBias_hBStarTk+RunZeroBias_hBStarTk+HLTDR2_2018_hBStar+RECODR2_2018reHLT_Offline_hBStar+HARVEST2018_hBStar/step3_RunZeroBias_hBStarTk+RunZeroBias_hBStarTk+HLTDR2_2018_hBStar+RECODR2_2018reHLT_Offline_hBStar+HARVEST2018_hBStar.log#/

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 33 (33 by maintainers)

Most upvoted comments

When is the final 12_5_0 build scheduled: 20.09 or earlier?

Sep 20 (see https://twiki.cern.ch/twiki/bin/viewauth/CMS/CMSSW_12_5_0). However, I would aim for a fix well before that date, so that a few IBs are still available in which the wf can be tested.

The CPPF digis are called by the L1TStage2Emulator. But I think at that time (2018A) L1 didn't use them and just produced the CPPF clusters on the fly from the RPCDigis in the emulation step. (The CPPF concentrates RPC digis in the endcap and clusters them.) So I guess there will be no problem removing the CPPF RPC unpacker for this particular workflow.

By the way, removing the CPPF unpacker results in

----- Begin Fatal Exception 23-Aug-2022 18:56:12 CEST-----------------------
An exception of category 'ProductNotFound' occurred while
   [0] Processing  Event run: 314890 lumi: 591 event: 497757635 stream: 0
   [1] Running path 'dqmoffline_step'
   [2] Calling method for module L1TStage2CPPF/'l1tStage2Cppf'
Exception Message:
Principal::getByToken: Found zero products matching all criteria
Looking for type: std::vector<l1t::CPPFDigi>
Looking for module label: rpcCPPFRawToDigi
Looking for productInstanceName: 

   Additional Info:
      [a] If you wish to continue processing events after a ProductNotFound exception,
add "SkipEvent = cms.untracked.vstring('ProductNotFound')" to the "options" PSet in the configuration.

----- End Fatal Exception -------------------------------------------------

So there are downstream consumers.
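
To illustrate why dropping the producer trips the consumer: the lookup is keyed by module label, so if nothing was stored under `rpcCPPFRawToDigi` the request comes back empty. The sketch below is a toy model of that lookup, not CMSSW code; `ToyEvent`, `ToyCppfDigi` and `count_digis_or_zero` are hypothetical names invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for l1t::CPPFDigi.
struct ToyCppfDigi {
  int value = 0;
};

// Toy event: products are stored under the label of the module that made them.
struct ToyEvent {
  std::map<std::string, std::vector<ToyCppfDigi>> products;

  // Returns the collection stored under `label`, or nullopt if no module produced it.
  std::optional<std::vector<ToyCppfDigi>> find(const std::string& label) const {
    auto it = products.find(label);
    if (it == products.end())
      return std::nullopt;
    return it->second;
  }
};

// A consumer that checks for presence instead of assuming the product exists.
inline std::size_t count_digis_or_zero(const ToyEvent& ev, const std::string& label) {
  auto digis = ev.find(label);
  return digis ? digis->size() : 0;
}

// Usage: without a producer the lookup is empty; with one, the digis are found.
inline void example_lookup() {
  ToyEvent ev;
  assert(count_digis_or_zero(ev, "rpcCPPFRawToDigi") == 0);
  ev.products["rpcCPPFRawToDigi"] = {ToyCppfDigi{1}, ToyCppfDigi{2}};
  assert(count_digis_or_zero(ev, "rpcCPPFRawToDigi") == 2);
}
```

In the failing job the consumer assumes the product is there, which is why removing the unpacker alone is not enough.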

@kskovpen

Anyhow, if experts could comment on how relevant this wf is for Run3 (as it stands now), it would help to decide on a proper action.

This wf has no relevance whatsoever for Run 3, but it is there to ensure we can still properly reconstruct the Run 2 high beta star data. I think someone with a higher paygrade than me should decide whether this is something CMS wants to keep being able to do, but I don't see why that would not be the case. Having said that, it seems to me that the right course of action is to add checks in the CPPF / RPC code so that it does not crash on bad input data. Such checks are customarily included in DPG / POG code to avoid failures at run time.
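
A minimal sketch of the kind of check meant here, assuming the field layout and limits shown (they are made up for the example): validate the value before packing it and report failure to the caller instead of throwing, so a corrupted word can be skipped. `PackedLinkId` and `try_set_field` are hypothetical names; this is not the actual `RPCAMCLink::bf_set` implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical packed identifier; not the CMSSW RPCAMCLink class.
struct PackedLinkId {
  std::uint32_t bits = 0;
};

// Store `value` in the field that starts at bit `pos` and is `width` bits wide.
// Returns false and leaves `id` untouched if the layout is invalid or if
// `value` lies outside [min, max]; it never throws.
inline bool try_set_field(PackedLinkId& id, int pos, int width, int min, int max, int value) {
  if (pos < 0 || width <= 0 || pos + width > 32)
    return false;
  if (value < min || value > max)
    return false;
  const std::uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
  // Widen before subtracting so extreme min/value pairs cannot overflow int.
  const std::int64_t offset = static_cast<std::int64_t>(value) - min;
  if (offset > static_cast<std::int64_t>(mask))
    return false;  // in range, but does not fit in the field
  const std::uint32_t encoded = static_cast<std::uint32_t>(offset);
  id.bits = (id.bits & ~(mask << pos)) | (encoded << pos);
  return true;
}

// Usage: a value of 100 does not fit a field limited to [0, 63] and is
// rejected, while the previously stored value is preserved.
inline void example_range_check() {
  PackedLinkId id;
  assert(try_set_field(id, 0, 6, 0, 63, 5));
  assert(!try_set_field(id, 0, 6, 0, 63, 100));
  assert(id.bits == 5u);
}
```

An unpacker built on such a check could count the rejected words and still put a (possibly empty) digi collection into the event, which would also keep the downstream DQM modules running.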

@mileva

In fact, I tried to explain the reason for the crash with this workflow: the CPPF data were corrupted in the input run, so the cause of the crash is not the proposed pull request but the data.

yes, I understand, but changing the input data is NOT an option, unless we want to give up testing the high beta* reco…