cmssw: HLT Farm crashes in run 378940
Reporting a large number of GPU-related HLT crashes last night (elog)
- Related to illegal memory access
- Not fully understood as HLT menus were unchanged with respect to the previous runs
Here is a recipe to reproduce the crashes (tested with CMSSW_14_0_4 on lxplus8-gpu):
cmsrel CMSSW_14_0_4
cd CMSSW_14_0_4/src
cmsenv
https_proxy=http://cmsproxy.cms:3128 hltConfigFromDB --runNumber 378940 > hlt_run378940.py
cat <<@EOF >> hlt_run378940.py
from EventFilter.Utilities.EvFDaqDirector_cfi import EvFDaqDirector as _EvFDaqDirector
process.EvFDaqDirector = _EvFDaqDirector.clone(
    buBaseDir = '/eos/cms/store/group/phys_muon/wjun/error_stream',
    runNumber = 378940
)
from EventFilter.Utilities.FedRawDataInputSource_cfi import source as _source
process.source = _source.clone(
    fileListMode = True,
    fileNames = (
        '/eos/cms/store/group/phys_muon/wjun/error_stream/run378940/run378940_ls0021_index000036_fu-c2b02-31-01_pid1363776.raw',
    )
)
process.options.wantSummary = True
process.options.numberOfThreads = 1
process.options.numberOfStreams = 0
@EOF
mkdir run378940
cmsRun hlt_run378940.py &> crash_run378940.log
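Once the job has run, a quick way to check whether the crash was reproduced is to scan crash_run378940.log for the error text. The following is a minimal sketch, not part of the original recipe: the helper name find_crash_lines and the search patterns are assumptions. "illegal memory access" is the wording the CUDA runtime normally uses for this error, but the exact lines in the CMSSW log may differ.

```python
import re
from pathlib import Path

# Assumed markers (hypothetical, adjust to the actual log content):
# the first matches the usual CUDA runtime wording for an illegal address,
# the second catches any cudaError* identifier printed by the framework.
DEFAULT_PATTERNS = (
    r"illegal memory access",
    r"cudaError\w*",
)


def find_crash_lines(log_text, patterns=DEFAULT_PATTERNS):
    """Return (line_number, line) pairs that match any pattern, ignoring case."""
    regex = re.compile("|".join(f"(?:{p})" for p in patterns), re.IGNORECASE)
    return [
        (number, line.rstrip())
        for number, line in enumerate(log_text.splitlines(), start=1)
        if regex.search(line)
    ]


if __name__ == "__main__":
    # Log file name taken from the cmsRun command in the recipe.
    log_path = Path("crash_run378940.log")
    if log_path.exists():
        for number, line in find_crash_lines(log_path.read_text(errors="replace")):
            print(f"{number}: {line}")
    else:
        print(f"{log_path} not found; run the recipe first")
```

An empty result does not prove the job succeeded; it only means none of the assumed patterns appeared in the log.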
About this issue
- State: open
- Created 3 months ago
- Comments: 24 (24 by maintainers)
type pf
assign hlt, heterogeneous