cmssw: HLT crash in run-374803 (`HcalUnpacker::unpackUTCA`)

In run-374803 (HI collisions, release CMSSW_13_2_5_patch1), DAQ reported a CMSSW crash at HLT that, to my knowledge, had not been seen before [link to HLT elog]. A possibly relevant piece of the stack trace is in [1]. Once the corresponding error-stream files become available, we’ll attempt to reproduce the crash offline.

FYI: @cms-sw/hlt-l2 @fwyzard @mzarucki @trtomei @trocino

[1]

A fatal system signal has occurred: segmentation violation
The following is the call stack containing the origin of the signal.

Thu Oct 5 19:38:28 CEST 2023
Thread 19 (Thread 0x7f8268ffe700 (LWP 1202271) "cmsRun"):
#0 0x00007f8345c61a71 in poll () from /lib64/libc.so.6
#1 0x00007f833baf8d2f in full_read.constprop () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/pluginFWCoreServicesPlugins.so
#2 0x00007f833bac075c in edm::service::InitRootHandlers::stacktraceFromThread() () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/pluginFWCoreServicesPlugins.so
#3 0x00007f833bac11bb in sig_dostack_then_abort () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/pluginFWCoreServicesPlugins.so
#4 <signal handler called>
#5 0x00007f82eefd290a in HcalUHTRData::const_iterator::operator++() () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/libEventFilterHcalRawToDigi.so
#6 0x00007f82eefd7eac in HcalUnpacker::unpackUTCA(FEDRawData const&, HcalElectronicsMap const&, HcalUnpacker::Collections&, HcalUnpackerReport&, bool) () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/libEventFilterHcalRawToDigi.so
#7 0x00007f82ca8f3709 in HcalRawToDigi::produce(edm::Event&, edm::EventSetup const&) () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/pluginEventFilterHcalRawToDigiPlugins.so
#8 0x00007f834868b3ed in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/libFWCoreFramework.so
#9 0x00007f8348671b52 in edm::WorkerT<edm::stream::EDProducerAdaptorBase>::implDo(edm::EventTransitionInfo const&, edm::ModuleCallingContext const*) () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/libFWCoreFramework.so
#10 0x00007f83485fc5aa in std::__exception_ptr::exception_ptr edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits(std::__exception_ptr::exception_ptr, edm::OccurrenceTraits::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits::Context const*) () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/libFWCoreFramework.so
#11 0x00007f83485fca58 in edm::Worker::RunModuleTask<edm::OccurrenceTraits::execute() () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/libFWCoreFramework.so
#12 0x00007f834856ea8f in tbb::detail::d1::function_task<edm::WaitingTaskHolder::doneWaiting(std::__exception_ptr::exception_ptr)::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/libFWCoreFramework.so
#13 0x00007f8346def2e4 in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter> (t=0x7f825fb15000, waiter=..., this=0x7f8340b7e200) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_13_2_0_pre3-el8_amd64_gcc11/build/CMSSW_13_2_0_pre3-build/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/task_dispatcher.h:322
#14 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter> (t=0x0, waiter=..., this=0x7f8340b7e200) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_13_2_0_pre3-el8_amd64_gcc11/build/CMSSW_13_2_0_pre3-build/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/task_dispatcher.h:458
#15 tbb::detail::r1::arena::process (tls=..., this=<optimized out>) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_13_2_0_pre3-el8_amd64_gcc11/build/CMSSW_13_2_0_pre3-build/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/arena.cpp:137
#16 tbb::detail::r1::market::process (this=<optimized out>, j=...) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_13_2_0_pre3-el8_amd64_gcc11/build/CMSSW_13_2_0_pre3-build/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/market.cpp:599
#17 0x00007f8346df14a6 in tbb::detail::r1::rml::private_worker::run (this=0x7f8340b73a80) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_13_2_0_pre3-el8_amd64_gcc11/build/CMSSW_13_2_0_pre3-build/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/private_server.cpp:271
#18 tbb::detail::r1::rml::private_worker::thread_routine (arg=0x7f8340b73a80) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_13_2_0_pre3-el8_amd64_gcc11/build/CMSSW_13_2_0_pre3-build/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/private_server.cpp:221
#19 0x00007f8345f3f17a in start_thread () from /lib64/libpthread.so.0
#20 0x00007f8345c6cdf3 in clone () from /lib64/libc.so.6
[ message truncated - showing only crashed thread ] 
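
The trace above places the segmentation violation in an iterator increment (`HcalUHTRData::const_iterator::operator++`) called while unpacking raw data. As a general illustration of that class of failure only, here is a minimal sketch of a word cursor that cannot advance past the end of its buffer, so a truncated or malformed payload ends the loop instead of reading out of bounds. This is not the CMSSW code and not the change made for this issue: the names `CheckedWordCursor`, `countChannelBlocks` and `kHeaderFlag`, and the word layout (bit 15 marks the start of a channel block), are assumptions made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout, for illustration only: a 16-bit word with bit 15 set
// starts a new channel block; the words that follow are samples of that block.
constexpr std::uint16_t kHeaderFlag = 0x8000;

inline bool isHeader(std::uint16_t w) { return (w & kHeaderFlag) != 0; }

// Cursor over a buffer of 16-bit words that never moves beyond `end`.
class CheckedWordCursor {
public:
  CheckedWordCursor(const std::uint16_t* begin, const std::uint16_t* end)
      : cur_(begin), end_(end) {}

  bool atEnd() const { return cur_ == end_; }

  // Current word; callers must check atEnd() first.
  std::uint16_t word() const {
    assert(cur_ != end_);
    return *cur_;
  }

  // Move to the next header word, or to the end if there is none.
  // Every dereference is guarded by a comparison against end_.
  void skipToNextHeader() {
    if (cur_ != end_)
      ++cur_;
    while (cur_ != end_ && !isHeader(*cur_))
      ++cur_;
  }

private:
  const std::uint16_t* cur_;
  const std::uint16_t* end_;
};

// Count channel blocks in a payload; empty or header-less payloads give 0.
std::size_t countChannelBlocks(const std::vector<std::uint16_t>& payload) {
  const std::uint16_t* begin = payload.data();
  CheckedWordCursor cursor(begin, begin + payload.size());

  // Position the cursor on the first header, if any.
  if (!cursor.atEnd() && !isHeader(cursor.word()))
    cursor.skipToNextHeader();

  std::size_t blocks = 0;
  while (!cursor.atEnd()) {
    ++blocks;
    cursor.skipToNextHeader();
  }
  return blocks;
}
```

The point of the design is that the end check lives inside the advance step itself, so the calling loop only has to test `atEnd()` and cannot be driven past the buffer by unexpected data.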

About this issue

  • State: closed
  • Created 9 months ago
  • Comments: 21 (16 by maintainers)

Most upvoted comments

For the record, there was yet another occurrence in run-375055 (see link to HLT e-log). The relevant part of the stack trace appears to be similar [1]. For completeness, I attach the full stack trace from F3Mon: f3mon_logtable_2023-10-13T06 47 59.369Z.txt

[1]

A fatal system signal has occurred: segmentation violation
The following is the call stack containing the origin of the signal.

Thu Oct 12 18:57:35 CEST 2023
Thread 12 (Thread 0x7f7efbdff700 (LWP 3485309) "cmsRun"):
#0 0x00007f7f7ba0ea71 in poll () from /lib64/libc.so.6
#1 0x00007f7f7096cd2f in full_read.constprop () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/pluginFWCoreServicesPlugins.so
#2 0x00007f7f7093475c in edm::service::InitRootHandlers::stacktraceFromThread() () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/pluginFWCoreServicesPlugins.so
#3 0x00007f7f709351bb in sig_dostack_then_abort () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/pluginFWCoreServicesPlugins.so
#4 <signal handler called>
#5 0x00007f7f2553b90a in HcalUHTRData::const_iterator::operator++() () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/libEventFilterHcalRawToDigi.so
#6 0x00007f7f25540ec9 in HcalUnpacker::unpackUTCA(FEDRawData const&, HcalElectronicsMap const&, HcalUnpacker::Collections&, HcalUnpackerReport&, bool) () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/libEventFilterHcalRawToDigi.so
#7 0x00007f7f00e5c709 in HcalRawToDigi::produce(edm::Event&, edm::EventSetup const&) () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/pluginEventFilterHcalRawToDigiPlugins.so
#8 0x00007f7f7e4383ed in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/libFWCoreFramework.so
#9 0x00007f7f7e41eb52 in edm::WorkerT<edm::stream::EDProducerAdaptorBase>::implDo(edm::EventTransitionInfo const&, edm::ModuleCallingContext const*) () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/libFWCoreFramework.so
#10 0x00007f7f7e3a95aa in std::__exception_ptr::exception_ptr edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits(std::__exception_ptr::exception_ptr, edm::OccurrenceTraits::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits::Context const*) () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/libFWCoreFramework.so
#11 0x00007f7f7e3a9a58 in edm::Worker::RunModuleTask<edm::OccurrenceTraits::execute() () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/libFWCoreFramework.so
#12 0x00007f7f7e31ba8f in tbb::detail::d1::function_task<edm::WaitingTaskHolder::doneWaiting(std::__exception_ptr::exception_ptr)::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /opt/offline/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_5/lib/el8_amd64_gcc11/libFWCoreFramework.so
#13 0x00007f7f7cb9c2e4 in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter> (t=0x7f7e71576900, waiter=..., this=0x7f7f7692f700) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_13_2_0_pre3-el8_amd64_gcc11/build/CMSSW_13_2_0_pre3-build/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/task_dispatcher.h:322
#14 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter> (t=0x0, waiter=..., this=0x7f7f7692f700) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_13_2_0_pre3-el8_amd64_gcc11/build/CMSSW_13_2_0_pre3-build/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/task_dispatcher.h:458
#15 tbb::detail::r1::arena::process (tls=..., this=<optimized out>) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_13_2_0_pre3-el8_amd64_gcc11/build/CMSSW_13_2_0_pre3-build/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/arena.cpp:137
#16 tbb::detail::r1::market::process (this=<optimized out>, j=...) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_13_2_0_pre3-el8_amd64_gcc11/build/CMSSW_13_2_0_pre3-build/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/market.cpp:599
#17 0x00007f7f7cb9e4a6 in tbb::detail::r1::rml::private_worker::run (this=0x7f7f76923e00) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_13_2_0_pre3-el8_amd64_gcc11/build/CMSSW_13_2_0_pre3-build/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/private_server.cpp:271
#18 tbb::detail::r1::rml::private_worker::thread_routine (arg=0x7f7f76923e00) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_13_2_0_pre3-el8_amd64_gcc11/build/CMSSW_13_2_0_pre3-build/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/private_server.cpp:221
#19 0x00007f7f7bcec17a in start_thread () from /lib64/libpthread.so.0
#20 0x00007f7f7ba19df3 in clone () from /lib64/libc.so.6
[ message truncated - showing only crashed thread ] 

PRs addressing the issue (with amendments suggested by Jeremy Mans) have been submitted: https://github.com/cms-sw/cmssw/pull/43011 (master) and https://github.com/cms-sw/cmssw/pull/43012 (132X).

type hcal