cmssw: Segfault in gbl::GblTrajectory::prepare()
[Find reproducible example below] Stack trace from CMSSW_14_0_0 caused by gbl::GblTrajectory::prepare(), executed on EL8 (el8_amd64_gcc12):
Thread 2 (Thread 0x145ebacb4640 (LWP 224) "cmsRun"):
#0 0x0000145ee311830f in wait4 () from /lib64/libc.so.6
#1 0x0000145ede9f6e17 in edm::service::cmssw_stacktrace_fork() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_0/lib/el8_amd64_gcc12/pluginFWCoreServicesPlugins.so
#2 0x0000145ede9f9cea in edm::service::InitRootHandlers::stacktraceHelperThread() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_0/lib/el8_amd64_gcc12/pluginFWCoreServicesPlugins.so
#3 0x0000145ee34d8a73 in std::execute_native_thread_routine (__p=0x145ed2eaf290) at ../../../../../libstdc++-v3/src/c++11/thread.cc:82
#4 0x0000145ee309f802 in start_thread () from /lib64/libc.so.6
#5 0x0000145ee303f450 in clone3 () from /lib64/libc.so.6
Thread 1 (Thread 0x145ee44bb640 (LWP 214) "cmsRun"):
#0 0x0000145ee31426ff in poll () from /lib64/libc.so.6
#1 0x0000145edea456af in full_read.constprop () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_0/lib/el8_amd64_gcc12/pluginFWCoreServicesPlugins.so
#2 0x0000145ede9f9dbc in edm::service::InitRootHandlers::stacktraceFromThread() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_0/lib/el8_amd64_gcc12/pluginFWCoreServicesPlugins.so
#3 0x0000145ede9fa720 in sig_dostack_then_abort () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_0/lib/el8_amd64_gcc12/pluginFWCoreServicesPlugins.so
#4 <signal handler called>
#5 atomic_load_p (mo=atomic_memory_order_relaxed, a=0x846f8) at include/jemalloc/internal/atomic.h:83
#6 rtree_leaf_elm_bits_read (dependent=true, elm=0x846f8, rtree=<optimized out>, tsdn=0x145ee44b5738) at include/jemalloc/internal/rtree.h:182
#7 rtree_leaf_elm_read (dependent=true, elm=0x846f8, rtree=<optimized out>, tsdn=0x145ee44b5738) at include/jemalloc/internal/rtree.h:242
#8 rtree_metadata_read (rtree=<optimized out>, key=4611594290014846845, rtree_ctx=0x145ee44b58e8, tsdn=0x145ee44b5738) at include/jemalloc/internal/rtree.h:446
#9 emap_alloc_ctx_lookup (alloc_ctx=<synthetic pointer>, ptr=0x3fffac92d08dff7d, emap=<optimized out>, tsdn=0x145ee44b5738) at include/jemalloc/internal/emap.h:238
#10 ifree (slow_path=false, tcache=0x145ee44b5a90, ptr=0x3fffac92d08dff7d, tsd=0x145ee44b5738) at src/jemalloc.c:2877
#11 je_free_default (ptr=0x3fffac92d08dff7d) at src/jemalloc.c:3014
#12 0x0000145e89d48b8a in gbl::GblTrajectory::prepare() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_0/external/el8_amd64_gcc12/lib/libGBL.so.2.4
#13 0x0000145e89d4b1c0 in gbl::GblTrajectory::construct() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_0/external/el8_amd64_gcc12/lib/libGBL.so.2.4
#14 0x0000145e834451fe in MillePedeAlignmentAlgorithm::addReferenceTrajectory(edm::EventSetup const&, AlignmentAlgorithmBase::EventInfo const&, ReferenceCountingPointer<ReferenceTrajectoryBase> const&) () from /afs/cern.ch/cms/CAF/CMSALCA/ALCA_TRACKERALIGN/MP/MPproduction/CMSSW_14_0_0/lib/el8_amd64_gcc12/pluginAlignmentMillePedeAlignmentAlgorithmAuto.so
#15 0x0000145e83445850 in MillePedeAlignmentAlgorithm::run(edm::EventSetup const&, AlignmentAlgorithmBase::EventInfo const&) () from /afs/cern.ch/cms/CAF/CMSALCA/ALCA_TRACKERALIGN/MP/MPproduction/CMSSW_14_0_0/lib/el8_amd64_gcc12/pluginAlignmentMillePedeAlignmentAlgorithmAuto.so
#16 0x0000145e8a58947f in AlignmentProducerBase::processEvent(edm::Event const&, edm::EventSetup const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_0/lib/el8_amd64_gcc12/libAlignmentCommonAlignmentProducer.so
#17 0x0000145ee543da9e in edm::one::EDProducerBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_0/lib/el8_amd64_gcc12/libFWCoreFramework.so
#18 0x0000145ee542820e in edm::WorkerT<edm::one::EDProducerBase>::implDo(edm::EventTransitionInfo const&, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_0/lib/el8_amd64_gcc12/libFWCoreFramework.so
#19 0x0000145ee53ba669 in std::__exception_ptr::exception_ptr edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(std::__exception_ptr::exception_ptr, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_0/lib/el8_amd64_gcc12/libFWCoreFramework.so
#20 0x0000145ee53bb73f in edm::SerialTaskQueue::QueuedTask<edm::SerialTaskQueueChain::push<edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute()::{lambda()#1}&>(tbb::detail::d1::task_group&, edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute()::{lambda()#1}&)::{lambda()#1}>::execute() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_0/lib/el8_amd64_gcc12/libFWCoreFramework.so
#21 0x0000145ee50e51d0 in tbb::detail::d1::function_task<edm::SerialTaskQueue::spawn(edm::SerialTaskQueue::TaskBase&)::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_0/lib/el8_amd64_gcc12/libFWCoreConcurrency.so
#22 0x0000145ee45da281 in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::external_waiter> (waiter=..., t=<optimized out>, this=0x145ee19b3e00) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_0_0_pre2_SKYLAKEAVX512-el8_amd64_gcc12/build/CMSSW_14_0_0_pre2_SKYLAKEAVX512-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-5109e373398eaf79a8268d6a86a2f6e2/tbb-v2021.9.0/src/tbb/task_dispatcher.h:322
#23 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::external_waiter> (waiter=..., t=<optimized out>, this=0x145ee19b3e00) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_0_0_pre2_SKYLAKEAVX512-el8_amd64_gcc12/build/CMSSW_14_0_0_pre2_SKYLAKEAVX512-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-5109e373398eaf79a8268d6a86a2f6e2/tbb-v2021.9.0/src/tbb/task_dispatcher.h:458
#24 tbb::detail::r1::task_dispatcher::execute_and_wait (t=<optimized out>, wait_ctx=..., w_ctx=...) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_0_0_pre2_SKYLAKEAVX512-el8_amd64_gcc12/build/CMSSW_14_0_0_pre2_SKYLAKEAVX512-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-5109e373398eaf79a8268d6a86a2f6e2/tbb-v2021.9.0/src/tbb/task_dispatcher.cpp:168
#25 0x0000145ee533dceb in edm::FinalWaitingTask::wait() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_0/lib/el8_amd64_gcc12/libFWCoreFramework.so
#26 0x0000145ee534765a in edm::EventProcessor::processRuns() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_0/lib/el8_amd64_gcc12/libFWCoreFramework.so
#27 0x0000145ee5347bb1 in edm::EventProcessor::runToCompletion() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_0/lib/el8_amd64_gcc12/libFWCoreFramework.so
#28 0x00000000004074ef in tbb::detail::d1::task_arena_function<main::{lambda()#1}::operator()() const::{lambda()#1}, void>::operator()() const ()
#29 0x0000145ee45c69ad in tbb::detail::r1::task_arena_impl::execute (ta=..., d=...) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_0_0_pre2_SKYLAKEAVX512-el8_amd64_gcc12/build/CMSSW_14_0_0_pre2_SKYLAKEAVX512-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-5109e373398eaf79a8268d6a86a2f6e2/tbb-v2021.9.0/src/tbb/arena.cpp:688
#30 0x0000000000408ed2 in main::{lambda()#1}::operator()() const ()
#31 0x000000000040517c in main ()
Module: AlignmentProducerAsAnalyzer:AlignmentProducer (crashed)
######### TO REPRODUCE #########
cd /afs/cern.ch/cms/CAF/CMSALCA/ALCA_TRACKERALIGN/MP/MPproduction/CMSSW_14_0_0
cmsenv
cd /afs/cern.ch/work/s/sroychow/public/TkDPG/segfaultmille/scripts
cmsRun mille_failing.py
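A side note on reading frames #8 to #11 of the trace: the pointer handed to `free()` is `0x3fffac92d08dff7d`, which frame #8 prints again in decimal as the rtree key. The sketch below checks that correspondence, tests whether the value could be a user-space address (assuming 48-bit canonical addressing, which is an assumption about the host and not something stated in the report), and reinterprets the same 8 bytes as an IEEE 754 double. It is only a heuristic for triage and does not identify which object inside `gbl::GblTrajectory::prepare()` was affected.

```python
import struct

# Value passed to free() in frames #9-#11 of the stack trace.
ptr = 0x3FFFAC92D08DFF7D

# Frame #8 shows the same value in decimal as the rtree lookup key.
assert ptr == 4611594290014846845

# Assumption: 48-bit canonical addressing, where user-space pointers
# have every bit from bit 47 upwards clear.
looks_like_user_pointer = (ptr >> 47) == 0

# Reinterpret the same 8 bytes as an IEEE 754 double-precision value.
as_double = struct.unpack("<d", struct.pack("<Q", ptr))[0]

print(f"plausible user-space pointer: {looks_like_user_pointer}")
print(f"same bits read as a double:   {as_double!r}")
```

If the value is not a plausible address but decodes to an ordinary-looking floating-point number, one possible (unconfirmed) reading is that a pointer field was overwritten with numeric data before being freed. An ASAN build is the appropriate way to confirm or rule that out.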
About this issue
- State: closed
- Created 4 months ago
- Comments: 46 (45 by maintainers)
@cmsbuild, please close
+alca
+core
@smuzaffar I ran the test now with CMSSW_14_1_X_2024-03-11-2300 and it’s successful. Thanks for the dist updates.
https://github.com/cms-sw/cmsdist/pull/9053 contains the mentioned update (not sure it will compile fine though; an earlier update was failing checks)
assign core
Could you run the job in CMSSW_14_1_ASAN_X_2024-02-26-2300 and report here what ASAN reports?
(I’d also suggest using “code block” formatting for the output, i.e. start and end the block with three backticks (```). I find the regular text formatting difficult to read for this kind of log.)
assign alca