tiflash: TiFlash crash when preHandleSSTsToDTFiles in CI fullstack test
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
Set TiFlash replica. https://ci.pingcap.net/blue/rest/organizations/jenkins/pipelines/tiflash-ghpr-integration-tests/runs/10529/nodes/182/steps/305/log/?start=0
2. What did you expect to see? (Required)
Don’t crash.
3. What did you see instead (Required)
TiFlash crashed.
[2023-01-10T03:20:36.216Z] [2023/01/10 11:10:23.602 +08:00] [ERROR] [Exception.cpp:89] ["Code: 49, e.displayText() = DB::Exception: RWLock::getLock(): RWLock is already locked in exclusive mode: physical_table_id=-1: (while preHandleSnapshot region_id=2650, index=7, term=6), e.what() = DB::Exception, Stack trace:\n\n\n 0x37f675e\tStackTrace::StackTrace() [tiflash+58681182]\n 0x37e5fb2\tDB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) [tiflash+58613682]\n 0xc3f0480\tDB::RWLock::getLock(DB::RWLock::Type, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> > const&) [tiflash+205456512]\n 0xc3ed1d5\tDB::IStorage::tryLockTimed(std::__1::shared_ptr<DB::RWLock> const&, DB::RWLock::Type, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> > const&) const [tiflash+205443541]\n 0xc3ed7ae\tDB::IStorage::lockStructureForShare(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> > const&) [tiflash+205445038]\n 0xc897ff0\tDB::AtomicGetStorageSchema(std::__1::shared_ptr<DB::Region> const&, DB::TMTContext&)::$_6::operator()(bool) const [tiflash+210337776]\n 0xc897ac8\tDB::AtomicGetStorageSchema(std::__1::shared_ptr<DB::Region> const&, DB::TMTContext&) [tiflash+210336456]\n 0xc7eb592\tDB::KVStore::preHandleSSTsToDTFiles(std::__1::shared_ptr<DB::Region>, DB::SSTViewVec, unsigned long, unsigned long, DB::DM::FileConvertJobType, DB::TMTContext&) [tiflash+209630610]\n 0xc7eb108\tDB::KVStore::preHandleSnapshotToFiles(std::__1::shared_ptr<DB::Region>, DB::SSTViewVec, unsigned long, unsigned long, DB::TMTContext&) [tiflash+209629448]\n 0xc8a6484\tPreHandleSnapshot [tiflash+210396292]\n 
0x7f04d4432930\tengine_store_ffi::_$LT$impl$u20$engine_store_ffi..interfaces..root..DB..EngineStoreServerHelper$GT$::pre_handle_snapshot::h13dfe909d7f6a3ea [libtiflash_proxy.so+76658992]\n 0x7f04d444508d\tengine_store_ffi::observer::pre_handle_snapshot_impl::hccb27a62a7e3c70f [libtiflash_proxy.so+76734605]\n 0x7f04d445a063\t_$LT$engine_store_ffi..observer..TiFlashObserver$u20$as$u20$raftstore..coprocessor..ApplySnapshotObserver$GT$::pre_apply_snapshot::_$u7b$$u7b$closure$u7d$$u7d$::h0f781b1b777e36fa [libtiflash_proxy.so+76820579]\n 0x7f04d44624e1\t_$LT$core..future..from_generator..GenFuture$LT$T$GT$$u20$as$u20$core..future..future..Future$GT$::poll::h7b86279f362c63a3 [libtiflash_proxy.so+76854497]\n 0x7f04d4438b79\tyatp::task::future::RawTask$LT$F$GT$::poll::hef947552eb87fdf3 [libtiflash_proxy.so+76684153]\n 0x7f04d7141ede\tyatp::task::future::TaskCell::poll::h7c797921d1a996a2 [libtiflash_proxy.so+123907806]\n 0x7f04d7142c94\t_$LT$yatp..task..future..Runner$u20$as$u20$yatp..pool..runner..Runner$GT$::handle::h4d1272e83351072c [libtiflash_proxy.so+123911316]\n 0x7f04d7141d43\tyatp::pool::worker::WorkerThread$LT$T$C$R$GT$::run::h7506ebb2efa488d0 [libtiflash_proxy.so+123907395]\n 0x7f04d7128ccd\tyatp::pool::builder::LazyBuilder$LT$T$GT$::build::_$u7b$$u7b$closure$u7d$$u7d$::h4ac5170a13db1e09 [libtiflash_proxy.so+123804877]\n 0x7f04d711c64c\tstd::sys_common::backtrace::__rust_begin_short_backtrace::hcf649c42df4fed43 [libtiflash_proxy.so+123754060]\n 0x7f04d714d1ed\tstd::thread::Builder::spawn_unchecked_::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h64a73749dd71c9f8 [libtiflash_proxy.so+123953645]\n 0x7f04d711a641\t_$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h85b1ff4258c2deaf [libtiflash_proxy.so+123745857]\n 0x7f04d714d3b6\tstd::panicking::try::do_call::he09d687beb31ad5b [libtiflash_proxy.so+123954102]\n 0x7f04d715fadb\t__rust_try [libtiflash_proxy.so+124029659]\n 
0x7f04d714d2df\tstd::panicking::try::hec245749eb3c66e0 [libtiflash_proxy.so+123953887]\n 0x7f04d714ae51\tstd::panic::catch_unwind::h6f4be4704c94ef3b [libtiflash_proxy.so+123944529]\n 0x7f04d714cfba\tstd::thread::Builder::spawn_unchecked_::_$u7b$$u7b$closure$u7d$$u7d$::h2c29f8e8e8dfb511 [libtiflash_proxy.so+123953082]\n 0x7f04d714e627\tcore::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h5eb0e81e2f12c3b9 [libtiflash_proxy.so+123958823]\n 0x7f04d84fc873\tstd::sys::unix::thread::Thread::new::thread_start::hd2791a9cabec1fda [libtiflash_proxy.so+144595059]\n \t/rustc/96ddd32c4bfb1d78f0cd03eb068b1710a8cebeef/library/std/src/sys/unix/thread.rs:108\n 0x7f04cf3f8ea5\tstart_thread [libpthread.so.0+32421]\n 0x7f04ced0796d\t__clone [libc.so.6+1042797]"] [source="DB::RawCppPtr DB::PreHandleSnapshot(DB::EngineStoreServerWrap *, DB::BaseBuffView, uint64_t, DB::SSTViewVec, uint64_t, uint64_t)"] [thread_id=50]
The thread_names created by TiFlash Proxy are RaftStoreProxy.
4. What is your TiFlash version? (Required)
f9167f36641360530425459c2e0e0b4ad4750841 in master branch
About this issue
- State: closed
- Created a year ago
- Comments: 17 (17 by maintainers)
Commits related to this issue
- fix pthread_setname_np (#6630) (#7382) close pingcap/tiflash#6456, close pingcap/tiflash#6616 — committed to pingcap/tiflash by ti-chi-bot a year ago
Release:
Debug:
The pthread_setname_np symbol is a local text symbol in the release version because the release version is built with the -fvisibility=hidden flag.
Note:
6.1 proxy + 6.1 tiflash works fine since
6.5 proxy + 6.1 tiflash does not work since
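For readers unfamiliar with the visibility issue described above, here is a minimal sketch of how a single function can be kept visible when the rest of the code is compiled with -fvisibility=hidden. It is not the actual change from the linked commit. The names SKETCH_EXPORT, sketch_set_thread_name and sketch_last_thread_name are hypothetical and exist only for illustration. The sketch assumes GCC or Clang, and it assumes the goal is for code in another shared object (for example the proxy) to be able to resolve the function.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical macro: forces default visibility for one symbol even when the
// translation unit is compiled with -fvisibility=hidden (GCC/Clang attribute).
#define SKETCH_EXPORT __attribute__((visibility("default")))

namespace
{
// Per-thread storage for the most recently recorded name (illustration only).
thread_local std::string last_thread_name;
}

// Exported: stays in the dynamic symbol table under -fvisibility=hidden,
// so callers in other shared objects can bind to it.
extern "C" SKETCH_EXPORT int sketch_set_thread_name(const char * name)
{
    if (name == nullptr)
        return -1; // reject a null name instead of dereferencing it
    last_thread_name = name;
    return 0;
}

// Not annotated: under -fvisibility=hidden this symbol becomes hidden and is
// only usable from inside the same binary or shared object.
extern "C" const char * sketch_last_thread_name()
{
    return last_thread_name.c_str();
}
```

If this file is built as a shared library with -fvisibility=hidden, nm -D on the result should list sketch_set_thread_name but not sketch_last_thread_name, which mirrors the difference between the release and debug builds described above.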