bluebird: Segmentation Fault with BlueBird 3.6/3.7 and Node 10
TL;DR – We’re seeing an intermittent segmentation fault with BlueBird in one of our tap based test suites, and have a reproduction over here.
- What version of bluebird is the issue happening on?
3.6 and 3.7
- What platform and version? (For example Node.js 0.12 or Google Chrome 32)
NodeJS version 10 on MacOS 10 – it does not appear to happen on Node 12. Other plaforms not tested.
- Did this issue happen with earlier version of bluebird?
It does not happen with BlueBird 3.5.
Details
One of the tests in our test suite has been failing randomly with a segmentation fault recently. We found some time to track it down, and it seems like it started happening with BlueBird 3.6 (maybe the async_hooks stuff?). The segmentation fault doesn’t happen on every run – if you checkout this repository we’ve setup a small “run until fail” reproduction case. The code in the repro is significantly stripped down form the code in our actual test suite.
It’s very “spooky action at a distance” style bug – sometimes the test runs fine, sometimes it fails with something like the following
TAP version 13
# Subtest: bluebird
# Subtest: new Promise() throw
1..0
ok 1 - new Promise() throw # time=4.55ms
./run-until-fail.sh: line 3: 36728 Segmentation fault: 11 $@
We also captured the crash in lldb with a debug build of node 10.
(lldb) r methods.tap.js
Process 36969 launched: '/Users/astorm/bin/node-debug' (x86_64)
TAP version 13
# Subtest: bluebird
# Subtest: new Promise() throw
1..0
ok 1 - new Promise() throw # time=63.303ms
Process 36969 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
frame #0: 0x0000000101cb25de node-debug`heap_insert(heap=0x00000001035094d8, newnode=0x0000000105c11498, less_than=(node-debug`timer_less_than at timer.c:39)) at heap-inl.h:140
137
138 /* Insert the new node. */
139 newnode->parent = *parent;
-> 140 *child = newnode;
141 heap->nelts += 1;
142
143 /* Walk up the tree and check at each node if the heap property holds.
Target 0: (node-debug) stopped.
Full backtrace from same follows
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
* frame #0: 0x0000000101cb25de node-debug`heap_insert(heap=0x00000001035094d8, newnode=0x0000000105c0ba48, less_than=(node-debug`timer_less_than at timer.c:39)) at heap-inl.h:140
frame #1: 0x0000000101cb23b4 node-debug`uv_timer_start(handle=0x0000000105c0b9e0, cb=(node-debug`node::(anonymous namespace)::TimerWrap::OnTimeout(uv_timer_s*) at timer_wrap.cc:134), timeout=1, repeat=0) at timer.c:93
frame #2: 0x00000001002f2af1 node-debug`node::(anonymous namespace)::TimerWrap::Start(args=0x00007ffeefbfc670) at timer_wrap.cc:120
frame #3: 0x00000001007f4e21 node-debug`v8::internal::FunctionCallbackArguments::Call(this=0x00007ffeefbfc7e0, handler=0x00003c3547afc889) at api-arguments-inl.h:94
frame #4: 0x00000001007f2aae node-debug`v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(isolate=0x0000000106004c00, function=Handle<v8::internal::HeapObject> @ 0x00007ffeefbfc8f0, new_target=Handle<v8::internal::HeapObject> @ 0x00007ffeefbfc8e8, fun_data=Handle<v8::internal::FunctionTemplateInfo> @ 0x00007ffeefbfc8e0, receiver=Handle<v8::internal::Object> @ 0x00007ffeefbfc8d8, args=BuiltinArguments @ 0x00007ffeefbfc910) at builtins-api.cc:109
frame #5: 0x00000001007f111a node-debug`v8::internal::Builtin_Impl_HandleApiCall(args=BuiltinArguments @ 0x00007ffeefbfca40, isolate=0x0000000106004c00) at builtins-api.cc:139
frame #6: 0x00000001007f0c4d node-debug`v8::internal::Builtin_HandleApiCall(args_length=6, args_object=0x00007ffeefbfcb28, isolate=0x0000000106004c00) at builtins-api.cc:127
frame #7: 0x000022036e6c1d64
frame #8: 0x000022036e5a0ab6
frame #9: 0x000022036e59336f
frame #10: 0x000022036e719b42
frame #11: 0x000022036e5a0ab6
frame #12: 0x000022036e58f303
frame #13: 0x000022036e5a0ab6
frame #14: 0x000022036e5a0ab6
frame #15: 0x000022036e58f303
frame #16: 0x000022036e5a0ab6
frame #17: 0x000022036e5a0ab6
frame #18: 0x000022036e5a0ab6
frame #19: 0x000022036e5a0ab6
frame #20: 0x000022036e5a0ab6
frame #21: 0x000022036e5a0ab6
frame #22: 0x000022036e58f303
frame #23: 0x000022036e5a0ab6
frame #24: 0x000022036e5a0ab6
frame #25: 0x000022036e5a0ab6
frame #26: 0x000022036e58f303
frame #27: 0x000022036e5a0ab6
frame #28: 0x000022036e5a0ab6
frame #29: 0x000022036e5a0ab6
frame #30: 0x000022036e58f303
frame #31: 0x000022036e5a0ab6
frame #32: 0x000022036e5a0ab6
frame #33: 0x000022036e5a0ab6
frame #34: 0x000022036e5a0ab6
frame #35: 0x000022036e5a0ab6
frame #36: 0x000022036e58f303
frame #37: 0x000022036e5a0ab6
frame #38: 0x000022036e5a0ab6
frame #39: 0x000022036e5a0ab6
frame #40: 0x000022036e58f303
frame #41: 0x000022036e5a0ab6
frame #42: 0x000022036e5a0ab6
frame #43: 0x000022036e5a0ab6
frame #44: 0x000022036e5a0ab6
frame #45: 0x000022036e5a0ab6
frame #46: 0x000022036e58f303
frame #47: 0x000022036e5a0ab6
frame #48: 0x000022036e5a0ab6
frame #49: 0x000022036e597015
frame #50: 0x000022036e58cfa1
frame #51: 0x00000001010f11bd node-debug`v8::internal::GeneratedCode<v8::internal::Object*, v8::internal::Object*, v8::internal::Object*, v8::internal::Object*, int, v8::internal::Object***>::Call(this=0x00007ffeefbfdf88, args=0x00003c35e57826f1, args=0x00003c35bea07191, args=0x00003c353ee9f7a1, args=1, args=0x00007ffeefbfe740) at simulator.h:113
frame #52: 0x00000001010eebe6 node-debug`v8::internal::(anonymous namespace)::Invoke(isolate=0x0000000106004c00, is_construct=false, target=Handle<v8::internal::Object> @ 0x00007ffeefbfe0a0, receiver=Handle<v8::internal::Object> @ 0x00007ffeefbfe098, argc=1, args=0x00007ffeefbfe740, new_target=Handle<v8::internal::Object> @ 0x00007ffeefbfe090, message_handling=kReport, execution_target=kCallable) at execution.cc:155
frame #53: 0x00000001010ee439 node-debug`v8::internal::(anonymous namespace)::CallInternal(isolate=0x0000000106004c00, callable=Handle<v8::internal::Object> @ 0x00007ffeefbfe170, receiver=Handle<v8::internal::Object> @ 0x00007ffeefbfe168, argc=1, argv=0x00007ffeefbfe740, message_handling=kReport, target=kCallable) at execution.cc:191
frame #54: 0x00000001010ee2cd node-debug`v8::internal::Execution::Call(isolate=0x0000000106004c00, callable=Handle<v8::internal::Object> @ 0x00007ffeefbfe1d0, receiver=Handle<v8::internal::Object> @ 0x00007ffeefbfe1c8, argc=1, argv=0x00007ffeefbfe740) at execution.cc:202
frame #55: 0x0000000100683828 node-debug`v8::Function::Call(this=0x0000000107050640, context=(val_ = 0x000000010704f700), recv=(val_ = 0x0000000106021c40), argc=1, argv=0x00007ffeefbfe740) at api.cc:5206
frame #56: 0x00000001000cbd0a node-debug`node::InternalMakeCallback(env=0x00007ffeefbfe9f8, recv=(val_ = 0x0000000106021c40), callback=(val_ = 0x0000000107050640), argc=1, argv=0x00007ffeefbfe740, asyncContext=(async_id = 35, trigger_async_id = 31)) at node.cc:672
frame #57: 0x00000001000158b9 node-debug`node::AsyncWrap::MakeCallback(this=0x0000000105f020f0, cb=(val_ = 0x0000000107050640), argc=1, argv=0x00007ffeefbfe740) at async_wrap.cc:683
frame #58: 0x00000001002f320c node-debug`node::(anonymous namespace)::TimerWrap::OnTimeout(handle=0x0000000105f02140) at timer_wrap.cc:143
frame #59: 0x0000000101cb2c4f node-debug`uv__run_timers(loop=0x0000000103509220) at timer.c:174
frame #60: 0x0000000101cba014 node-debug`uv_run(loop=0x0000000103509220, mode=UV_RUN_DEFAULT) at core.c:355
frame #61: 0x00000001000e1bbe node-debug`node::Start(isolate=0x0000000106004c00, isolate_data=0x000000010601e200, args=size=2, exec_args=size=0) at node.cc:2895
frame #62: 0x00000001000dcb59 node-debug`node::Start(event_loop=0x0000000103509220, args=size=2, exec_args=size=0) at node.cc:2994
frame #63: 0x00000001000dbf4f node-debug`node::Start(argc=2, argv=0x0000000105f00000) at node.cc:3053
frame #64: 0x0000000101f9015e node-debug`main(argc=2, argv=0x00007ffeefbff888) at node_main.cc:124
frame #65: 0x0000000100001034 node-debug`start + 52
About this issue
- Original URL
- State: open
- Created 5 years ago
- Reactions: 1
- Comments: 34
@astormnewrelic As far as I know, the core dump doesn’t contain symbols, which are necessary to figure out what code actually corresponds to which functions in the original binary.
I might be wrong, but for me, running
bt
insidelldb -c core.30607
gives only the raw function pointers, whereas the backtrace forlldb build/node/out/Debug/node -c core.30607
gives the actual corresponding function names (and line numbers) 🙂Just to be clear, the OpenSSL issue does really only cause nothing besides valgrind warnings. It’s unrelated.
Right, this still looks like a bug in core and I am still not sure why it happens. Why would OpenSSL even be related to
timer.refresh()
causing a segfault?@AndreMaz thanks for the update and for the thorough debugging! It is very appreciated.
@benjamingr after removing the
timer.refresh();
I got zero segfaults 😄 Everything seems to be working properly@addaleax you might be right regarding this being unrelated to async_hooks
The changes also include https://github.com/petkaantonov/bluebird/commit/60ef7a0e23fd320a11281f67c64a39ff95612ce9#diff-d1f5ad3087cc94d3dc6651e3265219b7R59 which calls
timer.refresh()
@AndreMaz can you see if removing the
timer.refresh()
call there fixes the issue?Hey @addaleax my Linux machine is at my work so I’ll do it tomorrow
Just more talking out loud to myself – people on holiday should not respond 😉
I’m going to give the following script a try with
git bisect run
betweenv10.2
andv10.1
. I’ve got thingsscreen
ed off in a medium EC2 instance where builds are taking around an hour, and we’re in roughly 7 steps territory between bisects.We’ll see what falls over in the (PST/Portland) morning 😃
Thanks @benjamingr – I’ll see what I can do w/r/t to a bisect – although I’ve never done it on the node source tree before with a long C++ build in there, so we’ll see how things go 😃. If anyone reading has known science for setting something like that up links/info would be appreciated.
This looks like a Node.js core bug in v10 with async_hooks most likely. Thanks a lot for the repro!