node-addon-api: Segmentation faults - Inconsistent errors with lots of async workers and objects

Hello,

I’m working on a pull request for a project using node-addon-api. When I create a large number of objects via AsyncWorker, I get segmentation faults.

I traced some of them to creating a fresh Napi::Env via info.Env() instead of calling the persistent object’s Env(). Is there a difference between the two, or a reason why I can’t do that?

I’m now running into other issues. These problems only occur after more than 10,000 objects have been created and destroyed.

https://github.com/openzim/node-libzim/pull/72/files#diff-f28f98b9b15ae6a552a5b00f2b80c65a701a42b1e75f81aff149ac16e480b238R15

This occurs when I run makeLargeZim.ts, which calls creator.addItem. addItem creates an AsyncWorker that stores a promise, which is resolved once the libzim call completes. This is done because libzim waits until other threads within the library have finished their queued work, which would otherwise block the Node.js / libuv main event loop thread.
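To illustrate the pattern described above (run the blocking call off the main thread and resolve a promise when it finishes), here is a minimal sketch using only the C++ standard library. It is an analogy, not node-addon-api code: `blocking_add_item` and `add_item_async` are hypothetical names, and `std::promise` / `std::future` stand in for the JavaScript promise held by the AsyncWorker.

```cpp
#include <chrono>
#include <cstddef>
#include <exception>
#include <future>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for a library call that may block for a while.
std::size_t blocking_add_item(const std::string& content) {
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    return content.size();
}

// Starts the blocking call on its own thread and returns immediately.
// The future becomes ready once the worker fulfils the promise.
std::future<std::size_t> add_item_async(std::string content) {
    std::promise<std::size_t> promise;
    std::future<std::size_t> result = promise.get_future();

    // The worker owns both the promise and its input, so nothing it
    // touches can be destroyed by the caller while it is running.
    std::thread([p = std::move(promise), c = std::move(content)]() mutable {
        try {
            p.set_value(blocking_add_item(c));
        } catch (...) {
            p.set_exception(std::current_exception());
        }
    }).detach();

    return result;
}
```

The caller gets the future back right away and only waits when it calls `get()`. The point relevant to this issue is that the worker takes ownership of everything it reads.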

This works fine most of the time with counts of 10, 100, 1,000, or 10,000, but when I try ~1,000,000 I get an error. It’s not a memory issue; there is plenty of RAM. Any guidance would be helpful. Thank you.

The backtrace from gdb is below:

#0  0x0000555555edbe10 in v8::internal::GlobalHandles::Create(v8::internal::Object) ()
#1  0x0000555555d7aee4 in v8::V8::GlobalizeReference(v8::internal::Isolate*, unsigned long*) ()                                                                                                                                               
#2  0x0000555555b2c171 in v8impl::Reference::New(napi_env__*, v8::Local<v8::Value>, unsigned int, bool, void (*)(napi_env__*, void*, void*), void*, void*) ()                                                                                 
#3  0x0000555555b3356c in napi_wrap ()                                                                                                                                                                                                        
#4  0x00007fffe674e91a in Napi::ObjectWrap<Blob>::ObjectWrap (this=0x555557f48400, callbackInfo=...) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:4151                                                               
#5  0x00007fffe67341fe in Blob::Blob (this=0x555557f48400, info=...) at ../src/blob.h:17                                                                                                                                                      
#6  0x00007fffe67737ca in Napi::ObjectWrap<Blob>::ConstructorCallbackWrapper(napi_env__*, napi_callback_info__*)::{lambda()#2}::operator()() const (__closure=0x7fffffff8550)                                                                 
    at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:4519                                                                                                                                                                
#7  0x00007fffe6789e54 in Napi::details::WrapCallback<Napi::ObjectWrap<Blob>::ConstructorCallbackWrapper(napi_env__*, napi_callback_info__*)::{lambda()#2}>(Napi::ObjectWrap<Blob>::ConstructorCallbackWrapper(napi_env__*, napi_callback_info__*)::{lambda()#2}) (callback=...) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:79
#8  0x00007fffe6773917 in Napi::ObjectWrap<Blob>::ConstructorCallbackWrapper (env=0x55555764dfb0, info=0x7fffffff8620) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:4517                                             
#9  0x0000555555b2b9df in ?? ()                                                                                                                                                                                                               
#10 0x0000555555dd7738 in v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) ()                                                                                                                                     
#11 0x0000555555dd7c08 in ?? ()                                                                                                                                                                                                               
#12 0x0000555555dd8611 in v8::internal::Builtins::InvokeApiFunction(v8::internal::Isolate*, bool, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*, v8::internal::Handle<v8::internal::HeapObject>) ()
#13 0x0000555555e9e618 in ?? ()                                                                                                                                                                                                               
#14 0x0000555555e9f004 in v8::internal::Execution::New(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) ()
#15 0x0000555555d99f49 in v8::Function::NewInstanceWithSideEffectType(v8::Local<v8::Context>, int, v8::Local<v8::Value>*, v8::SideEffectType) const ()
#16 0x0000555555b34e63 in napi_new_instance ()                                                                                                                                                                                                
#17 0x00007fffe673011a in Napi::Function::New (this=0x7fffffff8e00, argc=1, args=0x7fffffff8e90) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:2384
#18 0x00007fffe67300c7 in Napi::Function::New (this=0x7fffffff8e00, args=...) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:2373                                                      
#19 0x00007fffe673175e in Napi::FunctionReference::New (this=0x5555579e6de8, args=...) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:3408
#20 0x00007fffe67346fd in Blob::New (env=..., blob=...) at ../src/blob.h:47                                                                                                                                                                   
#21 0x00007fffe673d2b6 in StringProvider::feed (this=0x555557fa2ac0, info=...) at ../src/contentProvider.h:152
#22 0x00007fffe6724153 in operator() (__closure=0x7fffffff9010) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:170                                                                                                     
#23 0x00007fffe672ae2f in Napi::details::WrapCallback<Napi::details::TemplatedInstanceCallback<StringProvider, &StringProvider::feed>(napi_env, napi_callback_info)::<lambda()> >(struct {...}) (callback=...)
    at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:79                                                                                                                                                                  
#24 0x00007fffe67241e3 in Napi::details::TemplatedInstanceCallback<StringProvider, &StringProvider::feed> (env=0x55555764dfb0, info=0x7fffffff90a0) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:167
#25 0x0000555555b2b9df in ?? ()
#26 0x0000555555dd7738 in v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) ()
#27 0x0000555555dd7e32 in ?? ()
#28 0x0000555555dd8552 in v8::internal::Builtins::InvokeApiFunction(v8::internal::Isolate*, bool, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*, v8::internal::Handle<v8::internal::HeapObject>) ()
#29 0x0000555555e9e618 in ?? ()
#30 0x0000555555e9ee12 in v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) ()
#31 0x0000555555d9a469 in v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) ()
#32 0x0000555555b31209 in napi_call_function ()
#33 0x00007fffe672ffeb in Napi::Function::Call (this=0x7fffffff98d0, recv=0x555557565a38, argc=0, args=0x0) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:2339
#34 0x00007fffe672feee in Napi::Function::Call (this=0x7fffffff98d0, recv=0x555557565a38, args=...) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:2306
#35 0x00007fffe673c9a8 in ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}::operator()(Napi::Env, Napi::Function) const (__closure=0x7fffcc4b83a0, env=...) at ../src/contentProvider.h:74
#36 0x00007fffe67a27de in std::__invoke_impl<void, ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}&, Napi::Env, Napi::Function>(std::__invoke_other, ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}&, Napi::Env&&, Napi::Function&&) (__f=...) at /usr/include/c++/11.2.0/bits/invoke.h:61
#37 0x00007fffe678f69c in std::__invoke_r<void, ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}&, Napi::Env, Napi::Function>(ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}&, Napi::Env&&, Napi::Function&&) (__fn=...) at /usr/include/c++/11.2.0/bits/invoke.h:111
#38 0x00007fffe67783e3 in std::_Function_handler<void (Napi::Env, Napi::Function), ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}>::_M_invoke(std::_Any_data const&, Napi::Env&&, Napi::Function&&) (__functor=..., 
    __args#0=..., __args#1=...) at /usr/include/c++/11.2.0/bits/std_function.h:291
#39 0x00007fffe674cf89 in std::function<void (Napi::Env, Napi::Function)>::operator()(Napi::Env, Napi::Function) const (this=0x7fffcc4b83a0, __args#0=..., __args#1=...) at /usr/include/c++/11.2.0/bits/std_function.h:560
#40 0x00007fffe6732c30 in Napi::ThreadSafeFunction::CallJS (env=0x55555764dfb0, jsCallback=0x555557565a28, data=0x7fffcc4b83a0) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:5780
#41 0x0000555555b4adae in ?? ()
#42 0x00007ffff7f3d92d in ?? () from /usr/lib/libuv.so.1
#43 0x00007ffff7f59d0e in ?? () from /usr/lib/libuv.so.1
#44 0x00007ffff7f43438 in uv_run () from /usr/lib/libuv.so.1

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 44 (34 by maintainers)

Most upvoted comments

@KevinEady Thank you. Any hint for you in this Valgrind report?

We discussed this in the 10 June Node API meeting and decided to try to make a standalone reproduction of the problem that would be a little easier to debug and troubleshoot. If we mimic the creation of several thousand AsyncWorkers in our standalone test and do not see the crash, we can (hopefully) assume that something incorrect in your code is causing it.

I will continue working on this standalone reproduction in order to try to get a similar crash or rule out core issues.

AFAIK after talking to @kelvinhammond, the bug/problem is still there.

Hi @kelvinhammond , @kelson42 ,

I’m not too sure about the internal workings of the addon or libzim, but looking at the stack trace, it seems like a use-after-free error. The addon’s Blob is freed through the garbage collector, which runs its destructor and decreases the refcount of the std::shared_ptr<zim::Blob> blob_. It also appears that the contents of the shared_ptr’s zim::Blob are being returned directly here. Since this doesn’t make a copy of the shared pointer, is it possible that the zim::Blob ends up being freed before libzim has had a chance to act on it? Just a guess here.
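To make the guess above concrete, here is a small self-contained sketch of the difference between handing out a non-owning view and handing out something that shares ownership. All of the types (`RawView`, `OwningView`, `Wrapper`) are hypothetical stand-ins and are not the addon’s or libzim’s actual classes. Resetting the `unique_ptr` plays the role of the GC finalizer.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical non-owning view: just a pointer and a length.
struct RawView {
    const char* data;
    std::size_t size;
};

// Hypothetical view that shares ownership of the buffer it points into.
struct OwningView {
    std::shared_ptr<const std::string> owner;  // keeps the buffer alive
    const char* data() const { return owner->data(); }
    std::size_t size() const { return owner->size(); }
};

// Hypothetical stand-in for a wrapper object destroyed by a finalizer.
class Wrapper {
public:
    explicit Wrapper(std::string s)
        : buf_(std::make_shared<const std::string>(std::move(s))) {}

    // Risky: the view dangles once every owner of buf_ is gone.
    RawView raw() const { return RawView{buf_->data(), buf_->size()}; }

    // Safer: the returned object holds its own reference to the buffer.
    OwningView shared() const { return OwningView{buf_}; }

    long owners() const { return buf_.use_count(); }

private:
    std::shared_ptr<const std::string> buf_;
};

// Simulates the finalizer running while a consumer still holds a view,
// and returns what the consumer reads afterwards.
std::string read_after_wrapper_destroyed(const std::string& text) {
    auto wrapper = std::make_unique<Wrapper>(text);
    OwningView view = wrapper->shared();  // refcount goes from 1 to 2
    wrapper.reset();                      // "finalizer": refcount back to 1
    return std::string(view.data(), view.size());  // buffer is still alive
}
```

If the consumer held a `RawView` instead, the read after `wrapper.reset()` would touch freed memory, which is the kind of access valgrind flags as an invalid read.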

I’m running valgrind again with a higher --num-callers=50 argument to try to get more information in the use-after-free stack. I’m getting some different – yet similar – valgrind warnings:

==2642== Invalid read of size 1
==2642==    at 0x50A5A10: memmove (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0x5BD972D: std::basic_streambuf<char, std::char_traits<char> >::xsputn(char const*, long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30)
==2642==    by 0x5BCB345: std::ostream::write(char const*, long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30)
==2642==    by 0x16598754: zim::writer::DefaultIndexData::initialize() const (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x165995B8: zim::writer::DefaultIndexData::hasIndexData() const (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x165BF840: zim::writer::IndexTask::run(zim::writer::CreatorData*) (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659C2D8: zim::writer::taskRunner(void*) (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x5B6B2B2: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30)
==2642==    by 0x5D4DB42: start_thread (pthread_create.c:442)
==2642==    by 0x5DDEBB3: clone (clone.S:100)
==2642==  Address 0x15a589a0 is 16 bytes inside a block of size 31 free'd
==2642==    at 0x509EB6F: operator delete(void*, unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0x16316A8A: Blob::~Blob() (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x1631785C: Napi::ObjectWrap<Blob>::FinalizeCallback(napi_env__*, void*, void*) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0xB321AA: node::CallbackQueue<void, node::Environment*>::CallbackImpl<node_napi_env__::CallFinalizer<true>(void (*)(napi_env__*, void*, void*), void*, void*)::{lambda(node::Environment*)#2}>::Call(node::Environment*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xAF2E3B: node::Environment::RunAndClearNativeImmediates(bool) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xAF3445: node::Environment::CheckImmediate(uv_check_s*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x165ABB8: uv__run_check (loop-watcher.c:67)
==2642==    by 0x16532EF: uv_run (core.c:420)
==2642==    by 0xAAFA2C: node::SpinEventLoop(node::Environment*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xBB11F3: node::NodeMainInstance::Run() (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xB26C43: node::LoadSnapshotDataAndRun(node::SnapshotData const**, node::InitializationResult const*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xB2A83E: node::Start(int, char**) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x5CE2D8F: (below main) (libc_start_call_main.h:58)
==2642==  Block was alloc'd at
==2642==    at 0x509C013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0xC25D7F: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x1631028B: Napi::String::Utf8Value[abi:cxx11]() const (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x1635371E: Blob::Blob(Napi::CallbackInfo const&) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x16353B48: Napi::ObjectWrap<Blob>::ConstructorCallbackWrapper(napi_env__*, napi_callback_info__*) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
....
==2642== Invalid read of size 1
==2642==    at 0x50A5A1D: memmove (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0x165DD5CE: ZSTD_compressStream2 (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x165DD96A: ZSTD_compressStream (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x16587B45: ZSTD_INFO::stream_run_encode(ZSTD_INFO::stream_t*, CompStep) (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659AD63: std::_Function_handler<void (zim::Blob const&), zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}>::_M_invoke(std::_Any_data const&, zim::Blob const&) (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x16599C80: zim::writer::Cluster::write_data(std::function<void (zim::Blob const&)>) const (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659A05D: zim::writer::Cluster::write_content(std::function<void (zim::Blob const&)>) const (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659B06B: void zim::writer::Cluster::_compress<ZSTD_INFO>() (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659A39E: zim::writer::Cluster::close() (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659C2D8: zim::writer::taskRunner(void*) (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x5B6B2B2: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30)
==2642==    by 0x5D4DB42: start_thread (pthread_create.c:442)
==2642==    by 0x5DDEBB3: clone (clone.S:100)
==2642==  Address 0x1f0372b2 is 2 bytes inside a block of size 31 free'd
==2642==    at 0x509EB6F: operator delete(void*, unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0x16316A8A: Blob::~Blob() (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x1631785C: Napi::ObjectWrap<Blob>::FinalizeCallback(napi_env__*, void*, void*) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0xB321AA: node::CallbackQueue<void, node::Environment*>::CallbackImpl<node_napi_env__::CallFinalizer<true>(void (*)(napi_env__*, void*, void*), void*, void*)::{lambda(node::Environment*)#2}>::Call(node::Environment*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xAF2E3B: node::Environment::RunAndClearNativeImmediates(bool) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xAF3445: node::Environment::CheckImmediate(uv_check_s*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x165ABB8: uv__run_check (loop-watcher.c:67)
==2642==    by 0x16532EF: uv_run (core.c:420)
==2642==    by 0xAAFA2C: node::SpinEventLoop(node::Environment*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xBB11F3: node::NodeMainInstance::Run() (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xB26C43: node::LoadSnapshotDataAndRun(node::SnapshotData const**, node::InitializationResult const*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xB2A83E: node::Start(int, char**) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x5CE2D8F: (below main) (libc_start_call_main.h:58)
==2642==  Block was alloc'd at
==2642==    at 0x509C013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0xC25D7F: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x1631028B: Napi::String::Utf8Value[abi:cxx11]() const (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x1635371E: Blob::Blob(Napi::CallbackInfo const&) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x16353B48: Napi::ObjectWrap<Blob>::ConstructorCallbackWrapper(napi_env__*, napi_callback_info__*) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
...
==2642== Invalid read of size 1
==2642==    at 0x50A5A1D: memmove (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0x165DD5CE: ZSTD_compressStream2 (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x165DD96A: ZSTD_compressStream (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x16587B45: ZSTD_INFO::stream_run_encode(ZSTD_INFO::stream_t*, CompStep) (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659AD63: std::_Function_handler<void (zim::Blob const&), zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}>::_M_invoke(std::_Any_data const&, zim::Blob const&) (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x16599C80: zim::writer::Cluster::write_data(std::function<void (zim::Blob const&)>) const (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659A05D: zim::writer::Cluster::write_content(std::function<void (zim::Blob const&)>) const (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659B06B: void zim::writer::Cluster::_compress<ZSTD_INFO>() (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659A39E: zim::writer::Cluster::close() (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659C2D8: zim::writer::taskRunner(void*) (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x5B6B2B2: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30)
==2642==    by 0x5D4DB42: start_thread (pthread_create.c:442)
==2642==    by 0x5DDEBB3: clone (clone.S:100)
==2642==  Address 0x1f0372b2 is 2 bytes inside a block of size 31 free'd
==2642==    at 0x509EB6F: operator delete(void*, unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0x16316A8A: Blob::~Blob() (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x1631785C: Napi::ObjectWrap<Blob>::FinalizeCallback(napi_env__*, void*, void*) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0xB321AA: node::CallbackQueue<void, node::Environment*>::CallbackImpl<node_napi_env__::CallFinalizer<true>(void (*)(napi_env__*, void*, void*), void*, void*)::{lambda(node::Environment*)#2}>::Call(node::Environment*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xAF2E3B: node::Environment::RunAndClearNativeImmediates(bool) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xAF3445: node::Environment::CheckImmediate(uv_check_s*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x165ABB8: uv__run_check (loop-watcher.c:67)
==2642==    by 0x16532EF: uv_run (core.c:420)
==2642==    by 0xAAFA2C: node::SpinEventLoop(node::Environment*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xBB11F3: node::NodeMainInstance::Run() (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xB26C43: node::LoadSnapshotDataAndRun(node::SnapshotData const**, node::InitializationResult const*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xB2A83E: node::Start(int, char**) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x5CE2D8F: (below main) (libc_start_call_main.h:58)
==2642==  Block was alloc'd at
==2642==    at 0x509C013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0xC25D7F: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x1631028B: Napi::String::Utf8Value[abi:cxx11]() const (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x1635371E: Blob::Blob(Napi::CallbackInfo const&) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x16353B48: Napi::ObjectWrap<Blob>::ConstructorCallbackWrapper(napi_env__*, napi_callback_info__*) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
...

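So it seems like there is a thread-safety problem between the Blob on the Node thread and the task runner on the thread created by libzim: the traces show the task runner thread reading memory that Blob’s destructor has already freed on the Node thread.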
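One common way to avoid this kind of cross-thread lifetime problem is to have the background thread hold its own reference to each buffer. The sketch below shows that idea with the standard library only. `TaskQueue` and `process_items` are hypothetical names, and this is an assumption about a possible approach, not a description of how libzim’s task runner or the addon actually work.

```cpp
#include <condition_variable>
#include <cstddef>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

using Buffer = std::shared_ptr<const std::string>;

// Minimal thread-safe queue of shared buffers (hypothetical).
class TaskQueue {
public:
    void push(Buffer b) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(b));
        }
        cv_.notify_one();
    }

    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        cv_.notify_one();
    }

    // Blocks until an item is available; returns false once the queue
    // has been closed and drained.
    bool pop(Buffer& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop();
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<Buffer> q_;
    bool closed_ = false;
};

// The producer drops its reference right after queuing each buffer,
// imitating a finalizer that runs before the worker gets to the item.
std::size_t process_items(std::size_t count) {
    TaskQueue queue;
    std::size_t total = 0;

    std::thread worker([&] {
        Buffer b;
        while (queue.pop(b)) total += b->size();  // buffer kept alive by b
    });

    for (std::size_t i = 0; i < count; ++i) {
        auto owner = std::make_shared<const std::string>("item-" + std::to_string(i));
        queue.push(owner);  // the queue now shares ownership
        owner.reset();      // producer side lets go immediately
    }

    queue.close();
    worker.join();  // after join, reading total is safe
    return total;
}
```

Because the queue entry is itself an owner, the order in which the producer releases its handle and the worker reads the data no longer matters.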

Does this provide any information / clues?

EDIT: Here’s a link to the latest action run: https://github.com/KevinEady/node-libzim/actions/runs/4128364660/jobs/7132716816

So the latest update I have… Running with valgrind (action) reports some invalid memory reads but completes successfully, whereas the non-valgrind run (action) segfaults.

The valgrind warnings look like this:

Invalid read of size 2
   at 0x50A59E0: memmove (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x151850A3: ZSTD_compressStream2 (in /usr/lib/x86_64-linux-gnu/libzstd.so.1.4.8)
   by 0x1518537E: ZSTD_compressStream (in /usr/lib/x86_64-linux-gnu/libzstd.so.1.4.8)
   by 0x150DF524: ZSTD_INFO::stream_run_encode(ZSTD_INFO::stream_t*, CompStep) (compression.cpp:118)
   by 0x151025CB: zim::Compressor<ZSTD_INFO>::feed(char const*, unsigned long, CompStep) (compression.h:243)
   by 0x15101D4F: zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}::operator()(zim::Blob const&) const (cluster.cpp:152)
   by 0x151036AA: void std::__invoke_impl<void, zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}&, zim::Blob const&>(std::__invoke_other, zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}&, zim::Blob const&) (invoke.h:61)
   by 0x151032F0: std::enable_if<std::__and_<std::is_void<void>, std::__is_invocable<zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}&, zim::Blob const&> >::value, void>::type std::__invoke_r<void, zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}&, zim::Blob const&>(zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}&, zim::Blob const&) (invoke.h:154)
   by 0x15102C58: std::_Function_handler<void (zim::Blob const&), zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}>::_M_invoke(std::_Any_data const&, zim::Blob const&) (std_function.h:290)
   by 0x151020F2: std::function<void (zim::Blob const&)>::operator()(zim::Blob const&) const (std_function.h:590)
   by 0x15101069: zim::writer::Cluster::write_data(std::function<void (zim::Blob const&)>) const (cluster.cpp:244)
   by 0x151007BD: zim::writer::Cluster::write_content(std::function<void (zim::Blob const&)>) const (cluster.cpp:124)
 Address 0x13db0760 is 0 bytes inside a block of size 31 free'd
   at 0x509EB6F: operator delete(void*, unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x14F16A8A: Blob::~Blob() (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
   by 0x14F1785C: Napi::ObjectWrap<Blob>::FinalizeCallback(napi_env__*, void*, void*) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
   by 0xB321AA: node::CallbackQueue<void, node::Environment*>::CallbackImpl<node_napi_env__::CallFinalizer<true>(void (*)(napi_env__*, void*, void*), void*, void*)::{lambda(node::Environment*)#2}>::Call(node::Environment*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0xAF2E3B: node::Environment::RunAndClearNativeImmediates(bool) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0xAF3445: node::Environment::CheckImmediate(uv_check_s*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0x165ABB8: uv__run_check (loop-watcher.c:67)
   by 0x16532EF: uv_run (core.c:420)
   by 0xAAFA2C: node::SpinEventLoop(node::Environment*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0xBB11F3: node::NodeMainInstance::Run() (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0xB26C43: node::LoadSnapshotDataAndRun(node::SnapshotData const**, node::InitializationResult const*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0xB2A83E: node::Start(int, char**) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
 Block was alloc'd at
   at 0x509C013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0xC25D7F: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0x14F1028B: Napi::String::Utf8Value[abi:cxx11]() const (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
   by 0x14F5371E: Blob::Blob(Napi::CallbackInfo const&) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
   by 0x14F53B48: Napi::ObjectWrap<Blob>::ConstructorCallbackWrapper(napi_env__*, napi_callback_info__*) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
   by 0xB10D7C: v8impl::(anonymous namespace)::FunctionCallbackWrapper::Invoke(v8::FunctionCallbackInfo<v8::Value> const&) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0xDB11AF: v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<true>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0xDB17AE: v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0x16EF578: Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0x1670FEB: Builtins_JSBuiltinsConstructStub (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0xBADABE2: ???
   by 0x1671ADB: Builtins_JSEntryTrampoline (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)

I’m not sure whether these memory warnings could cause the segfault that was observed.

@KevinEady Yes, the code that triggers it is commented out in the test-mem-leak script. This is still an issue.

I see that openzim/node-libzim#72 was merged into master. Have you been able to identify the underlying cause, or are you just accepting the current implementation?

Sorry, I forgot to answer this question. We merged, but we are at exactly the same stage and have opened a dedicated ticket at https://github.com/openzim/node-libzim/issues/80

I have been pulling my hair out trying to understand what’s going on. In a previous meeting, there was a suggestion that maybe there is some buffer overflow in the stack / heap, but I’m really not sure how to verify this.

I just wanted to see what would happen in a later version of Node, and using 18.4.0 I get different stack traces, e.g.:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x60000028a3)
  * frame #0: 0x00000001123257b3 zim_binding.node`StringProvider::feed(this=0x0000600004eb36c1, info=0x00007ff7bfef6258) at contentProvider.h:151:30
    frame #1: 0x0000000112325c14 zim_binding.node`napi_value__* Napi::details::TemplatedInstanceCallback<StringProvider, &(StringProvider::feed(Napi::CallbackInfo const&))>(this=0x00007ff7bfef6328)::'lambda'()::operator()() const at napi-inl.h:170:12
    frame #2: 0x00000001123259d9 zim_binding.node`napi_value__* Napi::details::WrapCallback<napi_value__* Napi::details::TemplatedInstanceCallback<StringProvider, &(StringProvider::feed(Napi::CallbackInfo const&))>(napi_env__*, napi_callback_info__*)::'lambda'()>(callback=(unnamed class) @ 0x00007ff7bfef6328))>(napi_env__*, napi_callback_info__*)::'lambda'()) at napi-inl.h:79:12
    frame #3: 0x000000011232599d zim_binding.node`napi_value__* Napi::details::TemplatedInstanceCallback<StringProvider, &(StringProvider::feed(Napi::CallbackInfo const&))>(env=0x00006000033141e0, info=0x00007ff7bfef6398) at napi-inl.h:167:10
    frame #4: 0x0000000100090070 node`v8impl::(anonymous namespace)::FunctionCallbackWrapper::Invoke(v8::FunctionCallbackInfo<v8::Value> const&) + 128
    frame #5: 0x00000001002c57e8 node`v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) + 216
    frame #6: 0x00000001002c52bb node`v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) + 555
    frame #7: 0x00000001002c4cfe node`v8::internal::Builtins::InvokeApiFunction(v8::internal::Isolate*, bool, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*, v8::internal::Handle<v8::internal::HeapObject>) + 718
    frame #8: 0x00000001003926eb node`v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) + 2315
    frame #9: 0x0000000100391dc5 node`v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) + 213
    frame #10: 0x00000001002752cf node`v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) + 511
    frame #11: 0x000000010008afbb node`napi_call_function + 187
    frame #12: 0x00000001122dc4ec zim_binding.node`Napi::Function::Call(this=0x00007ff7bfef6b70, recv=0x0000000106029038, argc=0, args=0x0000000000000000) const at napi-inl.h:2339:24
    frame #13: 0x00000001122dc22e zim_binding.node`Napi::Function::Call(this=0x00007ff7bfef6b70, recv=0x0000000106029038, args=0x00007ff7bfef6b30) const at napi-inl.h:2306:10
    frame #14: 0x0000000112336b5f zim_binding.node`ContentProviderWrapper::feed(this=0x0000600004eb33c8, env=(_env = 0x00006000033141e0), (null)=Function @ 0x00007ff7bfef6b88)::'lambda'(Napi::Env, Napi::Function)::operator()(Napi::Env, Napi::Function) const at contentProvider.h:74:31
    frame #15: 0x0000000112336aa7 zim_binding.node`decltype(__f=0x0000600004eb33c8, __args=0x00007ff7bfef6d18, __args=0x00007ff7bfef6d08)::'lambda'(Napi::Env, Napi::Function)&>(fp)(static_cast<Napi::Env>(fp0), static_cast<Napi::Function>(fp0))) std::__1::__invoke<ContentProviderWrapper::feed()::'lambda'(Napi::Env, Napi::Function)&, Napi::Env, Napi::Function>(ContentProviderWrapper::feed()::'lambda'(Napi::Env, Napi::Function)&, Napi::Env&&, Napi::Function&&) at type_traits:3918:1
    frame #16: 0x0000000112336a37 zim_binding.node`void std::__1::__invoke_void_return_wrapper<void, true>::__call<ContentProviderWrapper::feed(__args=0x0000600004eb33c8, __args=0x00007ff7bfef6d18, __args=0x00007ff7bfef6d08)::'lambda'(Napi::Env, Napi::Function)&, Napi::Env, Napi::Function>(ContentProviderWrapper::feed()::'lambda'(Napi::Env, Napi::Function)&, Napi::Env&&, Napi::Function&&) at invoke.h:61:9
    frame #17: 0x00000001123369e7 zim_binding.node`std::__1::__function::__alloc_func<ContentProviderWrapper::feed()::'lambda'(Napi::Env, Napi::Function), std::__1::allocator<ContentProviderWrapper::feed()::'lambda'(Napi::Env, Napi::Function)>, void (Napi::Env, Napi::Function)>::operator(this=0x0000600004eb33c8, __arg=0x00007ff7bfef6d18, __arg=0x00007ff7bfef6d08)(Napi::Env&&, Napi::Function&&) at function.h:178:16
    frame #18: 0x0000000112335736 zim_binding.node`std::__1::__function::__func<ContentProviderWrapper::feed()::'lambda'(Napi::Env, Napi::Function), std::__1::allocator<ContentProviderWrapper::feed()::'lambda'(Napi::Env, Napi::Function)>, void (Napi::Env, Napi::Function)>::operator(this=0x0000600004eb33c0, __arg=0x00007ff7bfef6d18, __arg=0x00007ff7bfef6d08)(Napi::Env&&, Napi::Function&&) at function.h:352:12
    frame #19: 0x000000011233387f zim_binding.node`std::__1::__function::__value_func<void (Napi::Env, Napi::Function)>::operator(this=0x0000600004eb33c0, __args=0x00007ff7bfef6d18, __args=0x00007ff7bfef6d08)(Napi::Env&&, Napi::Function&&) const at function.h:505:16
    frame #20: 0x0000000112333786 zim_binding.node`std::__1::function<void (Napi::Env, Napi::Function)>::operator(this= Lambda in File contentProvider.h at Line 72, __arg=(_env = 0x00006000033141e0), __arg=Function @ 0x00007ff7bfef6d08)(Napi::Env, Napi::Function) const at function.h:1182:12
    frame #21: 0x00000001123336c4 zim_binding.node`Napi::ThreadSafeFunction::CallJS(env=0x00006000033141e0, jsCallback=0x0000000106029028, (null)=0x0000000000000000, data=0x0000600004eb33c0) at napi-inl.h:5780:5
    frame #22: 0x00000001000a32e8 node`v8impl::(anonymous namespace)::ThreadSafeFunction::AsyncCb(uv_async_s*) + 536
    frame #23: 0x0000000100b8a4cb node`uv__async_io + 379
    frame #24: 0x0000000100b9e2bb node`uv__io_poll + 2075
    frame #25: 0x0000000100b8aa51 node`uv_run + 433
    frame #26: 0x0000000100002f9f node`node::SpinEventLoop(node::Environment*) + 287
    frame #27: 0x00000001001156f1 node`node::NodeMainInstance::Run(int*, node::Environment*) + 97
    frame #28: 0x0000000100115363 node`node::NodeMainInstance::Run() + 115
    frame #29: 0x000000010009e681 node`node::Start(int, char**) + 401
    frame #30: 0x000000010529e51e dyld`start + 462
(lldb) fr s 0
frame #0: 0x00000001123257b3 zim_binding.node`StringProvider::feed(this=0x0000600004eb36c1, info=0x00007ff7bfef6258) at contentProvider.h:151:30
   148 	  Napi::Value feed(const Napi::CallbackInfo &info) {
   149 	    try {
   150 	      // TODO(kelvinhammond): need a way to move this to avoid copying
-> 151 	      auto blob = provider_->feed();
   152 	      return Blob::New(info.Env(), blob);
   153 	    } catch (const std::exception &err) {
   154 	      throw Napi::Error::New(info.Env(), err.what());

Unfortunately I had to wipe my machine and lost my debug build of node 16.15.0 … I am going to build a debug version of 18.14.0 and see if there are any significant changes.