ref-napi: crash in v8::ArrayBuffer::GetBackingStore

steps to reprouce:

gireesh@p1:~/mq/rec1$ node -v
v14.15.4

gireesh@p1:~/mq/rec1$ cat rec.js

var ref = require('ref-napi');
var st = require('ref-struct-di')(ref);
var at = require('ref-array-di')(ref);

var rt = st({
  asid: at(ref.types.char,1024),
  os: st({ptr: ref.refType(ref.types.void)})
})

var idx = 0
const r = new rt()
r.os.ptr = Buffer.alloc(10)
setInterval(() => {
  ref.reinterpret(r.os.ptr, 0, 'utf8');
  if (++idx % 1000 === 0) console.log(`${idx}th iteration`)
  r.os.ptr = Buffer.alloc(10)
},1)

gireesh@p1:~/mq/rec1$ node rec

1000th iteration


#
# Fatal error in , line 0
# Check failed: result.second.
#
#
#
#FailureMessage Object: 0x7ffd839cc040
 1: 0xa71261  [node]
 2: 0x19d3254 V8_Fatal(char const*, ...) [node]
 3: 0xe5b549 v8::internal::GlobalBackingStoreRegistry::Register(std::shared_ptr<v8::internal::BackingStore>) [node]
 4: 0xba6a98 v8::ArrayBuffer::GetBackingStore() [node]
 5: 0x9c1be0 napi_get_typedarray_info [node]
 6: 0x7fea6e5ef0ef  [/home/gireesh/mq/rec1/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node]
 7: 0x7fea6e5f15bb  [/home/gireesh/mq/rec1/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node]
 8: 0x7fea6e5f7ceb Napi::details::CallbackData<void (*)(Napi::CallbackInfo const&), void>::Wrapper(napi_env__*, napi_callback_info__*) [/home/gireesh/mq/rec1/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node]
 9: 0x9b8c6f  [node]
10: 0xbe571b  [node]
11: 0xbe6cc6  [node]
12: 0xbe7346 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [node]
13: 0x14012d9  [node]
Illegal instruction (core dumped)
gireesh@p1:~/mq/rec1$ 

About this issue

  • Original URL
  • State: open
  • Created 3 years ago
  • Reactions: 1
  • Comments: 35 (8 by maintainers)

Commits related to this issue

Most upvoted comments

Fyi, my branches have the full implementation with separate structures. All tests pass, though it might not be 100% compatible:

https://github.com/alphacep/ref-napi https://github.com/alphacep/node-ffi-napi

please let me know if you are interested to merge or we can just have a new packages inside this project.

Hi everyone. Any updates on this? Seems that @nshmyrev proposed a solution. I think that he can make a fork but this will be not as useful for community because everyone ends up fixing same bugs in their own implementation.

@addaleax do you have plans and time to look into this issue?

Ok the same thing but using node-addon-api versus dropping down into node-api c calls:

diff --git a/src/binding.cc b/src/binding.cc
index d1fefbd..7d8906e 100644
--- a/src/binding.cc
+++ b/src/binding.cc
@@ -355,6 +355,16 @@ void WritePointer(const CallbackInfo& args) {
   if (input.IsNull()) {
     *reinterpret_cast<char**>(ptr) = nullptr;
   } else {
+    // create a node-api reference and finalizer to ensure that
+    // the buffer whoes pointer is written can only be
+    // collected after the finalizers for the buffer
+    // to which the pointer was written have already run
+    Reference<Value>* ref = new Reference<Value>;
+    *ref = Persistent(args[2]);
+    args[0].As<Object>().AddFinalizer([](Env env, Reference<Value>* ref) {
+      delete ref;
+    }, ref);
+
     char* input_ptr = GetBufferData(input);
     *reinterpret_cast<char**>(ptr) = input_ptr;
   }

I’m going to open a PR and discussion on whether it is the right fix can continue there.

I’m also having this exact issue.

I haven’t seen a node ffi implementation working well since Node 12.x with the lxe pull request, tonight I thought I can finally switch to a higher NodeJS version by using ffi-napi instead of node-ffi but no luck…hopefully the proposed pull request above is accepted.

Simplified version that crashes quickly:

var ffi = require('ffi-napi');
var ref = require('ref-napi');

const ptr = ref.refType(ref.types.void);

var libc = ffi.Library('libc', {
  'malloc': [ ptr, [ 'int' ] ],
  'free': [ 'void', [ ptr ] ]
});

for (var i = 0; i < 10000; i++) {
    p = libc.malloc(100000);
    console.log(p);
    libc.free(p);
}

After incrementally adding a bunch of instrumentation it looked be related to WritePointer being called as that seemed to be what was always on the stack when the crash occurred. (EDIT I see Gireesh pointed that out above, but I’d started down the instrumentation path before then).

In writePointer

exports.writePointer = function writePointer (buf, offset, ptr) {
  debug('writing pointer to buffer', buf, offset, ptr);
  exports._writePointer(buf, offset, ptr);
  exports._attach(buf, ptr);
};

_attach tries to avoid the buffer to which the pointer refers being freed before the buffer to which the pointer was written. However I seem to remember @addaleax mentioning that it was difficult to ensure that the cleanup for the buffer to which the pointer was written was complete before the reference buffer was freed. Currently it only ensures that both buffers become eligible to be collected and their finalizes run at the same time but does not ensure which order that will happen.

This patch will ensure that the buffer to which the pointer refers will only become eligible to be collected AFTER the finalizers for the buffer to which the pointer was written have been run. It seems to avoid the crash:

diff --git a/src/binding.cc b/src/binding.cc
index d1fefbd..2cbc5f7 100644
--- a/src/binding.cc
+++ b/src/binding.cc
@@ -355,6 +355,17 @@ void WritePointer(const CallbackInfo& args) {
   if (input.IsNull()) {
     *reinterpret_cast<char**>(ptr) = nullptr;
   } else {
+    // create a napi reference and finalizer to ensure that
+    // the buffer whoes pointer is written can only be
+    // collected after the finalizers for the buffer
+    // to which the pointer was written have already run
+    napi_ref* _ref = new napi_ref;
+    napi_create_reference(env, args[2], 1, _ref);
+    args[0].As<Object>().AddFinalizer([](Env env, napi_ref* ref) {
+      napi_delete_reference(env, *ref);
+      delete ref;
+    }, _ref);
+         
     char* input_ptr = GetBufferData(input);
     *reinterpret_cast<char**>(ptr) = input_ptr;
   }

There are other places that use _attach so this might not be the full fix, but I thought I’d post share what I’ve figured out so far.

Thanks! I hear you on the developer priorities thing - funnily enough, I’m after this fix in order to do unpaid maintenance elsewhere 😅

Thank you, Tim. I could take a look on PR a bit later but if you could help and pull the changes it would be much faster.

Since my maintenance of this package is mostly unpaid, and this would unfortunately be a larger rewrite, it’s not at the top of my agenda. @nshmyrev’s proposed solution conceptually works, and if there is a PR at some point, I’d be happy to take a look.

I’m also looking for a fix for this problem.

but major backwards-compatibility breakage isn’t really an option for this package, I’m afraid.

Forgive my ignorance, but is there a reason that breaking changes can’t be made along with a major semver bump?

Well, no other way with the current design. It could be another package like ref-napi-ng if you care about backward compatibility. At least people will be able to use FFI.

I started the implementation here, almost there, few bits with strings left:

https://github.com/alphacep/ref-napi/commit/b2d32a8264aa1010220667dbeca066d505a4de9e

@nshmyrev I’ll get to this when I can. If you have a good understanding of what’s happening, you can also open a PR, which would speed things up a bit.

Hi

we still see the crash with ref-napi 3.0.2

The code we use looks like this, it essentially creates an object and releases it afterwards:

var vosk = require('..')
const model = new vosk.Model("model")

for (var i = 0; i < 1000; i++) {
    console.log("iter", i)
    const rec = new vosk.Recognizer({model: model, sampleRate: 16000});
    console.log("handle", rec.handle)
    rec.free();
    console.log("free")
}

I’ll try to simplify code probably. The crash is not stable it occurs at different iterations, so likely a threading issue. The backtrace looks like this:

Thread 1 "node_g" received signal SIGILL, Illegal instruction.
v8::base::OS::Abort () at ../deps/v8/src/base/platform/platform-posix.cc:502
502	    V8_IMMEDIATE_CRASH();
(gdb) bt
#0  v8::base::OS::Abort () at ../deps/v8/src/base/platform/platform-posix.cc:502
#1  0x00005555581b5fd5 in V8_Fatal (file=file@entry=0x555558bae718 "../deps/v8/src/objects/backing-store.cc", line=line@entry=696, 
    format=format@entry=0x555558b27044 "Check failed: %s.") at ../deps/v8/src/base/logging.cc:167
#2  0x0000555556c11d19 in v8::internal::GlobalBackingStoreRegistry::Register (
    backing_store=std::shared_ptr<v8::internal::BackingStore> (use count 3, weak count 1) = {...})
    at ../deps/v8/src/objects/backing-store.cc:696
#3  0x000055555670eb8e in v8::ArrayBuffer::GetBackingStore (this=<optimized out>) at ../deps/v8/src/api/api.cc:3726
#4  0x00005555563bca51 in napi_get_typedarray_info (env=0x55555b4a2990, typedarray=0x7fffffffcee0, type=0x7fffffffc960, 
    length=0x7fffffffc968, data=0x7fffffffc970, arraybuffer=0x0, byte_offset=0x0) at ../src/js_native_api_v8.cc:2946
#5  0x00007ffff50870ff in (anonymous namespace)::GetBufferData(Napi::Value) ()
   from /home/shmyrev/travis/vosk-api/nodejs/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node
#6  0x00007ffff5089a4d in (anonymous namespace)::InstanceData::GetBufferData(napi_value__*) ()
   from /home/shmyrev/travis/vosk-api/nodejs/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node
#7  0x00007ffff507001d in FFI::FFI::FFICall(Napi::CallbackInfo const&) ()
   from /home/shmyrev/travis/vosk-api/nodejs/node_modules/ffi-napi/build/Release/ffi_bindings.node
#8  0x00007ffff5072e0d in Napi::details::CallbackData<void (*)(Napi::CallbackInfo const&), void>::Wrapper(napi_env__*, napi_callback_info__*) () from /home/shmyrev/travis/vosk-api/nodejs/node_modules/ffi-napi/build/Release/ffi_bindings.node
#9  0x00005555563b1674 in v8impl::(anonymous namespace)::CallbackWrapperBase::<lambda(napi_env)>::operator()(napi_env) const (
    __closure=0x7fffffffcb70, env=0x55555b443960) at ../src/js_native_api_v8.cc:599
#10 0x00005555563bdf89 in napi_env__::CallIntoModule<v8impl::(anonymous namespace)::CallbackWrapperBase::InvokeCallback()::<lambda(napi_env)> >(v8impl::(anonymous namespace)::CallbackWrapperBase::<lambda(napi_env)> &&, void (&&)(napi_env__ *, v8::Local<v8::Value>)) (
    this=0x55555b443960, call=..., handle_exception=
    @0x5555563beb92: {void (napi_env__ *, v8::Local<v8::Value>)} 0x5555563beb92 <napi_env__::HandleThrow(napi_env__*, v8::Local<v8::Value>)>) at ../src/js_native_api_v8.h:95
#11 0x00005555563b16e5 in v8impl::(anonymous namespace)::CallbackWrapperBase::InvokeCallback (this=0x7fffffffcbb0)
    at ../src/js_native_api_v8.cc:598
#12 0x00005555563b173a in v8impl::(anonymous namespace)::FunctionCallbackWrapper::Invoke (info=...) at ../src/js_native_api_v8.cc:616
#13 0x000055555677fb3d in v8::internal::FunctionCallbackArguments::Call (this=this@entry=0x7fffffffcd20, handler=..., 
    handler@entry=...) at ../deps/v8/src/api/api-arguments-inl.h:158
#14 0x00005555567808b9 in v8::internal::(anonymous namespace)::HandleApiCallHelper<false> (isolate=isolate@entry=0x55555b3a0860, 
    function=..., function@entry=..., new_target=..., new_target@entry=..., fun_data=..., receiver=..., receiver@entry=..., args=...)
    at ../deps/v8/src/builtins/builtins-api.cc:113
#15 0x000055555678521b in v8::internal::Builtin_Impl_HandleApiCall (args=..., isolate=isolate@entry=0x55555b3a0860)
    at ../deps/v8/src/handles/handles.h:132
#16 0x0000555556786048 in v8::internal::Builtin_HandleApiCall (args_length=9, args_object=0x7fffffffcee8, isolate=0x55555b3a0860)
    at ../deps/v8/src/builtins/builtins-api.cc:131
#17 0x0000555557732b00 in Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit ()
    at ../../deps/v8/src/builtins/torque-internal.tq:84
#18 0x00005555574c2585 in Builtins_InterpreterEntryTrampoline () at ../../deps/v8/src/objects/string.tq:183

The backingstore trace is attached too.

bs.log

Still looking on this issue, but maybe you have some ideas. I don’t think the patch above solved the problem.

@prabhatmishra33 I see this on the stack in the issue you posted 6: 0x7f268814b0ef [/app/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node] so its possible.

Since it’s in production it might be hard to try the change in #55, but that would be the best way to confirm one way or the other.