luv: Frequent segfaults in `push_fs_result`

I’m using the nightly neovim build and am regularly encountering segfaults. I haven’t yet fully narrowed down when they occur, but it mostly seems to be when files are changed while the editor is open.

This is what the stack trace looks like:

                #0  0x00007f7c9db4e939 push_fs_result (libluv.so.1 + 0xf939)
                #1  0x00007f7c9db542b3 luv_fs_cb (libluv.so.1 + 0x152b3)
                #2  0x00007f7c9d9333fd uv__work_done (libuv.so.1 + 0xc3fd)
                #3  0x00007f7c9d9370cd uv__async_io.part.0 (libuv.so.1 + 0x100cd)
                #4  0x00007f7c9d94ae6c uv__io_poll (libuv.so.1 + 0x23e6c)
                #5  0x00007f7c9d937a14 uv_run (libuv.so.1 + 0x10a14)
                #6  0x0000000000538948 loop_uv_run (nvim + 0x138948)
                #7  0x0000000000643cad inbuf_poll.lto_priv.0 (nvim + 0x243cad)
                #8  0x0000000000643eed os_inchar (nvim + 0x243eed)
                #9  0x00000000006cdaad state_enter (nvim + 0x2cdaad)
                #10 0x0000000000609214 normal_enter (nvim + 0x209214)
                #11 0x00000000004552d0 main (nvim + 0x552d0)
                #12 0x00007f7c9d73d24e __libc_start_call_main (libc.so.6 + 0x2924e)
                #13 0x00007f7c9d73d309 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x29309)
                #14 0x0000000000457585 _start (nvim + 0x57585)

I’m not sure what else to include here, so please tell me if there’s any additional information you require.

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 1
  • Comments: 16 (10 by maintainers)

Most upvoted comments

Never mind, the stack trace is the same as the neovim one if I run it with LuaJIT (I was using PUC Lua, since that sometimes makes things easier to debug):

Thread 1 "luajit" received signal SIGSEGV, Segmentation fault.
luv_push_dirent (L=L@entry=0x7ffff7fa9380, ent=0x0, table=table@entry=1) at /home/ryan/Programming/luvit/luv/src/fs.c:121
121	  lua_pushstring(L, ent->name);

#0  luv_push_dirent (L=L@entry=0x7ffff7fa9380, ent=0x0, table=table@entry=1) at /home/ryan/Programming/luvit/luv/src/fs.c:121
#1  0x00007ffff7bfb1d8 in push_fs_result (L=L@entry=0x7ffff7fa9380, req=req@entry=0x7ffff7fc84d8) at /home/ryan/Programming/luvit/luv/src/fs.c:371
#2  0x00007ffff7bfb5b1 in luv_fs_cb (req=0x7ffff7fc84d8) at /home/ryan/Programming/luvit/luv/src/fs.c:401
#3  0x00007ffff7c10240 in uv__work_done (handle=0x7ffff7fbc1f0) at /home/ryan/Programming/luvit/luv/deps/libuv/src/threadpool.c:329
#4  0x00007ffff7c1407b in uv__async_io (loop=0x7ffff7fbc140, w=0x7fffffff9580, events=<optimized out>) at /home/ryan/Programming/luvit/luv/deps/libuv/src/unix/async.c:176
#5  0x00007ffff7c25ff3 in uv__io_poll (loop=loop@entry=0x7ffff7fbc140, timeout=<optimized out>) at /home/ryan/Programming/luvit/luv/deps/libuv/src/unix/linux.c:1303
#6  0x00007ffff7c14cc3 in uv_run (loop=0x7ffff7fbc140, mode=mode@entry=UV_RUN_DEFAULT) at /home/ryan/Programming/luvit/luv/deps/libuv/src/unix/core.c:447
#7  0x00007ffff7c0bc00 in luv_run (L=0x7ffff7fa9380) at /home/ryan/Programming/luvit/luv/src/loop.c:36
#8  0x00005555555ca03b in lj_BC_FUNCC () at buildvm_x86.dasc:859
#9  0x00005555555bbe03 in lua_pcall (L=0x7ffff7fa9380, nargs=<optimized out>, nresults=-1, errfunc=<optimized out>) at /home/ryan/Programming/luvit/luv/deps/luajit/src/lj_api.c:1116
#10 0x000055555555c8ab in docall (L=0x7ffff7fa9380, narg=0, clear=0) at /home/ryan/Programming/luvit/luv/deps/luajit/src/luajit.c:122
#11 0x000055555555dbd2 in handle_script (argx=<optimized out>, L=0x7ffff7fa9380) at /home/ryan/Programming/luvit/luv/deps/luajit/src/luajit.c:292
#12 pmain (L=0x7ffff7fa9380) at /home/ryan/Programming/luvit/luv/deps/luajit/src/luajit.c:550
#13 0x00005555555ca03b in lj_BC_FUNCC () at buildvm_x86.dasc:859
#14 0x00005555555bbfa1 in lua_cpcall (L=<optimized out>, func=<optimized out>, ud=<optimized out>) at /home/ryan/Programming/luvit/luv/deps/luajit/src/lj_api.c:1173
#15 0x000055555555c70e in main (argc=2, argv=0x7fffffffda48) at /home/ryan/Programming/luvit/luv/deps/luajit/src/luajit.c:581

Reproduced

  test("fs.{open,read,close}dir ref check", function(print, p, expect, uv)
    local dir = assert(uv.fs_opendir('.', nil, 50))

    local function readdir_cb(err, dirs)
      assert(not err)
      if dirs then
        p(dirs)
        uv.fs_readdir(dir, readdir_cb)
      else
        assert(uv.fs_closedir(dir)==true)
      end
    end

    uv.fs_readdir(dir, readdir_cb)
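    dir = nil  -- drop the only Lua reference while the readdir request is still in flight
    collectgarbage()  -- force collection so luv_dir can be finalized before readdir_cb runs
    collectgarbage()
    collectgarbage()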

  end, "1.28.0")

Let’s do some analysis.

  1. In the uv.fs_opendir result callback, luv_dir is created with newuserdata, luv_dir->handle->dirents is created with a second newuserdata, and luv_dir->dirents_ref is set to reference dirents.
  2. luv_dir->dirents_ref is unreferenced in uv.fs_closedir or luv_fs_dir_gc, which lets the GC collect dirents and leaves the pointer invalid.
  3. After uv.fs_readdir is called, luv_dir may be garbage-collected before the readdir callback runs.
  4. So we should take a reference to luv_dir in fs_readdir and release it in the readdir callback, so that the dirents memory is not lost while the request is pending.
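
The four steps above can be modelled without luv or Lua at all. What follows is only a minimal C sketch of the lifetime problem, not luv's actual code: `model_dir`, `model_req`, `collect`, `readdir_start`, `readdir_done` and `demo` are hypothetical names, and a plain counter stands in for Lua references and the garbage collector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for luv's directory handle; not luv's real struct. */
typedef struct {
    int refs;        /* references from Lua variables and pinned requests */
    char **dirents;  /* buffer the asynchronous read would fill in */
    bool freed;      /* set once the model collector released the buffer */
} model_dir;

/* Hypothetical stand-in for an in-flight readdir request. */
typedef struct {
    model_dir *dir;
} model_req;

static model_dir *dir_open(size_t n) {
    model_dir *d = calloc(1, sizeof *d);
    if (!d) abort();
    d->dirents = calloc(n, sizeof *d->dirents);
    if (!d->dirents) abort();
    d->refs = 1; /* the Lua variable `dir` */
    return d;
}

/* Models one GC pass: release the buffer once nothing references the handle. */
static void collect(model_dir *d) {
    if (d->refs == 0 && !d->freed) {
        free(d->dirents);
        d->dirents = NULL;
        d->freed = true;
    }
}

/* Start a request; with pin == true the request keeps the handle alive. */
static void readdir_start(model_req *r, model_dir *d, bool pin) {
    r->dir = d;
    if (pin) d->refs++;
}

/* Completion callback: reports whether the buffer was still valid,
 * then drops the pin taken in readdir_start (if any). */
static bool readdir_done(model_req *r, bool pin) {
    bool buffer_valid = r->dir->dirents != NULL;
    if (pin) r->dir->refs--;
    return buffer_valid;
}

/* Replays the reproduction: start a read, drop `dir`, collect, then complete. */
static bool demo(bool pin) {
    model_dir *d = dir_open(50);
    model_req r;
    readdir_start(&r, d, pin);
    d->refs--;   /* dir = nil */
    collect(d);  /* collectgarbage() */
    bool ok = readdir_done(&r, pin);
    collect(d);  /* with the pin released, the buffer can now go */
    free(d);
    return ok;
}
```

In this model `demo(false)` mirrors the reproduction and returns `false`, because the buffer is gone by the time the callback looks at it, while `demo(true)` returns `true` because the in-flight request keeps the handle alive until its callback has run.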

Yeah, libuv didn’t have separateDebugInfo, but I managed to enable it myself by overriding libluv in the rust flake and setting the cmake build type as well as dontStrip (that last one took a bit to figure out…), and I have libluv with debug symbols now. Turns out my last crash was so long ago that coredumpctl already cleaned out the stack traces though, so I’ll have to wait for the next crash to get you that line number 😛

Enabling debug symbols can differ between projects, and separateDebugInfo might be one of those cases. If you can point me at instructions for enabling debug symbols in libuv, we can work out how to modify the nix expression together.

Thanks a lot for the offer still! I learned a lot about overriding things in nix, and I can at least do it for separate targets now. My current way would be building libluv by itself with debug symbols, stripping them out using objcopy and then loading them dynamically in coredumpctl with gdb. Writing an overlay to modify the libluv that neovim builds with would probably be a lot easier, but I haven’t done a deep dive into how overlays work yet.

Mhm, no luck so far I’m afraid. I’ve tried compiling the debug symbols separately and loading them into gdb, but the nixpkgs version seems to be different since I’m getting bogus line numbers. I’m not too experienced with overriding nixpkgs either, I’ll try to get some help on the forums for that. Man, nix is amazing when it works, but it makes things like these so complicated…

Thanks for your patience!

I’ll try that tomorrow and report back. Thanks for the link!

I’m 99% sure it happens when I have neo-tree enabled, but I can’t say for certain. However, I’ve tried disabling their usage of libuv, and the crashes have persisted.

How would I go about getting the line number? Loading the segfault into gdb only provides me with the stacktrace I have posted above. I’m assuming I can only get the line number out if debug information is compiled in, but I’m not sure how I would do that in this case.

Edit: Just had a look at that thread. One of the stacktraces in there is the exact same as mine, however the command posted in there (:lua require("luv").handle_get_type(newproxy())) also causes a segfault for me, albeit with a different stacktrace.