luv: Frequent segfaults in `push_fs_result`
I’m using the nightly neovim build and am regularly encountering segfaults. I haven’t yet 100% narrowed down when they occur, but it mostly seems to be when files are changed while the editor is open.
This is what the stack trace looks like:
#0 0x00007f7c9db4e939 push_fs_result (libluv.so.1 + 0xf939)
#1 0x00007f7c9db542b3 luv_fs_cb (libluv.so.1 + 0x152b3)
#2 0x00007f7c9d9333fd uv__work_done (libuv.so.1 + 0xc3fd)
#3 0x00007f7c9d9370cd uv__async_io.part.0 (libuv.so.1 + 0x100cd)
#4 0x00007f7c9d94ae6c uv__io_poll (libuv.so.1 + 0x23e6c)
#5 0x00007f7c9d937a14 uv_run (libuv.so.1 + 0x10a14)
#6 0x0000000000538948 loop_uv_run (nvim + 0x138948)
#7 0x0000000000643cad inbuf_poll.lto_priv.0 (nvim + 0x243cad)
#8 0x0000000000643eed os_inchar (nvim + 0x243eed)
#9 0x00000000006cdaad state_enter (nvim + 0x2cdaad)
#10 0x0000000000609214 normal_enter (nvim + 0x209214)
#11 0x00000000004552d0 main (nvim + 0x552d0)
#12 0x00007f7c9d73d24e __libc_start_call_main (libc.so.6 + 0x2924e)
#13 0x00007f7c9d73d309 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x29309)
#14 0x0000000000457585 _start (nvim + 0x57585)
I’m not sure what else to include here, so please tell me if there’s any additional information you require.
About this issue
- State: closed
- Created a year ago
- Reactions: 1
- Comments: 16 (10 by maintainers)
Commits related to this issue
- fix: avoid `dir` be gc early. close #644 — committed to zhaozg/luv by zhaozg 10 months ago
- fix: avoid dir be gc early. close #644 — committed to luvit/luv by zhaozg 10 months ago
Nevermind, the stack trace is the same as the neovim one if I run it with LuaJIT (I was using PUC Lua since sometimes that makes things easier to debug):
Reproduced.
Let’s do some analysis.
- In the `uv.fs_opendir` result callback, `luv_dir` is created with newuserdata.
- `luv_dir->handle->dirents` is also created with newuserdata, and `luv_dir->dirents_ref` is set to reference `dirents`.
- `luv_dir->dirents_ref` is unref’d in `uv.fs_closedir` or `luv_fs_dir_gc`, which lets `dirents` be garbage-collected and become invalid.
- With `fs_readdir`, `luv_dir` may be garbage-collected before `fs_readdir` is called.
- So: keep a reference to `luv_dir` in `fs_readdir`, and unref it in the readdir callback, to avoid losing the `dirents` memory.
Pay attention to https://github.com/neovim/neovim/issues/21413#issuecomment-1684564391
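The lifetime problem from the analysis above can be sketched with a toy model. This is not luv's code: `toy_dir`, `toy_req` and every `toy_*` function are hypothetical names invented for illustration, and the "GC" is simulated with a plain counter and a flag instead of Lua's registry. The only idea taken from the thread is that a pending readdir request should hold its own reference to the directory object and release it in the callback.

```c
#include <assert.h>

/* Toy stand-in for luv's directory userdata (hypothetical, not luv_dir_t). */
typedef struct {
    int refs;       /* how many owners currently keep the object alive */
    int reclaimed;  /* set once the simulated GC has collected the object */
} toy_dir;

/* A pending asynchronous readdir request (hypothetical). */
typedef struct {
    toy_dir *dir;
} toy_req;

static void toy_ref(toy_dir *d) {
    d->refs++;
}

/* Drop one owner; the simulated GC collects the object at zero owners. */
static void toy_unref(toy_dir *d) {
    assert(d->refs > 0);
    if (--d->refs == 0) {
        d->reclaimed = 1;  /* a real GC would release the dirents buffer here */
    }
}

/* Start the request. With pin != 0 the request holds its own reference. */
static toy_req toy_readdir_start(toy_dir *d, int pin) {
    toy_req req = { d };
    if (pin) {
        toy_ref(d);
    }
    return req;
}

/* Callback side: report whether the buffer was still valid, then unpin. */
static int toy_readdir_done(toy_req *req, int pin) {
    int buffer_valid = !req->dir->reclaimed;
    if (pin) {
        toy_unref(req->dir);
    }
    return buffer_valid;
}

/* The script drops its only reference while the request is in flight. */
int toy_scenario(int pin) {
    toy_dir dir = { .refs = 1, .reclaimed = 0 };  /* reference held by the script */
    toy_req req = toy_readdir_start(&dir, pin);
    toy_unref(&dir);                              /* script lets go of the dir */
    return toy_readdir_done(&req, pin);
}
```

In this model `toy_scenario(0)` returns 0, because the object is collected before the callback runs, while `toy_scenario(1)` returns 1, because the request keeps it alive until the callback has finished.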
Yeah, libuv didn’t have `separateDebugInfo`, but I managed to enable it myself by overriding `libluv` in the rust flake and setting the cmake build type as well as `dontStrip` (that last one took a bit to figure out…), and I have `libluv` with debug symbols now. Turns out my last crash was so long ago that coredumpctl already cleaned out the stack traces though, so I’ll have to wait for the next crash to get you that line number 😛
Thanks a lot for the offer still! I learned a lot about overriding things in nix, and I can at least do it for separate targets now. My current way would be building libluv by itself with debug symbols, stripping them out using `objcopy` and then loading them dynamically in `coredumpctl` with `gdb`. Writing an overlay to modify the `libluv` that neovim builds with would probably be a lot easier, but I haven’t done a deep dive into how overlays work yet.
Mhm, no luck so far I’m afraid. I’ve tried compiling the debug symbols separately and loading them into gdb, but the nixpkgs version seems to be different since I’m getting bogus line numbers. I’m not too experienced with overriding nixpkgs either, I’ll try to get some help on the forums for that. Man, nix is amazing when it works, but it makes things like these so complicated…
Thanks for your patience!
I’ll try that tomorrow and report back. Thanks for the link!
I’m 99% sure it happens when I have neo-tree enabled, but I can’t say for certain. However, I’ve tried disabling their usage of `libuv`, and the crashes have persisted.
How would I go about getting the line number? Loading the segfault into gdb only provides me with the stacktrace I have posted above. I’m assuming I can only get the line number out if debug information is compiled in, I’m not sure how I would do that in this case.
Edit: Just had a look at that thread. One of the stacktraces in there is the exact same as mine, however the command posted in there (`:lua require("luv").handle_get_type(newproxy())`) also causes a segfault for me, albeit with a different stacktrace.