async: puts '' inside Async block causes crash

Hi, I get a crash on Mac with this snippet:

require 'async'
Async { puts '' }

=>

config/private/async.rb:4: [BUG] rb_sys_fail_path_in(io_writev, <STDOUT>) - errno == 0
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]

-- Crash Report log information --------------------------------------------
   See Crash Report log file in one of the following locations:             
     * ~/Library/Logs/DiagnosticReports                                     
     * /Library/Logs/DiagnosticReports                                      
   for more details.                                                        
Don't forget to include the above Crash Report log file in bug reports.     

-- Control frame information -----------------------------------------------
c:0007 p:---- s:0026 e:000025 CFUNC  :write
c:0006 p:---- s:0023 e:000022 CFUNC  :puts
c:0005 p:---- s:0020 e:000019 CFUNC  :puts
c:0004 p:0005 s:0015 e:000014 BLOCK  config/private/async.rb:4
c:0003 p:0010 s:0012 e:000011 BLOCK  /Users/emiltin/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.5.0/lib/async/task.rb:158
c:0002 p:0008 s:0009 e:000007 BLOCK  /Users/emiltin/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.5.0/lib/async/task.rb:310 [FINISH]
c:0001 p:---- s:0003 e:000002 DUMMY  [FINISH]

-- Ruby level backtrace information ----------------------------------------
/Users/emiltin/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.5.0/lib/async/task.rb:310:in `block in schedule'
/Users/emiltin/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.5.0/lib/async/task.rb:158:in `block in run'
config/private/async.rb:4:in `block in <main>'
config/private/async.rb:4:in `puts'
config/private/async.rb:4:in `puts'
config/private/async.rb:4:in `write'

-- C level backtrace information -------------------------------------------
/Users/emiltin/.rbenv/versions/3.2.2/lib/libruby.3.2.dylib(rb_vm_bugreport+0x9a0) [0x1036644d8]
/Users/emiltin/.rbenv/versions/3.2.2/lib/libruby.3.2.dylib(rb_bug_without_die+0x15c) [0x1034a0384]
/Users/emiltin/.rbenv/versions/3.2.2/lib/libruby.3.2.dylib(rb_bug+0x1c) [0x103670f40]
/Users/emiltin/.rbenv/versions/3.2.2/lib/libruby.3.2.dylib(rb_syserr_new_path_in.cold.1+0x58) [0x10367125c]
/Users/emiltin/.rbenv/versions/3.2.2/lib/libruby.3.2.dylib(rb_syserr_new_path_in+0xcc) [0x1034a48c4]

Only the empty string ‘’ causes a crash.

I guess this indicates a problem with CRuby?

Versions:

code/rsmp (supervisor_debug) % ruby -v
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]

code/rsmp (supervisor_debug) % gem list async

*** LOCAL GEMS ***

async (2.5.0)
async-io (1.34.3)

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 21 (12 by maintainers)

Most upvoted comments

I merged a fix in https://bugs.ruby-lang.org/issues/19640

I will remove the io_write hook in Async on versions of Ruby where it causes a crash.

Work went into the 6.4 kernel to improve this situation a bit, most notably this merge:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b9dff2195f8a5847fad801046b26955e05670d31

which converts single segment imports into ITER_UBUF, and Linus himself authored some commits improving the import speed of iovecs in general. Most for x86, but also this generic one:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=487c20b016dc48230367a7be017f40313e53e3bd

Doesn’t change the advice I gave above, but it does reduce the overhead on the kernel side a bit in 6.4+ for that particular use case.

Just the iovec itself, not the data. io_uring has to ensure that any “meta data” related to the IO is stable beyond submit, so it has to copy in and store the vec as it cannot reimport it safely later on without violating that requirement.

writev is always worse than write, regardless of whether you use io_uring or not. writev needs to copy an iovec, which write does not. Internally in the kernel, outside of the copy overhead, this means that (until recently) it would also use an ITER_IOVEC when passed down the stack, which are more expensive to advance and iterate.

For io_uring specifically, a vectored write also means that io_uring now has to maintain that state. If things complete inline (eg without needing poll deferral), the cost overhead is the same as writev(2) in that it has to do that extra copy. If io_uring does not need defer the request, then it now has to allocate memory to store this iovec state, and free it at the end. This all adds up to more overhead.

IOW - don’t use vectored IO, unless you have to. Lots of applications seem to happily just use vectored IO by default even if they are just using a single segment, this is a bad idea. Obviously there are cases where you’d want to use single segment vectored IO, for things like sendmsg() for example. But if that’s not the case, don’t do it.

I think I found the issue.

Haha that’s super weird lol, after RubyKaigi this week, I will take a look.

The semantics of writev are (unexpectedly) more complex than write.

I heard some implementations of writev just make a new buffer, copy everything together, and call write. So some developer may overestimate the value of writev. I asked @axboe about this and he mentioned, IIRC, that writev is worse than write when using io_uring. However, I’m sure it’s difficult problem to generalise about.

In any case, I agree “good god does a lot go into a single call of rb_io_puts” and perhaps we should try to simplify the code - we could consider removing writev.

Ahh, I see. Now that I look at the patch, and reread io_binwritev_internal as well, I understand why the loop wasn’t necessary, but good god does a lot go into a single call of rb_io_puts.

Okay, yeah. That’s an oops. I assume you’re talking about the loop.

Edit:

Assuming my understanding of this section is correct:

--- work/ruby-heads-master/io.c	2023-05-13 02:59:15
+++ io.c	2023-05-13 11:27:31
@@ -1329,13 +1329,15 @@
 {
     VALUE scheduler = rb_fiber_scheduler_current();
     if (scheduler != Qnil) {
+        ssize_t retval = 0;
         for (int i = 0; i < iovcnt; i += 1) {
             VALUE result = rb_fiber_scheduler_io_write_memory(scheduler, fptr->self, iov[i].iov_base, iov[i].iov_len, 0);
 
             if (!UNDEF_P(result)) {
-                return rb_fiber_scheduler_io_result_apply(result);
+                retval += rb_fiber_scheduler_io_result_apply(result);
             }
         }
+        return retval;
     }
 
     struct io_internal_writev_struct iis = {

Upstream’s responsibility though.

I believe that’s because nil coerces to "". You can also crash it with:

require 'async'

class Foo
  def to_s
    ""
  end
end

tmp = Foo.new

Async { puts tmp }

puts nil also cause crash.