tokio-core: Possibly slow channel

Hi, I was playing around with tokio and implemented my go-to benchmark for any kind of async distributed language: pass a number around a ring of threads and see how fast it is. In Erlang, with a ring of 100 threads, I can pass the number around roughly 250K times a second. Below is an attempt to implement the same with Tokio channels, but I thought the numbers were a bit low. It’s possible I have not used your library properly, or there is some hidden lock somewhere, as my CPU hardly goes above 180% no matter how few or many threads I spawn. It takes about 14 seconds to send the number around a loop of 2 threads 1,000,000 times, or around 77k times per second.

Again I won’t be surprised if I have done something wrong. Here is the code snippet:

extern crate futures;
extern crate tokio_core;

use tokio_core::{Sender, Receiver};
use tokio_core::io::IoFuture;
use futures::stream::Stream;
use futures::{Future, finished};
use std::thread::spawn;
use std::io;

fn spawn_aux(trx: Sender<u32>, rx: IoFuture<Receiver<u32>>) {
    spawn(move || {
        let mut aux_loop = tokio_core::Loop::new().unwrap();
        let future = rx.and_then(|s| {
            s.for_each(|num| {
                trx.send(num + 1)
            })
        });
        aux_loop.run(future)
    });
}

fn main() {
    let mut main_loop = tokio_core::Loop::new().unwrap();
    let (first_trx, mut last_rx) = main_loop.handle().channel::<u32>();
    // 1..2 yields a single iteration: one auxiliary thread plus the main
    // thread form a ring of 2.
    for _ in 1..2 {
        let (next_trx, rx2) = main_loop.handle().channel::<u32>();
        spawn_aux(next_trx, last_rx);
        last_rx = rx2;
    }
    first_trx.send(0).unwrap();
    let future = last_rx.and_then(|s| {
        s.take(1_000_000)
         .fold(0, |_, num| {
             let num = num + 1;
             first_trx.send(num).unwrap();
             finished::<u32, io::Error>(num)
         })
    });
    let res = main_loop.run(future).unwrap();
    println!("res {}", res);
}

The code is also available here: https://github.com/josephDunne/ring_test.git
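For comparison, the same ring can be sketched without an event loop, using only std::thread and std::sync::mpsc. This is a minimal baseline under stated assumptions: `run_ring`, `threads` and `laps` are hypothetical names introduced here, not part of tokio-core or the ring_test repository, and blocking channels are a different mechanism from the Tokio channels above. It only helps separate the cost of the thread hand-offs from the cost of the event loop.

```rust
use std::sync::mpsc::channel;
use std::thread;
use std::time::Instant;

/// Pass a counter `laps` times around a ring of `threads` OS threads
/// (the calling thread counts as one of them) and return the final value.
/// Every hop adds 1, so the result is `threads * laps`.
fn run_ring(threads: usize, laps: u32) -> u32 {
    assert!(threads >= 1);
    let (first_tx, mut last_rx) = channel::<u32>();
    let mut workers = Vec::new();

    // Each worker owns the receiving end of the previous link and the
    // sending end of the next one.
    for _ in 1..threads {
        let (next_tx, next_rx) = channel::<u32>();
        let rx = last_rx;
        workers.push(thread::spawn(move || {
            // Forward every number, incremented, until the upstream sender is dropped.
            for num in rx {
                if next_tx.send(num + 1).is_err() {
                    break;
                }
            }
        }));
        last_rx = next_rx;
    }

    // Seed the ring, then close the loop on the calling thread.
    first_tx.send(0).unwrap();
    let mut last = 0;
    for _ in 0..laps {
        last = last_rx.recv().unwrap() + 1;
        first_tx.send(last).unwrap();
    }

    // Dropping the first sender lets the workers shut down one after another.
    drop(first_tx);
    for worker in workers {
        worker.join().unwrap();
    }
    last
}

fn main() {
    let (threads, laps) = (2, 100_000);
    let start = Instant::now();
    let res = run_ring(threads, laps);
    let secs = start.elapsed().as_secs_f64();
    println!("res {} in {:.3}s ({:.0} laps/s)", res, secs, laps as f64 / secs);
}
```

The laps-per-second figure printed here is computed the same way as the number quoted above (laps divided by elapsed seconds), so the two can be compared directly on the same machine.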

I am really excited for this project going forward!

About this issue

State: closed
Created 8 years ago
Comments: 29 (8 by maintainers)

Most upvoted comments

clock_gettime() is called for timeout handling: https://github.com/tokio-rs/tokio-core/blob/master/src/reactor/mod.rs#L229
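To get a feel for how much a clock read per loop iteration can cost, here is a rough sketch that times repeated calls to `Instant::now()`, which on Linux is backed by `clock_gettime`. The helper name `time_clock_reads` is hypothetical, and this measures only the clock read itself, not anything inside tokio-core.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Call `Instant::now()` `calls` times and return the time between the
/// first and the last reading.
fn time_clock_reads(calls: u32) -> Duration {
    let start = Instant::now();
    let mut latest = start;
    for _ in 0..calls {
        // black_box keeps the compiler from discarding the intermediate readings.
        latest = black_box(Instant::now());
    }
    latest.duration_since(start)
}

fn main() {
    let calls = 1_000_000;
    let total = time_clock_reads(calls);
    println!("{} clock reads took {:?} ({:?} each)", calls, total, total / calls);
}
```

Multiplying the per-call figure by the number of event loop turns per lap gives a rough upper bound on how much of the run time the clock reads alone could explain.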

I don’t necessarily trust it, but perf reports that these instructions take 60% of the _run function:

 10.54 │1280:   cmp    %rbx,%r14
  9.64 │      ↓ je     1510
  9.23 │        cmpq   $0x1,0x8(%rbx)
 22.74 │        lea    0x20(%rbx),%rbx
  9.23 │      ↑ jne    1280

It looks like a linked list, but because of inlining I’m not sure where the code comes from. Could it be the timer wheel? But there’s no timer wheel in the git version. It turns out your Cargo.lock references tokio-core 0.1.0, and the code doesn’t build any more due to API changes. I’m not sure how to port the Receiver part.
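Another way to read the annotated instructions: `lea 0x20(%rbx),%rbx` advances a pointer by a fixed 32 bytes rather than loading a next pointer from memory, and `cmpq $0x1,0x8(%rbx)` tests the 8-byte field at offset 8, so the loop has the shape of a linear scan over contiguous 32-byte elements that stops at the first element whose field equals 1. The sketch below is a hypothetical Rust loop with that shape. The `Slot` type and its field names are invented for illustration and are not taken from the tokio-core source, and the compiler is not guaranteed to emit exactly these instructions for it.

```rust
use std::mem::size_of;

// Hypothetical 32-byte element; all names here are made up for illustration.
#[repr(C)]
struct Slot {
    key: u64,          // offset 0
    state: u64,        // offset 8: the field a `cmpq $0x1,0x8(...)` would read
    payload: [u64; 2], // offsets 16..32
}

/// Scan the slice front to back and return the index of the first slot
/// whose `state` field equals 1, if any.
fn first_with_state_one(slots: &[Slot]) -> Option<usize> {
    slots.iter().position(|s| s.state == 1)
}

fn main() {
    // The stride of the scan matches the 0x20 step in the disassembly.
    assert_eq!(size_of::<Slot>(), 32);

    let slots: Vec<Slot> = (0..8u64)
        .map(|i| Slot {
            key: i,
            state: if i == 5 { 1 } else { 0 },
            payload: [0; 2],
        })
        .collect();
    println!("{:?}", first_with_state_one(&slots));
}
```

If this reading is right, the time in the loop grows with the number of elements scanned before a match, which would make the size of whatever table is being walked worth checking.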

On the other hand, tokio-core is using epoll. I would expect other implementations to avoid the system calls, but that’s probably not possible with tokio.

lnicola on Sep 8, 2016

perf record output (above 1%):

  34.71%  ring_test  ring_test           [.] tokio_core::event_loop::Loop::_run::hbb7b1a6371af670b
   6.74%  ring_test  ring_test           [.] _$LT$tokio_core..channel..Receiver$LT$T$GT$$u20$as$u20$futures..stream..Stream$GT$::poll::h7c843bbcb5ce7006
   5.15%  ring_test  ring_test           [.] mio::poll::RegistrationInner::set_readiness::h7d6969f5b9345ebb
   5.02%  ring_test  [vdso]              [.] __vdso_clock_gettime
   3.19%  ring_test  ring_test           [.] tokio_core::event_loop::Loop::notify::hb6f76bb6d7adc9f2
   3.10%  ring_test  libc-2.24.so        [.] __clock_gettime
   2.94%  ring_test  ring_test           [.] _$LT$std..sync..mpsc..spsc_queue..Queue$LT$T$GT$$GT$::pop::hb590db4a944f5333
   2.79%  ring_test  ring_test           [.] _$LT$tokio_core..channel..Sender$LT$T$GT$$GT$::send::hb5ada3e12aa8d089
   2.64%  ring_test  ring_test           [.] std::sys::time::Timespec::sub_timespec::h2335079df0e0f38e
   2.25%  ring_test  ring_test           [.] tokio_core::event_loop::Loop::consume_queue::he35a581719467787
   2.16%  ring_test  [kernel.vmlinux]    [k] entry_SYSCALL_64
   2.10%  ring_test  ring_test           [.] _$LT$std..sync..mpsc..stream..Packet$LT$T$GT$$GT$::send::h1fc93fec210a6124
   1.88%  ring_test  libpthread-2.24.so  [.] __GI___libc_read
   1.78%  ring_test  ring_test           [.] _$LT$futures..stream..fold..Fold$LT$S$C$$u20$F$C$$u20$Fut$C$$u20$T$GT$$u20$as$u20$futures..Future$GT$::poll::h681357b761ac3b22
   1.55%  ring_test  ring_test           [.] tokio_core::event_loop::LoopHandle::send::_$u7b$$u7b$closure$u7d$$u7d$::hc00454a0886f6c75
   1.42%  ring_test  ring_test           [.] tokio_core::event_loop::Loop::run::_$u7b$$u7b$closure$u7d$$u7d$::haf3807496187bedc
   1.31%  ring_test  ring_test           [.] tokio_core::event_loop::Loop::run::_$u7b$$u7b$closure$u7d$$u7d$::hf33a41e8f8c577da
   1.26%  ring_test  [vdso]              [.] 0x0000000000000959
   1.21%  ring_test  libpthread-2.24.so  [.] __pthread_enable_asynccancel
   1.14%  ring_test  libc-2.24.so        [.] __libc_disable_asynccancel
   1.11%  ring_test  ring_test           [.] tokio_core::event_loop::Loop::notify_handle::h73d6451d09088d0f
   1.10%  ring_test  ring_test           [.] sdallocx
   1.05%  ring_test  libpthread-2.24.so  [.] __pthread_disable_asynccancel
   1.02%  ring_test  ring_test           [.] mallocx

lnicola on Sep 8, 2016