rusoto: Understanding slow performance of a Kinesis client
I am very new to Kinesis and very new to Rusoto, so take this report with a grain of salt! I am writing to a Kinesis stream with 20 shards and I cannot seem to get past 4-7 MB/s.
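For context, doing the math on shard limits (assuming the standard Kinesis write quota of 1 MB/s per shard, which may have changed since) says 20 shards should allow roughly 20 MB/s, so I'm well under even the service ceiling:

```rust
fn main() {
    // Assumed Kinesis per-shard write quota: 1 MB/s per shard.
    // (Service quota at the time; subject to change.)
    let shards: u64 = 20;
    let mb_per_shard_per_sec: u64 = 1;
    let ceiling = shards * mb_per_shard_per_sec;
    println!("theoretical write ceiling: {} MB/s", ceiling); // prints 20 MB/s
    // The observed 4-7 MB/s is well below this quota ceiling,
    // so the bottleneck is likely client-side, not the service limit.
    assert!(7 < ceiling);
}
```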
My code can be found here: https://github.com/tureus/log-sloth/blob/master/src/bin/log-sloth.rs#L236-L255
My server receives data on TCP, parses it with nom, and sends out 500-record batches to my Kinesis stream. When I run without outputting to Kinesis (just parsing, then dropping the data on the floor), I can get 300 MB/s+. I am also running on a big AWS host, an m4.4xlarge (16 vCPUs, 64 GB of RAM). CPU utilization is also very low; the one thread handling the client is pinned at a measly 5-15% CPU use. I'm pretty confident the Kinesis client (or, more precisely, how I use it) is my current bottleneck.
I have also written a stress test tool; it just takes a hard-coded test string and sends it out as fast as possible. That's where I see the 300 MB/s+ figure.
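The batching side of my code boils down to the sketch below. This is a stdlib-only illustration, not the real rusoto calls: `send_batch` is a placeholder for the actual PutRecords request in my server.

```rust
// Stand-in for the real Kinesis PutRecords call in log-sloth.
fn send_batch(batch: &[Vec<u8>]) {
    println!("sending batch of {} records", batch.len());
}

fn main() {
    // PutRecords accepts at most 500 records per request,
    // so parsed lines are accumulated into 500-record chunks.
    const BATCH_SIZE: usize = 500;

    // Fake "parsed log lines" standing in for the nom output.
    let records: Vec<Vec<u8>> = (0..1250)
        .map(|i| format!("log line {}", i).into_bytes())
        .collect();

    for chunk in records.chunks(BATCH_SIZE) {
        send_batch(chunk); // 1250 records -> batches of 500, 500, 250
    }
}
```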
I have enabled LTO and I run in release mode. My server sees a "Pooled stream disconnected" error, but I can just retry my stress test tool and get it flowing again.
From my server:
```
Running `target/release/log-sloth server`
INFO:<unknown>: STARTING: thread for client stream=V4(127.0.0.1:49124)
thread '<unnamed>' panicked at 'could not write data to kinesis stream: HttpDispatch(HttpDispatchError { message: "Pooled stream disconnected" })', /checkout/src/libcore/result.rs:906:4
note: Run with `RUST_BACKTRACE=1` for a backtrace.
INFO:<unknown>: STARTING: thread for client stream=V4(127.0.0.1:49136)
INFO:<unknown>: stream=V4(127.0.0.1:49136) done. 697 MB bytes, 1000000 lines
INFO:<unknown>: STOPPING: thread for client ending with Ok(())
```
From my stress test:
```
root@rust-459207332-93w7g:/log-sloth# time cargo run --bin stress --release
    Finished release [optimized] target(s) in 0.0 secs
     Running `target/release/stress`
done! wrote 698 MB

real    3m14.344s
user    0m0.260s
sys     0m0.460s
```
So, it’s slow! I’d say LTO even made it a little slower. Any tips on how to spot the problem?
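One thing I suspect: the single client thread issues each PutRecords call serially, so per-request round-trip latency caps throughput no matter how fast parsing is. Fanning batches out over a few sender threads would let requests overlap. A stdlib-only sketch of that idea (`send_batch` is again a stand-in for the real Kinesis call):

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// Stand-in for the real PutRecords call; returns the record count.
fn send_batch(batch: Vec<String>) -> usize {
    batch.len()
}

fn main() {
    let (tx, rx) = mpsc::channel::<Vec<String>>();
    // Share one receiver among workers so each batch is taken exactly once.
    let rx = Arc::new(Mutex::new(rx));

    // Small pool of senders so requests overlap instead of serializing.
    let workers: Vec<_> = (0..4)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || {
                let mut sent = 0;
                loop {
                    // Lock only long enough to pull the next batch.
                    let batch = match rx.lock().unwrap().recv() {
                        Ok(b) => b,
                        Err(_) => break, // channel closed: all batches consumed
                    };
                    sent += send_batch(batch);
                }
                sent
            })
        })
        .collect();

    // Producer: enqueue 10 batches of 500 records each.
    for i in 0..10 {
        tx.send(vec![format!("record {}", i); 500]).unwrap();
    }
    drop(tx); // close the channel so workers exit

    let total: usize = workers.into_iter().map(|w| w.join().unwrap()).sum();
    println!("sent {} records", total); // prints "sent 5000 records"
    assert_eq!(total, 5000);
}
```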
About this issue
- State: closed
- Created 6 years ago
- Comments: 26 (25 by maintainers)
I'm shoveling data now! Seems rock solid.
@xrl It's been a while, so I might not be up to date, but you should be able to set a breakpoint on `rust_panic` (`break rust_panic` in gdb), which should pause on a panic, before any of the actual unwinding happens.
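For anyone else landing here, that workflow looks roughly like this (a sketch, assuming a release build with debug symbols enabled):

```shell
# Build with debug info so gdb has symbols to work with
RUSTFLAGS="-g" cargo build --release

gdb target/release/log-sloth
(gdb) break rust_panic    # stops before any unwinding happens
(gdb) run server
# ... when the panic fires, gdb halts inside rust_panic:
(gdb) backtrace           # inspect the call stack at the panic site
```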