tungstenite-rs: Following the usual non-blocking read pattern adds extra time.
Hi, first of all, thank you for your amazing work in tungstenite-rs!
I am using your library with non-blocking sockets, and everything works fine. However, when I compared the time to read a 1-byte message in blocking and non-blocking mode, I noticed that non-blocking was around twice as slow: 8us in blocking mode versus around 15us in non-blocking mode.
Investigating, I found that the doubling comes from the fact that in non-blocking mode I am forced to call WebSocket::read_message() twice: once to receive the byte I sent, and a second time to receive the WouldBlock that tells me there are no more messages:
loop {
    match websocket.read_message() {
        Ok(message) => (), // 1-byte received.
        Err(Error::Io(ref err)) if err.kind() == ErrorKind::WouldBlock => break,
        //...
    }
}
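To make the two-call cost concrete, here is a minimal sketch that leaves tungstenite out entirely and uses only a plain non-blocking std::net::TcpStream on loopback. The names drain_counting and demo_one_byte are made up for illustration. It only shows that draining until WouldBlock costs one read call more than the number of reads that actually return data.

```rust
use std::io::{self, ErrorKind, Read, Write};
use std::net::{TcpListener, TcpStream};

/// Reads until the socket reports WouldBlock (or EOF). Returns the bytes
/// read and the number of read() calls made, including the final one.
fn drain_counting(stream: &mut TcpStream) -> io::Result<(Vec<u8>, usize)> {
    let mut data = Vec::new();
    let mut calls = 0;
    let mut buf = [0u8; 64];
    loop {
        calls += 1;
        match stream.read(&mut buf) {
            Ok(0) => return Ok((data, calls)), // peer closed the connection
            Ok(n) => data.extend_from_slice(&buf[..n]),
            Err(err) if err.kind() == ErrorKind::WouldBlock => return Ok((data, calls)),
            Err(err) => return Err(err),
        }
    }
}

/// Sends one byte over a loopback connection and drains it non-blockingly.
fn demo_one_byte() -> io::Result<(Vec<u8>, usize)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let mut sender = TcpStream::connect(listener.local_addr()?)?;
    let (mut receiver, _) = listener.accept()?;

    sender.write_all(&[42])?;
    // Still in blocking mode here: wait until the byte has arrived so the
    // demo does not depend on timing.
    receiver.peek(&mut [0u8; 1])?;

    receiver.set_nonblocking(true)?;
    drain_counting(&mut receiver)
}
```

Calling demo_one_byte() should give back the single byte together with a call count of 2: one read for the data and one that only reports WouldBlock. With a WebSocket on top, each of those calls presumably also pays for whatever read_message() does before it reaches the socket.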
Currently, I avoid the second call to read_message() by checking myself whether there is data in the stream or the socket would block:
loop {
    match websocket.read_message() {
        Ok(message) => {
            // 1-byte received
            // Peek at the underlying socket: if it would block, nothing is left to read.
            if let Err(err) = websocket.get_ref().peek(&mut [0; 0]) {
                if err.kind() == ErrorKind::WouldBlock {
                    break;
                }
            }
        }
        Err(Error::Io(ref err)) if err.kind() == ErrorKind::WouldBlock => break,
        //...
    }
}
With the above code, I correctly read a 1-byte message in 8us instead of 15us, but this is far from obvious for a non-blocking user, who is used to reading in a loop until getting WouldBlock.
Looking inside read_message(), I saw that it does a lot of work. Maybe that work should be done only when read_message() has data to read; otherwise it could exit early with WouldBlock.
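A rough sketch of that early-exit idea at the socket level, using only the standard library. Everything here is hypothetical and not part of tungstenite: has_pending_data, read_if_ready, read_one_byte and demo_early_exit are illustrative names, and the probe uses a 1-byte peek rather than the zero-length peek from my workaround. Keep in mind that the peek is itself a system call, so this only pays off when the skipped read is more expensive than the probe.

```rust
use std::io::{self, ErrorKind, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical probe: Ok(true) if a read would not block (data or EOF),
/// Ok(false) if the socket reports WouldBlock.
fn has_pending_data(stream: &TcpStream) -> io::Result<bool> {
    let mut probe = [0u8; 1];
    match stream.peek(&mut probe) {
        Ok(_) => Ok(true),
        Err(err) if err.kind() == ErrorKind::WouldBlock => Ok(false),
        Err(err) => Err(err),
    }
}

/// Hypothetical wrapper: runs the (possibly expensive) `read` only when the
/// probe says data is pending, otherwise exits early with WouldBlock.
fn read_if_ready<T>(
    stream: &TcpStream,
    read: impl FnOnce(&TcpStream) -> io::Result<T>,
) -> io::Result<T> {
    if has_pending_data(stream)? {
        read(stream)
    } else {
        Err(ErrorKind::WouldBlock.into())
    }
}

/// Stand-in for the expensive read; here it just reads a single byte.
fn read_one_byte(mut stream: &TcpStream) -> io::Result<u8> {
    let mut byte = [0u8; 1];
    stream.read_exact(&mut byte)?;
    Ok(byte[0])
}

/// Returns whether the first attempt exited early, plus the byte read later.
fn demo_early_exit() -> io::Result<(bool, u8)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let mut sender = TcpStream::connect(listener.local_addr()?)?;
    let (receiver, _) = listener.accept()?;
    receiver.set_nonblocking(true)?;

    // Nothing has been sent yet, so the wrapper should exit early.
    let exited_early = matches!(
        read_if_ready(&receiver, read_one_byte),
        Err(ref err) if err.kind() == ErrorKind::WouldBlock
    );

    sender.write_all(&[42])?;
    // Loopback delivery is fast but not instantaneous, so retry briefly.
    for _ in 0..1000 {
        match read_if_ready(&receiver, read_one_byte) {
            Ok(byte) => return Ok((exited_early, byte)),
            Err(err) if err.kind() == ErrorKind::WouldBlock => sleep(Duration::from_millis(1)),
            Err(err) => return Err(err),
        }
    }
    Err(ErrorKind::TimedOut.into())
}
```

In my loop the same probe would run right after a successful read_message(), which is what the peek in the workaround above does.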
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 17 (8 by maintainers)
You’re welcome! If the questions have been cleared, feel free to close the issue 😉
That’s roughly what we do. Though we can’t remove WouldBlock unless we got it from the underlying reader. But if you have such extreme optimisation needs (microseconds), I would suggest looking into tokio-tungstenite. Tokio has built the right and optimal way to work with non-blocking I/O for you, so that read_message() and co. are not called when they are not ready.