actix-web: A Stream object in a pending state won't be dropped after the client disconnects

I’m not sure if it was intended, but a Stream object in a pending state won’t be dropped after the client disconnects.

I found this issue in actix-web 2.0.0 and created a small test program, available here: https://github.com/masnagam/rust-case-studies/blob/master/actix-web-streaming-and-tcp-disconnect/src/main.rs

When executing the following command:

timeout 3 curl -v http://localhost:3000/pending

we never see PendingStream: Dropped in the log messages from the test program.

If a stream returns Poll::Ready, it will be dropped shortly. We can see that by executing:

timeout 3 curl -v http://localhost:3000/alternate

Is there any workaround to stop polling the pending stream promptly after the client disconnects?
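
For reference, the pending stream in that test program looks roughly like the sketch below (a reconstruction, not the exact code; the handler name and error type are assumptions). It never yields an item and only logs when it is finally dropped:

use std::{
    io,
    pin::Pin,
    task::{Context, Poll},
};

use actix_web::{web::Bytes, HttpResponse};
use futures_core::Stream;

// A stream that never produces data and never ends. No waker is registered,
// so it is only re-polled by external wakeups (e.g. the keep-alive timer
// discussed further down in this thread).
struct PendingStream;

impl Stream for PendingStream {
    type Item = Result<Bytes, io::Error>;

    fn poll_next(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        println!("PendingStream: Polled");
        Poll::Pending
    }
}

impl Drop for PendingStream {
    fn drop(&mut self) {
        println!("PendingStream: Dropped");
    }
}

// GET /pending responds with a chunked body backed by the pending stream.
async fn pending() -> HttpResponse {
    HttpResponse::Ok().streaming(PendingStream)
}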

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 52 (40 by maintainers)

Most upvoted comments

As it happens, I’m looking to clean up the remaining issues for the v4.4 release, including this one, in the coming few days. I’ll be assessing the open PRs that claim to solve this issue. Hopefully they include good test cases 🤞🏻.

I’ve created tokio-rs/tokio#2743 which restores TcpStream::{poll_read_ready, poll_write_ready}.

Investigated how actix-web detects a client disconnect.

When the client disconnects, poll_write() in InnerDispatcher::poll_flush() returns Poll::Ready(Ok(0)).

When write_buf is empty, InnerDispatcher::poll_flush() returns Ok(false) immediately. This is why actix-web cannot detect the client disconnect while the stream is pending.

We could detect the client disconnect quickly if we could add a check like the one below:

impl<T, S, B, X, U> Future for Dispatcher<T, S, B, X, U>
{
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        ...
                    loop {
                        ...
                        // check_disconnect() is a proposed method, not an existing one
                        if inner.check_disconnect() {
                            // client disconnected
                            return Poll::Ready(Err(...));
                        }

                        if inner.poll_flush(cx)? || !drain {
                            break;
                        }
                    }
        ...
    }
}

This seems like a pretty serious issue in actix-web, making it hard to use in scenarios that require streaming data continuously over HTTP (an increasingly popular thing to do nowadays).

@robjtede , based on this thread, when this issue was originally created three years ago there really was not a good way to fix it. Do you think things have changed since? Do you see a solution to this?

I notice there is another PR by @simorex80 #2830 that claims to fix this bug. @simorex80 and @robjtede , do you believe the PR solves the issue? Is it just a matter of adding a test for it? If so, I can look into writing such a test.

Again, this seems like a pretty significant gap in the otherwise excellent framework. I’d love to see it fixed, so that users like myself don’t need to resort to ugly workarounds.

Hi @ryzhyk, I can confirm that we have been using the fix from #2830 in production for more than a year (overriding the Cargo source definition). So far we haven’t found any issues with either normal HTTP requests or SSE requests (SSE was the reason for this fix).

How to reproduce the issue:

  1. create an actix_web handler for an SSE (server-sent events) request
  2. connect to that URL using Chrome (also reproducible with Edge)
  3. close the browser after a while
  • Without the fix, the connection on the server still exists and will never be dropped.
  • With the fix, the connection on the server is dropped after a while and the memory is freed.

This is the scenario we faced and fixed with that commit. Since the scenario is complex to reproduce, it would be great if you could help us create the test case!
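
For illustration, a handler of the kind described in step 1 might look like the sketch below (the channel-based design, names, and error type are my assumptions, not the actual production code; how the Receiver is obtained is left out). Between events, the response body stream sits in a pending state, which is exactly where the disconnect goes unnoticed without the fix:

use std::convert::Infallible;

use actix_web::{web::Bytes, HttpResponse};
use futures_util::stream;
use tokio::sync::mpsc::Receiver;

// Hypothetical SSE handler: events arrive over a channel, so between events
// the response stream is simply pending.
async fn sse(events: Receiver<String>) -> HttpResponse {
    let body = stream::unfold(events, |mut rx| async move {
        // While waiting here the stream is Poll::Pending; without the fix,
        // closing the browser tab never causes this stream to be dropped.
        let event = rx.recv().await?;
        let chunk = Bytes::from(format!("data: {event}\n\n"));
        Some((Ok::<_, Infallible>(chunk), rx))
    });

    HttpResponse::Ok()
        .content_type("text/event-stream")
        .streaming(body)
}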

@masnagam, did you manage to solve this, or do you perhaps have any ideas?

@ryzhyk There might be some workaround, but we simply switched to axum. mirakc/mirakc#223 may help you.

@masnagam

[dependencies]
actix-web = { git = "https://github.com/actix/actix-web.git", default-features = false, branch = "feature/h1_pending_timer" }

[patch.crates-io]
actix-http = { git = "https://github.com/actix/actix-web.git", branch = "feature/h1_pending_timer" }
awc = { git = "https://github.com/actix/actix-web.git", branch = "feature/h1_pending_timer" }
actix-rt = { git = "https://github.com/actix/actix-net.git", branch = "feat/net_poll_ready" }
tokio = { git = "https://github.com/fakeshadow/tokio", branch = "feature/net_poll_ready" }

With these forks, I’m able to detect the disconnection with your PendingStream example.

And, sorry for my late response… (I should study English harder)

Don’t be sorry. I’m not a native English speaker either so we are in the same boat.

The current implementation of actix-web is complex enough, so introducing new complexity is not a good idea from a maintainability point of view.

Adding a notifier for poll_read/write_ready would not add too much overall complexity. It’s more the nature of a giant future that worries me: adding more wakeup points to a giant future could have some overhead (I’m not sure about this, so I could be wrong).

So it’s acceptable, at least to me, to close this issue as a known limitation.

This issue is being worked on and is part of the actix-web v4 milestone, so please keep it open. I’m just offering my personal view; it doesn’t interfere with the process of resolving this issue.

Note that the relevant Tokio PR has been merged and the methods are available:

https://docs.rs/tokio/1.0.1/tokio/net/struct.TcpStream.html#method.poll_read_ready
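
With those methods restored, a disconnect probe could in principle look like the sketch below. This is not what actix-web does internally, just an illustration of how poll_read_ready, combined with poll_peek so that no pipelined request bytes are consumed, can distinguish a closed peer from pending request data; the helper name is made up:

use std::{
    io,
    task::{Context, Poll},
};

use tokio::{io::ReadBuf, net::TcpStream};

// Hypothetical helper: returns Ready(Ok(true)) once the peer has closed the
// connection, without consuming any bytes from the socket.
fn poll_client_gone(stream: &TcpStream, cx: &mut Context<'_>) -> Poll<io::Result<bool>> {
    // Register for a readability wakeup; "readable" means data or EOF.
    match stream.poll_read_ready(cx) {
        Poll::Pending => return Poll::Pending,
        Poll::Ready(Err(e)) => return Poll::Ready(Err(e)),
        Poll::Ready(Ok(())) => {}
    }

    // Peek one byte to tell pipelined data apart from EOF; peeking leaves the
    // byte in the socket buffer, so request pipelining is unaffected.
    let mut byte = [0u8; 1];
    let mut buf = ReadBuf::new(&mut byte);
    match stream.poll_peek(cx, &mut buf) {
        Poll::Pending => Poll::Pending,
        Poll::Ready(Err(e)) => Poll::Ready(Err(e)),
        Poll::Ready(Ok(0)) => Poll::Ready(Ok(true)),  // EOF: client disconnected
        Poll::Ready(Ok(_)) => Poll::Ready(Ok(false)), // request data is pending
    }
}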

@robjtede Yes, you can. That’s acceptable at least to me.

Thanks for all the investigation. I’ve run into this as well, and ideally the solution would be similar to what Go does, i.e. wait for the read side of the stream to close (e.g. using TcpStream::poll_read_ready()). I’m not sure how many people care about it, but right now it’s kind of a differentiator between Go’s net/http and the Tokio-based ecosystem: if knowledge of a disconnect is needed, as in a presence system, Tokio-based libraries (including actix-web) are not the best choice.

I haven’t investigated further. Great work finding that out. It’s comforting to know that the keep-alive checks were the cause of the polls.

Are we considering this correct behavior?

I concluded that it’s difficult to detect client disconnects while streams are pending.

tokio’s TcpStream seems to be based on Rust’s TcpStream, which has a traditional socket interface. A TcpStream read/write call with an empty buffer returns Ok(0), so it’s impossible to distinguish a client disconnect from success.

Issue #29 in the tokio project covers a similar topic.

I checked the behavior of tokio::io::PollEvented::poll_write_ready() on Linux. I analyzed my test program with the VS Code debugger and found that this function doesn’t detect the client disconnect. tokio::net::TcpStream::poll_write_priv() reached mio::net::TcpStream::write(), which did detect the client disconnect.

A read/write with a non-empty buffer can detect the client disconnect, but that breaks HTTP request pipelining and the HTTP response.

There might be a platform-specific workaround, but there are no useful functions in the crates that actix-web uses at the moment.

Finally, I decided to feed some data at short intervals from my stream implementation in order to detect the client disconnect quickly.
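
That workaround can be sketched as follows (an SSE-style example of my own for brevity; the real use case interleaves this filler with actual payload data). Because the stream yields a small chunk on a short timer, the dispatcher keeps writing to the socket, and the failed write after a disconnect tears the connection down and drops the stream:

use std::{convert::Infallible, time::Duration};

use actix_web::{web::Bytes, HttpResponse};
use futures_util::stream;

// Hypothetical handler illustrating the workaround: emit a tiny chunk at a
// short interval so that a broken connection is noticed on the next write.
async fn heartbeat() -> HttpResponse {
    let ticks = stream::unfold((), |()| async {
        tokio::time::sleep(Duration::from_secs(1)).await;
        // An SSE comment line: ignored by clients, but enough to force a write.
        Some((Ok::<_, Infallible>(Bytes::from_static(b":\n\n")), ()))
    });

    HttpResponse::Ok()
        .content_type("text/event-stream")
        .streaming(ticks)
}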

Investigated the periodic PendingStream::poll() calls.

The root cause of the periodic PendingStream::poll() calls is the timer for the keep-alive check, which is enabled by default. See: https://github.com/actix/actix-web/blob/master/actix-http/src/config.rs#L263

We can disable it with keep_alive(0); in that case, the periodic PendingStream::poll() calls stop.
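
For completeness, here is a sketch of disabling the keep-alive timer, assuming the actix-web 4 API, where KeepAlive::Disabled is the modern spelling of the keep_alive(0) call mentioned above:

use actix_web::{http::KeepAlive, App, HttpServer};

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| App::new())
        // Disabling keep-alive stops the timer and, with it, the periodic
        // polls of a pending stream. It does not by itself fix disconnect
        // detection; it only explains where the polls came from.
        .keep_alive(KeepAlive::Disabled)
        .bind(("127.0.0.1", 3000))?
        .run()
        .await
}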