runtime: Pipelines: suggest reworking behaviour when reader can't progress and writer is throttled

Situation: the reader can’t sensibly consume the existing data, presumably because it needs more data for a semantic frame, but the PauseWriterThreshold has been reached, so the writer is inactive.

What happens today:

System.InvalidOperationException : Advancing examined to the end would cause pipe to deadlock because FlushAsync is waiting.

This is very non-obvious to the casual reader - in particular, it is misleading about the problem:

  • the writer is correctly using the API - they have called FlushAsync() and are behaving properly
  • the reader is correctly using the API - they have signalled what data they have inspected, and that they cannot progress

The actual problem is that the PauseWriterThreshold is currently too small for the application’s requirements. IMO the exception should focus on this, rather than blaming the reader.


A casual “solution” might be:

oh, the reading code just needs to consume some of the data in its own backlog

but to my mind that completely defeats the point of pipelines - the emphasis being that the pipe is the backlog when you can’t do something partial; the workaround - especially for binary protocols - would ultimately be either to duplicate pipelines functionality, or to start allocating / leasing separate buffers. So I don’t think that’s a good solution.


My preferred solution here would be: if the reader has touched the right-hand edge (i.e. they’ve signalled that they’ve examined the range and can’t use it) - specifically, something like:

reader.AdvanceTo(..., buffer.End);

(the buffer.End being the important bit here), then ReadAsync is allowed to awaken the writer, and the writer is allowed to push more data; when they next FlushAsync, the reader is allowed to re-test for more data. This is still using the “writer is throttled” model, but it automatically resolves this situation, while allowing pipelines to do all the great stuff re backlog management and avoiding additional allocations.
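
To make that concrete, here is a minimal sketch of the reader loop I have in mind. The framing (a 4-byte big-endian length prefix followed by the payload) is purely illustrative; the point is the final AdvanceTo call that examines to buffer.End.

using System.Buffers;
using System.IO.Pipelines;
using System.Threading.Tasks;

static async Task ReadFramesAsync(PipeReader reader)
{
    while (true)
    {
        ReadResult result = await reader.ReadAsync();
        ReadOnlySequence<byte> buffer = result.Buffer;

        // Consume as many complete frames as are currently available.
        while (TryReadFrame(ref buffer, out ReadOnlySequence<byte> payload))
        {
            // ... process 'payload' ...
        }

        // Can't make progress without more data: nothing further consumed,
        // everything examined. Under the suggestion above, this is the signal
        // that should allow ReadAsync to awaken a writer paused at PauseWriterThreshold.
        reader.AdvanceTo(buffer.Start, buffer.End);

        if (result.IsCompleted) break;
    }

    reader.Complete();
}

// Illustrative framing: 4-byte big-endian length prefix followed by the payload.
static bool TryReadFrame(ref ReadOnlySequence<byte> buffer, out ReadOnlySequence<byte> payload)
{
    var seq = new SequenceReader<byte>(buffer);
    if (seq.TryReadBigEndian(out int length) && buffer.Length >= 4 + length)
    {
        payload = buffer.Slice(4, length);
        buffer = buffer.Slice(4 + length);
        return true;
    }
    payload = default;
    return false;
}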

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 11
  • Comments: 50 (40 by maintainers)

Most upvoted comments

I might have lost track of how this is progressing, but it seems to me the thresholds are getting most of the discussion.

What was the problem with

proposal: add AdvanceTo(SequencePosition consumed, SequencePosition examined, long bytesNeeded) overload; implements [magic.gif]

?

In this situation, we are looking for a way to go beyond FULL. To that end, I would say AdvanceTo(SequencePosition consumed, SequencePosition examined, long bytesNeeded) provides a better way to finely adjust the over-full capability. I definitely prefer it over working with limits, and it seems to me it’s more sustainable long term, as the extended limit could be relinquished automatically on the next consume. I don’t even want to think about what working with limits implies when pinned hardware buffers are in the mix, if you use them.
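
To illustrate the shape of that (purely hypothetical - the bytesNeeded overload does not exist in System.IO.Pipelines), a length-prefixed parser that has decoded the header but is still waiting for the payload might do something like:

// Hypothetical: 'bytesNeeded' is the proposed overload, not a real API.
// 'payloadLength' has been decoded from a 4-byte length prefix already in the buffer.
long frameLength = 4 + payloadLength;
if (buffer.Length < frameLength)
{
    // Nothing consumed, everything examined, and the pipe is told exactly how much
    // data is needed, so it could let the writer exceed PauseWriterThreshold by that
    // amount and then relinquish the extension on the next consume.
    reader.AdvanceTo(buffer.Start, buffer.End, bytesNeeded: frameLength);
}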

Or just disable the limits until further notice and monitor the memory.

The one thing I want to reiterate about that as a workaround is that it allows the writer to unnecessarily swamp the pipe. If the reader needs more data, I’m fine with that - it is “necessary” at that point; but that will often be the exception, not the rule, and there’s no need for the pipe to be stuffed full the rest of the time - which is exactly what upping or removing the quota may end up allowing.

Note: it probably makes sense for there to be two settings instead of just the one:

  • the threshold for when the reader isn’t keeping up
  • the absolute maximum pipe size (at which an error legitimately occurs)

The current setting is essentially acting like the second of these, although in many ways it claims to be the first. Since people hitting this issue will have changed that value to fix it, I suggest it would make sense to keep this setting meaning “the absolute maximum”, but change the default to something much bigger, like 1GB; the new setting would then take the current default of 32kB.
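
For reference, here is a sketch of what exists today - a single pause/resume pair on PipeOptions; the larger “absolute maximum” knob is the hypothetical part, and the values are purely illustrative:

using System.IO.Pipelines;

var pipe = new Pipe(new PipeOptions(
    pauseWriterThreshold: 32 * 1024,    // FlushAsync stops completing once this much data is unconsumed
    resumeWriterThreshold: 16 * 1024)); // the writer resumes once the backlog drops below this

// The proposed second setting - a much larger "absolute maximum pipe size"
// (e.g. 1GB) at which an error is legitimate - does not exist today.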

Possibly. Or not. Just thinking aloud here.

We’re going to be making this change as part of 3.0 https://github.com/dotnet/corefx/issues/35703. Closing this issue out. Thanks for the discussion. TL;DR: We’re going to base the limits on examined instead of consumed. The pipe will keep buffering, and max limits need to be moved into the parser.

I suspect if you do that 90% of these reports will just disappear.
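
A rough sketch of what “max limits need to be moved into the parser” means inside the read loop; MaxFrameSize is a hypothetical application-level constant and reader is the PipeReader from the surrounding loop:

const long MaxFrameSize = 1 * 1024 * 1024; // hypothetical application limit

ReadResult result = await reader.ReadAsync();
ReadOnlySequence<byte> buffer = result.Buffer;

// With the limit based on examined, the pipe keeps buffering while the reader
// waits for a complete frame, so the parser itself must reject oversized frames
// rather than relying on PauseWriterThreshold to stop runaway growth.
if (buffer.Length > MaxFrameSize)
{
    reader.Complete(new InvalidOperationException(
        $"Frame exceeds the maximum of {MaxFrameSize} bytes."));
    return;
}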

I do wonder if the default limit is too small now. Maybe we should bump it to 1MB or 4MB.

Or just disable the limits until further notice and monitor the memory.

Should I start a proper API proposal for it?

proposal: add AdvanceTo(SequencePosition consumed, SequencePosition examined, long bytesNeeded) overload; implements [magic.gif]

And it removes the need for a separate configuration option that adds confusion. Ooh, this would be a great two-for-one 😃

To rephrase this:

The point of the threshold is to slow the writer when the reader isn’t keeping up, to prevent it over-filling a pipe without reason.

But if the reader has “examined” the end, they are keeping up: they’re basically saying “keep it coming, I need more! MOAR!”. That’s why it is a slightly different situation, and one that IMO should be allowed.

It is totally legitimate to throttle the writer if the reader hasn’t examined “end” - that’s the reader’s fault. But that isn’t the scenario that I’m talking about.
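
In code terms (variable names illustrative):

// Reader is genuinely behind: part of the buffer was not even examined, so
// pausing the writer at PauseWriterThreshold is entirely legitimate.
reader.AdvanceTo(consumed, examinedPartWay);

// Reader has examined everything and simply needs more data for a complete frame:
// this is the case that should be allowed to awaken the paused writer.
reader.AdvanceTo(consumed, buffer.End);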