runtime: Out Of Memory From Pinned byte[] used in SslStream

Description

A rented byte[] in SslStream ends up pinned for I/O, contributing to OutOfMemoryExceptions under high memory pressure scenarios.

The scenario we’ve encountered is:

  1. Our application dedicates ~50% of its heap to a cache
    • Over time most of this ends up in Gen2, but at any point in time some of it is in Gen1 and Gen0.
    • We run with a hard memory limit imposed by a Job.
  2. The application is also handling a lot of HTTP requests, with underlying SslStreams.
  3. As part of handling these requests some large object heap (LOH) and pinned object heap (POH) allocations are sometimes needed.
    • Most of the POH allocations actually come from Kestrel code.
  4. Occasionally a LOH or POH allocation triggers a GC that fails to free up enough space for the allocation in the existing heap, so the GC attempts to reclaim space from the ephemeral segment to grow the heap.
  5. Despite the ephemeral segments being mostly free space, this fails because of pinned arrays obtained in SslStream's ResetReadBuffer() method.
    • More precisely, the ephemeral segments are shrunk as far as the pins allow but enough space is not freed up.
  6. The runtime raises an OutOfMemoryException, and our application crashes.

I waffled between whether this is a bug or a performance concern. Everything is technically functioning correctly, so I settled on performance concern.

Configuration

This is observed under various .NET 6 point releases, running under Windows, on Intel x64 hardware. We are using the concurrent server GC.

This appears to be possible on Windows for all hardware combinations (per discussion below, this is Windows-specific), but we have not reproduced it elsewhere.

Regression?

This is not a regression. However, recent versions of .NET introduce the pinned object heap (POH), which offers a mitigation.

Data

We diagnosed this by looking at crash dumps and observing that pinned byte[]s with 32,768 elements were always at the end of the ephemeral segments, and then using PerfView to try to catch pinning in the act. That led us to SslStream, and then a close reading of the source found an appropriately sized array (recall that ArrayPool rounds up to the nearest power of 2, so ReadBufferSize == 4_096 * 4 + FrameOverhead == 16_448 will get a byte[] with 32,768 elements).
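The size arithmetic above can be checked with a short sketch. It assumes FrameOverhead is 64 bytes, which is only inferred from the stated total of 16_448, and it assumes the shared pool is the one in use. The ArrayPool contract only guarantees an array of at least the requested length, so the rented length is printed rather than assumed.

```csharp
using System;
using System.Buffers;
using System.Numerics;

// Assumption: FrameOverhead is 64, inferred from 4_096 * 4 + FrameOverhead == 16_448.
const int FrameOverhead = 64;
const int ReadBufferSize = 4_096 * 4 + FrameOverhead; // 16_448

// Smallest power of two that is >= the requested size (32_768 for 16_448).
uint roundedUp = BitOperations.RoundUpToPowerOf2((uint)ReadBufferSize);

// Rent from the shared pool; the returned array may be larger than requested.
byte[] rented = ArrayPool<byte>.Shared.Rent(ReadBufferSize);
try
{
    Console.WriteLine($"requested:         {ReadBufferSize}");
    Console.WriteLine($"next power of two: {roundedUp}");
    Console.WriteLine($"rented length:     {rented.Length}");
}
finally
{
    // Always hand the array back so the pool can reuse it.
    ArrayPool<byte>.Shared.Return(rented);
}
```

Because the request is only 64 bytes over 16,384, nearly half of each rented array goes unused, and the whole array is what gets pinned.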

Analysis

While our particular issue appears to be caused by that array, in theory any byte[] (which is not on the POH) used for I/O in SslStream could also cause it.

Starting in .NET 5.0 the POH is available, which enables keeping pins out of the ephemeral segments. Kestrel has adopted the POH (via its PinnedBlockMemoryPool) for much of its I/O, which neatly avoids this problem (while also potentially providing some performance benefits).
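For reference, a minimal sketch of allocating an I/O buffer directly on the POH. This is not how Kestrel's PinnedBlockMemoryPool or SslStream are implemented; the buffer length is an assumption chosen to match the array size seen in the dumps.

```csharp
using System;
using System.Runtime.InteropServices;

// Assumption: 32_768 mirrors the pinned array size observed in the crash dumps.
const int BufferLength = 32_768;

// pinned: true asks the GC to allocate the array on the pinned object heap (.NET 5+),
// so it never sits in an ephemeral segment.
byte[] pohBuffer = GC.AllocateUninitializedArray<byte>(BufferLength, pinned: true);

// The array is pinned for its whole lifetime. CreateFromPinnedArray tells consumers
// of the Memory<byte> that no additional GCHandle is needed when it is pinned for I/O.
Memory<byte> ioMemory = MemoryMarshal.CreateFromPinnedArray(pohBuffer, 0, pohBuffer.Length);

Console.WriteLine($"POH buffer length: {ioMemory.Length}");
```

The tradeoff is that POH arrays are intended to be long-lived and reused, so this fits a pool of buffers better than a per-read allocation.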

I’m not familiar enough with SslStream’s implementation to say whether adopting a similar approach as a simple drop-in is viable, so I have not created a PR.

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 26 (19 by maintainers)

Most upvoted comments

@stephentoub it is slowly inching its way to production - testing suggests it’s effective, but we won’t know definitively for a bit longer.

We’ve had this in production long enough, and gathered enough dumps, to say that the zero-byte-read option has removed these pins.

Triage: Looks like a symptom that might justify changes in our SslStream - either 0-byte reads or renting buffers based on frame size. We should investigate in 8.0.

yeh, that’s possible. if you have pins that have “stretched” the heap out and those pins are still alive, we could do a full blocking GC and since we can’t move those pins they will be there. and that means all the space before the last pin on the segment will be considered in use (since we can’t let go of that free space).

this situation will be better with regions. if you are using .net 6.0 already I can give you a clrgc.dll with the 7.0 GC implementation that you can load with an env var with your 6.0 build. are you interested in trying that?

I only read the last 2 comments but just wanted to point this out -

That Gen0 almost entirely free space hints that the GC recently ran a full collection.

this is very common with a collection of any generation, not just a full collection. as long as it’s a compacting GC (and most ephemeral GCs are compacting GCs), you’d observe this when you have pins in gen0.

Just to be clear, you’re suggesting we switch

Correct, at least as an experiment.

It’s a tradeoff, of course, but if memory is your issue, it’s likely a good one. This goes beyond pinning, as it’s really about how much memory is consumed at any point in time.

One thing to keep in mind is that different streams can have different behaviors when you request 0 bytes. All of the relevant ones from the core libraries should now treat that as meaning “only complete when there’s some data available but don’t consume any of it”, but others might instead treat it as “complete immediately because the caller asked for nothing”. So if stream in your example isn’t SslStream, you’ll want to make sure it also supports the concept of zero-byte reads.
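A minimal sketch of the zero-byte read pattern described above, assuming the caller owns the read loop. The helper name ReadOnceAsync and its parameters are hypothetical. The point is that no buffer is rented (or pinned) while the read is waiting for data.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: waits for data with a zero-byte read, then rents a buffer
// only for the duration of the real read.
static async Task<int> ReadOnceAsync(
    Stream source, Stream destination, int bufferSize, CancellationToken ct = default)
{
    // Step 1: zero-byte read. On streams that support the concept, this completes
    // when data is available without consuming any of it.
    await source.ReadAsync(Memory<byte>.Empty, ct);

    // Step 2: rent a buffer now that data should be ready.
    byte[] buffer = ArrayPool<byte>.Shared.Rent(bufferSize);
    try
    {
        int read = await source.ReadAsync(buffer.AsMemory(0, bufferSize), ct);
        await destination.WriteAsync(buffer.AsMemory(0, read), ct);
        return read;
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(buffer);
    }
}

// Usage: MemoryStream never waits, so its zero-byte read returns immediately.
// That is harmless here, but it is the "complete immediately" behavior noted above.
using var source = new MemoryStream(new byte[] { 1, 2, 3 });
using var destination = new MemoryStream();
int copied = await ReadOnceAsync(source, destination, bufferSize: 16_448);
Console.WriteLine($"copied {copied} bytes");
```

With a stream that completes zero-byte reads immediately, the pattern still produces correct results. It just loses the memory benefit, because the buffer is then held while the real read waits.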