azure-storage-net: BlobWriteStream.Dispose() causes thread starvation under load

Which service (blob, file, queue, table) does this issue concern?

Blob

Which version of the SDK was used?

9.3.0

Which platform are you using? (ex: .NET Core 2.1)

.NET Core 2.1

What problem was encountered?

I’ve been troubleshooting a scalability issue with our ASP.NET Core Web API, which relies heavily on blob storage as its backing store.

We observed:

  • Massive sudden increase in response times above a certain number of concurrent clients
  • No corresponding increase in CPU usage
  • No corresponding increase in response time from blob storage

Long story short, through profiling I have finally pinpointed the issue to be here:

https://github.com/Azure/azure-storage-net/blob/38425e715e1bcdb4cab344bcb9b448c08bf8af5c/Lib/WindowsRuntime/Blob/BlobWriteStream.cs#L200-L220

The problem is the CommitAsync().Wait() call. Because the dispose pattern in .NET is synchronous, this stream implementation has to block on the commit operation, and the commit is necessary to ensure that the written blocks are actually committed to blob storage. The calling thread is therefore blocked for the duration of the commit, which under load leads to thread starvation.
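To make the pattern concrete, here is a minimal sketch of a synchronous Dispose() that has to wait on an asynchronous commit. This is not the SDK's code: BlockingCommitStream, its Committed property, and the 50 ms delay are hypothetical stand-ins for the real stream and the network round trip.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for a write stream whose Dispose() must finish an async commit.
sealed class BlockingCommitStream : IDisposable
{
    public bool Committed { get; private set; }

    // Simulates the round trip to storage with a delay (assumed value, for illustration only).
    public async Task CommitAsync()
    {
        await Task.Delay(50).ConfigureAwait(false);
        Committed = true;
    }

    // Dispose() is synchronous, so the calling thread is parked in Wait()
    // until the commit completes, unless the caller has already committed.
    public void Dispose()
    {
        if (!Committed)
        {
            CommitAsync().Wait();
        }
    }
}
```

Each concurrent request that disposes an uncommitted stream holds a thread for the full commit latency, which is consistent with response times climbing while CPU usage stays flat.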

Have you found a mitigation/solution?

One solution would be for our service to do await stream.CommitAsync(); before disposing the stream, but unfortunately BlobWriteStream is marked as internal, so this type is not visible to our code. The only solution I can think of is to make BlobWriteStream public, allowing client code to commit the stream asynchronously before disposing it.

About this issue

  • State: open
  • Created 6 years ago
  • Reactions: 1
  • Comments: 16 (5 by maintainers)

Most upvoted comments

@asorrin-msft You are right - that should work without any code change actually, because BlobWriteStream inherits from CloudBlobStream. So from client code we should be able to do something like:

using (var stream = await blob.OpenWriteAsync())
{
  // Do some writing to the stream here...

  if (stream is CloudBlobStream blobStream)
    await blobStream.CommitAsync();
}

The issue of discoverability is something I’ve also been thinking about. I’m afraid an XML doc comment on `OpenWriteAsync()` will not go very far.

Might one consider taking this so far as to actually throw from Dispose() if the stream has not been committed, thus always forcing the pattern of committing (asynchronously) before disposing? At least that way, it will be a fail-fast and obvious thing very early in the dev lifecycle, instead of a very elusive and hard-to-diagnose scalability problem 18 months down the road when the application is already in production…
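A rough sketch of what that fail-fast behaviour could look like, assuming a hypothetical FailFastCommitStream type (the name, the 10 ms delay, and the exception message are made up for illustration and are not part of the SDK):

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical fail-fast variant: Dispose() never blocks on a commit; it throws instead.
sealed class FailFastCommitStream : IDisposable
{
    private bool _committed;

    // Stand-in for the real asynchronous commit.
    public async Task CommitAsync()
    {
        await Task.Delay(10).ConfigureAwait(false);
        _committed = true;
    }

    // Throws if the caller forgot to await CommitAsync() before disposing.
    public void Dispose()
    {
        if (!_committed)
        {
            throw new InvalidOperationException(
                "Call CommitAsync() before disposing the stream.");
        }
    }
}
```

One trade-off to weigh: if the body of a using block throws before the commit, an exception thrown from Dispose() replaces the original one, so a real implementation would probably need a way to abandon the stream without committing.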

@justinSelf Duh, of course, stupid of me. The CommitAsync() happening as part of Dispose() was the whole reason why I opened this issue in the first place. 🤣