azure-sdk-for-java: [BUG] Possible Memory Leak in Storage Blobs

Filing this here on behalf of @jdubois

This issue was reported by a customer listing a very large number of blobs.

This uses the new blob storage API. Note that these are my notes from discussing with the client; I haven’t reproduced the case myself (for obvious reasons, you need a lot of blobs!), but I am fairly confident he is correct.

Doing a “blobContainerClient.listBlobs().streamByPage()” leaks memory when the container holds a huge number of blobs, so a reference is probably being retained somewhere along that path.
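For context, here is a minimal sketch of the listing pattern described above. The connection string environment variable and container name are assumptions for illustration; this is not the customer's actual code.

```java
import com.azure.storage.blob.BlobContainerClient;
import com.azure.storage.blob.BlobContainerClientBuilder;
import com.azure.storage.blob.models.BlobItem;
import com.azure.core.http.rest.PagedResponse;
import java.util.stream.Stream;

public class ListAllBlobs {
    public static void main(String[] args) {
        // Hypothetical setup: credentials and container name are placeholders.
        BlobContainerClient client = new BlobContainerClientBuilder()
            .connectionString(System.getenv("AZURE_STORAGE_CONNECTION_STRING"))
            .containerName("my-container")
            .buildClient();

        // Page through every blob in the container; with millions of blobs,
        // this is the call path where the leak was observed.
        try (Stream<PagedResponse<BlobItem>> pages = client.listBlobs().streamByPage()) {
            pages.forEach(page ->
                page.getValue().forEach(blob -> System.out.println(blob.getName())));
        }
    }
}
```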

Here is his workaround, which does work in production:

  • use streamByPage rather than iterableByPage, since the returned stream has a close method
  • force-clear the body of HTTP requests using the Reactor API: response.getRequest().setBody(Flux.never());
  • disable the buffer copies in Netty: new NettyAsyncHttpClientBuilder().disableBufferCopy(true).build();
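The first and third items above can be sketched together as follows. This is a hedged illustration, not the customer's code: the connection string env var and container name are placeholders, and the second workaround (setBody(Flux.never())) is omitted because it is applied wherever the raw HttpResponse is intercepted, which depends on the application's pipeline.

```java
import com.azure.core.http.HttpClient;
import com.azure.core.http.netty.NettyAsyncHttpClientBuilder;
import com.azure.core.http.rest.PagedResponse;
import com.azure.storage.blob.BlobContainerClient;
import com.azure.storage.blob.BlobContainerClientBuilder;
import com.azure.storage.blob.models.BlobItem;
import java.util.stream.Stream;

public class WorkaroundSketch {
    public static void main(String[] args) {
        // Workaround: build a Netty HTTP client with deep buffer copies disabled...
        HttpClient httpClient = new NettyAsyncHttpClientBuilder()
            .disableBufferCopy(true)
            .build();

        // ...and plug it into the blob client (placeholders for illustration).
        BlobContainerClient client = new BlobContainerClientBuilder()
            .connectionString(System.getenv("AZURE_STORAGE_CONNECTION_STRING"))
            .containerName("my-container")
            .httpClient(httpClient)
            .buildClient();

        // Workaround: use streamByPage() inside try-with-resources so the
        // stream (and whatever it holds onto) is closed after iteration.
        try (Stream<PagedResponse<BlobItem>> pages = client.listBlobs().streamByPage()) {
            pages.forEach(page -> {
                // process page.getValue() here
            });
        }
    }
}
```

Disabling buffer copies matters here because, by default, the Netty client deep-copies response buffers; with millions of pages those copies add significant allocation pressure.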

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 22 (10 by maintainers)

Most upvoted comments

Dear awesome MS team, a quick follow-up to let you know we’ve upgraded our compliance scan app to fully use the reactive, parallel and paging capabilities of the SDK (ParallelFlux<PagedResponse<BlobItem>>), and we’re now able to browse 19,110,738 files over 330 terabytes in less than one hour.

You’ve built a finely tuned piece of software, thanks 👏

Fixed in #15929

@alzimmermsft I’ve run that and went through 49,670 pages before I gave up. The memory chart formed a nice, healthy sawtooth pattern, so it seems the fix solves this issue as well.