azure-sdk-for-java: [BUG] Possible Memory Leak in Storage Blobs
Filing this here on behalf of @jdubois
This issue was reported by a customer listing a very large number of blobs.
This uses the new API for blob storage. Please note that these are my notes from discussing with the client; I haven't reproduced the case myself (for obvious reasons, you need a lot of blobs!), but I am fairly confident he is correct.
Doing a `blobContainerClient.listBlobs().streamByPage()` causes memory leaks when you have a huge number of blobs, so a reference is probably being retained somewhere.
Here is his workaround, which does indeed work in production:
- use `streamByPage` and not `iterableByPage`, as it has a `close` method
- force-clean the body of HTTP requests using the Reactor API: `response.getRequest().setBody(Flux.never());`
- disable the buffer copies in Netty: `new NettyAsyncHttpClientBuilder().disableBufferCopy(true).build();`
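The first workaround works because `streamByPage` returns a `java.util.stream.Stream`, which is `AutoCloseable` and can therefore release underlying resources deterministically via try-with-resources, whereas plain iteration over `iterableByPage` offers no such hook. A minimal stdlib-only sketch of that mechanism (the page names and the `onClose` cleanup are illustrative, not the SDK's actual internals):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

public class StreamCloseDemo {
    public static void main(String[] args) {
        AtomicBoolean released = new AtomicBoolean(false);
        List<String> seen = new ArrayList<>();

        // Hypothetical stand-in for streamByPage(): a Stream whose onClose
        // hook releases underlying resources (connections, buffers, ...).
        try (Stream<String> pages = Stream.of("page-1", "page-2", "page-3")
                .onClose(() -> released.set(true))) {
            pages.forEach(seen::add);
        } // try-with-resources calls close() here, firing the onClose hook

        System.out.println("pages=" + seen.size() + " released=" + released.get());
    }
}
```

Without the try-with-resources block (or an explicit `close()`), the `onClose` hook never runs, which is the shape of leak the workaround avoids.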
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 22 (10 by maintainers)
Dear awesome MS team, a quick follow-up to let you know we've upgraded our compliance-scan app to fully use the reactive, parallel, and paging capabilities of the SDK (`ParallelFlux<PagedResponse<BlobItem>>`), and we're now able to browse 19,110,738 files over 330 terabytes in less than one hour. You've built a finely tuned piece of software, thanks 👏
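For readers curious what that `ParallelFlux<PagedResponse<BlobItem>>` pipeline might look like, here is a rough, untested sketch assuming `azure-storage-blob` and Reactor are on the classpath; the container name, page size, and parallelism are hypothetical tuning values, not taken from the commenter's app:

```java
import com.azure.core.http.rest.PagedResponse;
import com.azure.storage.blob.BlobContainerAsyncClient;
import com.azure.storage.blob.BlobServiceClientBuilder;
import com.azure.storage.blob.models.BlobItem;
import reactor.core.publisher.Flux;
import reactor.core.publisher.ParallelFlux;
import reactor.core.scheduler.Schedulers;

public class ParallelBlobScan {
    public static void main(String[] args) {
        // Hypothetical client setup; credentials come from the environment.
        BlobContainerAsyncClient container = new BlobServiceClientBuilder()
                .connectionString(System.getenv("AZURE_STORAGE_CONNECTION_STRING"))
                .buildAsyncClient()
                .getBlobContainerAsyncClient("my-container");

        // byPage() turns the PagedFlux<BlobItem> into Flux<PagedResponse<BlobItem>>;
        // parallel() fans pages out across rails for concurrent processing.
        ParallelFlux<PagedResponse<BlobItem>> pages = container.listBlobs()
                .byPage(5000)                       // preferred page size (illustrative)
                .parallel(8)                        // rail count (workload-specific)
                .runOn(Schedulers.boundedElastic());

        long total = pages
                .flatMap(page -> Flux.fromIterable(page.getValue()))
                .sequential()
                .count()
                .block();

        System.out.println("blobs scanned: " + total);
    }
}
```

The key design point is processing whole pages, not individual blobs, on each rail, which keeps the per-item overhead amortized across the service's page size.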
Fixed in #15929
@alzimmermsft I ran that and went through 49,670 pages before I gave up. The memory chart formed a nice, healthy sawtooth pattern, so it seems the fix solves this issue as well.