azure-sdk-for-java: [QUERY] How to alleviate Timeouts in List Blobs operation?
Query/Question: How to alleviate timeouts in the List Blobs operation?
The timeout is set to 30s, which is the maximum permissible for the Blob Service (as per the Azure documentation). The maximum number of keys for listing (maxResultsPerPage) is left at the default of 5000. The containers (buckets) being listed are large, with 100k+ objects.
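One knob worth noting (a sketch, not something from the original issue): ListBlobsOptions exposes setMaxResultsPerPage, so smaller pages can be requested to keep each per-page round trip well inside the 30s budget. The page size of 1000 below is an arbitrary assumption.
ListBlobsOptions options = new ListBlobsOptions()
        .setPrefix("")
        .setMaxResultsPerPage(1000); // smaller pages -> each per-page request returns sooner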
I know that adding a retry is one possibility, but I would prefer an alternative if one exists.
The timeout exception is given below:
Caused by: reactor.core.Exceptions$ReactiveException: java.util.concurrent.TimeoutException: Did not observe any item or terminal signal within 30000ms in 'flatMap' (and no fallback has been configured)
at reactor.core.Exceptions.propagate(Exceptions.java:393) ~[observer-3.20.92.jar:na]
at reactor.core.publisher.BlockingIterable$SubscriberIterator.hasNext(BlockingIterable.java:168) ~[observer-3.20.92.jar:na]
at reactor.core.publisher.BlockingIterable$SubscriberIterator.next(BlockingIterable.java:198) ~[observer-3.20.92.jar:na]
at kdc.cloudadapters.adapters.MicrosoftAzureAdapter$AzureListRequest.nextBatch(MicrosoftAzureAdapter.java:566) ~[observer-3.20.92.jar:na]
... 9 common frames omitted
Caused by: java.util.concurrent.TimeoutException: Did not observe any item or terminal signal within 30000ms in 'flatMap' (and no fallback has been configured)
at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.handleTimeout(FluxTimeout.java:289) ~[observer-3.20.92.jar:na]
at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.doTimeout(FluxTimeout.java:274) ~[observer-3.20.92.jar:na]
at reactor.core.publisher.FluxTimeout$TimeoutTimeoutSubscriber.onNext(FluxTimeout.java:396) ~[observer-3.20.92.jar:na]
at reactor.core.publisher.StrictSubscriber.onNext(StrictSubscriber.java:89) ~[observer-3.20.92.jar:na]
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onNext(FluxOnErrorResume.java:73) ~[observer-3.20.92.jar:na]
at reactor.core.publisher.MonoDelay$MonoDelayRunnable.run(MonoDelay.java:117) ~[observer-3.20.92.jar:na]
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:68) ~[observer-3.20.92.jar:na]
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:28) ~[observer-3.20.92.jar:na]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_252]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_252]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[na:1.8.0_252]
... 3 common frames omitted
Additional information
A shortened variation of the code used is given below:
import com.azure.core.http.rest.PagedResponse;
import com.azure.storage.blob.BlobContainerClient;
import com.azure.storage.blob.BlobServiceClient;
import com.azure.storage.blob.BlobServiceClientBuilder;
import com.azure.storage.blob.models.BlobItem;
import com.azure.storage.blob.models.ListBlobsOptions;
import com.azure.storage.common.StorageSharedKeyCredential;
import java.time.Duration;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

class AzureList {
    private final BlobContainerClient container;
    private Iterator<PagedResponse<BlobItem>> iterator;
    private String continuationToken;

    public AzureList(String accountName, String accountKey, String bucketName) {
        StorageSharedKeyCredential credential = new StorageSharedKeyCredential(accountName, accountKey);
        String endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName);
        BlobServiceClient serviceClient = new BlobServiceClientBuilder()
                .credential(credential)
                .endpoint(endpoint)
                .buildClient();
        container = serviceClient.getBlobContainerClient(bucketName);
        continuationToken = null;
        // Current use case is just "" as prefix but can be different in the future
        iterator = getIterator(/* prefix */ "");
    }

    private Iterator<PagedResponse<BlobItem>> getIterator(String prefix) {
        ListBlobsOptions options = new ListBlobsOptions().setPrefix(prefix);
        // 30s is the maximum timeout the Blob Service permits for a list operation
        return container.listBlobs(options, continuationToken, Duration.ofSeconds(30L))
                .iterableByPage()
                .iterator();
    }

    public void iterate() {
        List<BlobItem> blobs;
        do {
            blobs = listBlobs();
            // hand off the blob list to a different consumer class
        } while (continuationToken != null);
    }

    private List<BlobItem> listBlobs() {
        PagedResponse<BlobItem> pagedResponse = iterator.next();
        List<BlobItem> blobs = pagedResponse.getValue();
        continuationToken = pagedResponse.getContinuationToken();
        return blobs;
    }
}
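For reference, driving the class looks like this (the account name, key, and container name below are placeholders):
AzureList lister = new AzureList("myaccount", "<account-key>", "mycontainer");
lister.iterate(); // walks every page; blocks until the listing is exhausted or a timeout propagates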
Why is this not a Bug or a Feature Request? I am unsure whether it is a bug or an issue with my local environment / my code.
Setup (please complete the following information if applicable):
- OS: Ubuntu 18.04
- IDE: IntelliJ 19.1.4
- SDK: azure-storage-blob v12.7.0
Information Checklist Kindly make sure that you have added all of the following information above and checked off the required fields, otherwise we will treat the issue as an incomplete report
- Query Added
- Setup information Added
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 65 (33 by maintainers)
I see. Makes sense. Since Feb was a beta release, I expect March will be a GA release.
@somanshreddy We have released a new GA version of the SDK. Could you please give it a try and see if it addresses your problem? If it does, could you also please close the issue?
This should be out by February.
It appears this was fixed by azure-core 1.11.0 and azure-core-http-netty 1.7.0. azure-storage-blob 12.9.0 depends on versions 1.10.0 and 1.6.3, so it won't have the fix. If you include the newer versions of azure-core and azure-core-http-netty in your project directly, they will be used in place of the versions that azure-storage-blob depends on; this is safe, as the newer versions are backward compatible. Once a version of azure-storage-blob that depends on the fixed versions, or newer, is available, you should be able to remove the direct dependencies on azure-core and azure-core-http-netty.
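As a sketch, the direct-dependency override described above looks like this in a Maven pom.xml (the version numbers are the ones named in the comment; check for newer releases):
<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-core</artifactId>
    <version>1.11.0</version>
</dependency>
<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-core-http-netty</artifactId>
    <version>1.7.0</version>
</dependency>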
Hi @somanshreddy, I just merged this PR (https://github.com/Azure/azure-sdk-for-java/pull/17699), which should have the HTTP client eagerly read the response body when we know it will be deserialized. This should reduce the number of occurrences where a TimeoutException or PrematureCloseException is thrown from the SDK, by completing more of the HTTP response consumption within the scope of our retry logic. These changes should be available from Maven after our next SDK release.
Yes, DEBUG logging should include information about the number of active and inactive connections within the connection pool, along with other information surrounding requests and responses.
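For example (an assumption about the logging setup, not something stated in the thread), with Logback you could surface reactor-netty's connection-pool activity like this in logback.xml:
<!-- logback.xml: surface reactor-netty connection pool and request/response activity -->
<logger name="reactor.netty" level="DEBUG"/>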
@somanshreddy We are hoping to release it as part of our November release, in a week or two.
@somanshreddy, I’ve taken a look into the exception being returned and not retried. Write and response timeouts will be retried when they occur, because they happen while sending the request and awaiting the response; read timeouts may or may not be retried.
Read timeouts don’t have an explicit guarantee of being retried because consumption of the response body may begin in a different location. Generally, we do not begin reading the body until we’ve reached our deserialization logic, which happens outside of the context of our HttpPipeline, and therefore outside the scope of the RequestRetryPolicy/RetryPolicy that would attempt to reprocess the request. Given this, for the time being it would be best to retain your external try/catch block. Scenarios where sending the request or receiving the response headers takes longer than expected would be handled by the SDK; the last read getting stuck would need to be caught externally.
I’ll be investigating solutions to this issue so that the SDK is able to handle all three timeout scenarios safely.
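A minimal sketch of the external try/catch recommended above, written as an extra method on the AzureList class from the question (the retry limit of 3 and the bare iterator re-creation are assumptions, not SDK guidance):
// Resume the listing from the last saved continuation token when a read timeout escapes the SDK.
public void iterateWithRetry() {
    int attempts = 0;
    while (true) {
        try {
            iterate(); // pages until continuationToken is null
            return;
        } catch (RuntimeException e) {
            // Reactor wraps the checked TimeoutException; unwrap it before deciding to retry
            if (!(reactor.core.Exceptions.unwrap(e) instanceof java.util.concurrent.TimeoutException)
                    || ++attempts > 3) {
                throw e;
            }
            // Re-create the page iterator; getIterator() passes the saved continuationToken
            // to listBlobs, so the listing resumes where it left off
            iterator = getIterator(/* prefix */ "");
        }
    }
}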
@somanshreddy The HTTP client timeouts are on a per-request basis, so they do not include retries. The API timeouts are per operation, so they do include retries.
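To make the distinction concrete, here is a hedged sketch: a per-request response timeout set on the HTTP client versus the per-operation timeout passed to listBlobs, which caps the whole call including retries. The responseTimeout builder method exists on recent azure-core-http-netty versions; treat its availability in your version, and the exact values, as assumptions.
import com.azure.core.http.HttpClient;
import com.azure.core.http.netty.NettyAsyncHttpClientBuilder;

// Per-request: each HTTP attempt gets its own 60s budget, so retries are not counted against it
HttpClient httpClient = new NettyAsyncHttpClientBuilder()
        .responseTimeout(Duration.ofSeconds(60))
        .build();

BlobServiceClient serviceClient = new BlobServiceClientBuilder()
        .credential(credential)
        .endpoint(endpoint)
        .httpClient(httpClient)
        .buildClient();

// Per-operation: this Duration caps the whole listBlobs call, including any retries inside it
container.listBlobs(options, continuationToken, Duration.ofSeconds(30L));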