azure-sdk-for-java: [core-amqp] onTransportError does not retry to get AMQP Connection
When the AMQP connection gets a transport error, it does not retry to obtain a new AMQP connection. Issue observed in: core-amqp 2.0.4
This error shows up both in ServiceBus and EventHubs.
Replicating the issue:
A transport error can have many different causes, but one way I reproduce it is to start an async receiver and then disconnect from the internet for 5 minutes.
Here is what we see in the logs:
com.azure.core.amqp.implementation.handler.ConnectionHandler - onTransportError hostname[eh-test-t2.servicebus.windows.net], connectionId[MF_ca8f0f_1619737351426], error[An existing connection was forcibly closed by the remote host]
com.azure.core.amqp.implementation.ReactorConnection - onConnectionShutdown connectionId[MF_ca8f0f_1619737351426], hostName[eh-test-t2.servicebus.windows.net], message[Shutting down], shutdown signal[false]
Commit point just before the AMQP Connection fixes in April: a8a39f8c2d216da5ee2e1c9f2a162d20cc97ab60
My observation is that the code at the above commit treats this error as transient and retries, since it calls the endpointsState/onError consumer here: https://github.com/Azure/azure-sdk-for-java/blob/master/sdk/core/azure-core-amqp/src/main/java/com/azure/core/amqp/implementation/AmqpChannelProcessor.java#L104
The latest core-amqp 2.0.4, however, calls the endpointsState/onComplete consumer here: https://github.com/Azure/azure-sdk-for-java/blob/master/sdk/core/azure-core-amqp/src/main/java/com/azure/core/amqp/implementation/AmqpChannelProcessor.java#L107
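To make the difference between the two code paths concrete, here is a minimal sketch of the behavior described above. It is a hypothetical model and not the actual AmqpChannelProcessor code: the class, enum, and method names are invented for illustration, and it assumes only what is observed in this issue, namely that an error signal leads to a request for a new connection while a completion signal does not.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the behavior described in this issue.
// None of these names are taken from azure-core-amqp.
final class EndpointStateSketch {
    /** The two ways the endpoint-state stream can end in this model. */
    enum Terminal { ERROR, COMPLETE }

    /** Stands in for the component that holds the current AMQP connection. */
    static final class ChannelHolder {
        final List<String> log = new ArrayList<>();
        boolean terminated = false;
        int connectionsRequested = 1; // the initial connection

        void onEndpointTerminal(Terminal signal) {
            if (signal == Terminal.ERROR) {
                // Error path: treated as transient, so ask for a new connection.
                connectionsRequested++;
                log.add("ERROR -> requested a new connection");
            } else {
                // Completion path: the holder stops and nothing is re-requested.
                terminated = true;
                log.add("COMPLETE -> terminated, no new connection");
            }
        }
    }

    public static void main(String[] args) {
        ChannelHolder errorPath = new ChannelHolder();
        errorPath.onEndpointTerminal(Terminal.ERROR);

        ChannelHolder completePath = new ChannelHolder();
        completePath.onEndpointTerminal(Terminal.COMPLETE);

        System.out.println(errorPath.log);
        System.out.println(completePath.log);
    }
}
```

If the transport error is surfaced as a completion rather than an error, a receiver built on a holder like this would stay idle after the network comes back, which matches the symptom reported here.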
Attached files:
- Intellij-eventhubs-disconnect-2minutes-event-consumer_core-amqp-version2.0.4.txt: shows this issue in the latest Event Hubs.
- core-amqp-version-before-amqp-connection-issuefix-wifi-considered-as-tranisient-error.txt: shows the older version of core-amqp treating it as a transient error.
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 23 (10 by maintainers)
We’ve investigated the onTransportError, which was the original issue, and validated that the low-level (sync and async) Event Hubs and Service Bus clients can recover from these errors. In addition, when the retry options are exhausted, both the Service Bus and Event Hubs low-level receivers (sync and async) propagate the retry-exhausted exception; it is not swallowed and does not leave the receiver hanging.
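As an illustration of what "propagating the retry-exhausted exception" means, here is a generic retry loop in plain Java. This is a sketch of the general pattern only, not the SDK's retry implementation: `callWithRetry` and `RetriesExhaustedException` are hypothetical names, backoff delays are omitted for brevity, and `maxAttempts` is assumed to be at least 1.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Generic retry pattern; the names here are hypothetical and not part of azure-core-amqp.
final class RetryExhaustionSketch {
    /** Thrown to the caller once every attempt has failed. */
    static final class RetriesExhaustedException extends RuntimeException {
        RetriesExhaustedException(int attempts, Throwable cause) {
            super("Retries exhausted after " + attempts + " attempts", cause);
        }
    }

    /** Runs the operation up to maxAttempts times and surfaces the final failure. */
    static <T> T callWithRetry(Callable<T> operation, int maxAttempts) {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                last = e; // keep the most recent failure as the cause
            }
        }
        // Not swallowed: the caller learns that recovery failed and why.
        throw new RetriesExhaustedException(maxAttempts, last);
    }

    public static void main(String[] args) {
        try {
            callWithRetry(() -> {
                throw new IOException("connection forcibly closed");
            }, 3);
        } catch (RetriesExhaustedException e) {
            System.out.println(e.getMessage() + "; cause: " + e.getCause().getMessage());
        }
    }
}
```

The point of the pattern is that the caller always gets either a result or a terminal exception, so a receiver never sits idle without a signal.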