azure-sdk-for-net: Event Hub Producer Client should expire transport producers after a period of inactivity
Summary
In order to publish events to a specific partition, the EventHubProducerClient needs to manage a pool of transport producers, each bound to a specific partition. These transport producers are created on demand when a request is made for a specific partition. Once created, their lifespan is tied to that of the EventHubProducerClient that created them.
In general, this is not a concern, as publishing to a specific partition is not a mainline scenario. As an efficiency, however, it would be desirable to expire transport producers that have not been accessed in a given (configurable) period of time, so that unused transport producers are disposed when possible, independently of the EventHubProducerClient.
Scope of Work
- Implement a means of tracking the last time that a transport producer for a given partition was used by the EventHubProducerClient that owns it. (A rough sketch of the overall shape follows this list.)
- Implement a periodic check for the transport producers owned by an EventHubClient instance and dispose of those which have not been accessed for a given (configurable) period of time.
- Ensure that removing the existing transport producer is an atomic operation and does not cause corruption or unexpected results should there be a request to publish to that partition during the check/removal.
- Ensure that removing the existing transport producer does not block creation of a new instance in the event of concurrent calls, nor cause the publishing of events for other partitions to be blocked. The producer client should be able to handle concurrent calls.
- Tests have been added or adjusted to cover the new functionality.
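To make the intent concrete, a minimal sketch of the general shape is shown below. The names used (TransportProducerPool, PoolItem, ITransportProducer) and the ten-minute default are illustrative assumptions only, not the actual types or values of the client.

```csharp
// Hypothetical sketch only: the names (TransportProducerPool, PoolItem, ITransportProducer)
// and the ten-minute default are illustrative, not the types or values used by the client.
using System;
using System.Collections.Concurrent;
using System.Threading;

public interface ITransportProducer : IAsyncDisposable { }

public class TransportProducerPool
{
    private readonly ConcurrentDictionary<string, PoolItem> _pool = new();
    private readonly Func<string, ITransportProducer> _producerFactory;
    private readonly Timer _expirationTimer;

    public TransportProducerPool(Func<string, ITransportProducer> producerFactory,
                                 TimeSpan? expirationInterval = null)
    {
        _producerFactory = producerFactory;
        var interval = expirationInterval ?? TimeSpan.FromMinutes(10);

        // Periodically sweep the pool and dispose of producers that have sat idle too long.
        _expirationTimer = new Timer(_ => ExpireIdleProducers(interval), null, interval, interval);
    }

    public ITransportProducer GetProducer(string partitionId)
    {
        // Create on demand; refresh the "last used" stamp on every access.
        var item = _pool.GetOrAdd(partitionId, id => new PoolItem(_producerFactory(id)));
        item.LastUsed = DateTimeOffset.UtcNow;
        return item.Producer;
    }

    private void ExpireIdleProducers(TimeSpan maxIdle)
    {
        foreach (var (partitionId, item) in _pool)
        {
            if (DateTimeOffset.UtcNow - item.LastUsed < maxIdle)
            {
                continue;
            }

            // TryRemove is atomic; a publish that raced ahead of the removal still holds a
            // usable reference, and the next request for the partition creates a new producer.
            if (_pool.TryRemove(partitionId, out var removed))
            {
                _ = removed.Producer.DisposeAsync();
            }
        }
    }

    private class PoolItem
    {
        public PoolItem(ITransportProducer producer) => Producer = producer;
        public ITransportProducer Producer { get; }
        public DateTimeOffset LastUsed { get; set; } = DateTimeOffset.UtcNow;
    }
}
```

The sketch leans on ConcurrentDictionary.GetOrAdd and TryRemove so that creation and removal each remain atomic, which is the property the scope items above call out.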
Success Criteria
- The refactoring and enhancements detailed by the scope have been completed.
- The tests necessary for its validation have been created or adjusted and pass reliably.
- The existing test suite continues to produce deterministic results and pass reliably.
References
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 25 (25 by maintainers)
Commits related to this issue
- [Event Hubs Client] PR amends on transport producer expiration (#8592) (#9431) — committed to albertodenatale/azure-sdk-for-net by albertodenatale 4 years ago
- [Event Hubs Client] Merged from upstream (#8592) (#9431) — committed to albertodenatale/azure-sdk-for-net by albertodenatale 4 years ago
- [Event Hubs Client] fixed AZC0102 and AZC0106 (#8592) (#9431) — committed to albertodenatale/azure-sdk-for-net by albertodenatale 4 years ago
- [Event Hubs Client] PR amends (#8592) (#9431) — committed to albertodenatale/azure-sdk-for-net by albertodenatale 4 years ago
- [Event Hubs Client] Reverted "fixed AZC0102 and AZC0106" (#8592) (#9431) — committed to albertodenatale/azure-sdk-for-net by albertodenatale 4 years ago
- [Event Hubs Client] Changed comments and variable names in EventHubProducerClient. (#8592) (#9431) — committed to albertodenatale/azure-sdk-for-net by albertodenatale 4 years ago
- [Event Hubs Client] Error handling of Timer.Dispose() (#8592) (#9431) — committed to albertodenatale/azure-sdk-for-net by albertodenatale 4 years ago
- [Event Hubs Client] Fixed tests not building (#8592) (#9431) — committed to albertodenatale/azure-sdk-for-net by albertodenatale 4 years ago
- [Event Hubs Client] Fixed tests not building (#8592) (#9431) — committed to albertodenatale/azure-sdk-for-net by albertodenatale 4 years ago
- [Event Hubs Client] Removed exception rethrown (#8592) (#9431) — committed to albertodenatale/azure-sdk-for-net by albertodenatale 4 years ago
- [Event Hubs Client] Removed exception rethrown (#8592) (#9431) — committed to albertodenatale/azure-sdk-for-net by albertodenatale 4 years ago
- [Event Hub Client] Expiring transport producers #8592 (#9431) The focus of these changes is to add a pool for transport producers used by the EventHubProducerClient which manages their lifespan using... — committed to Azure/azure-sdk-for-net by albertodenatale 4 years ago
Of course not. We can’t refine things if we don’t bounce them around. 😄
The lock on every call concerns me. Though I don't expect a partition-bound producer to be the norm, having it that way removes any benefit from the concurrent dictionary. When I had mentioned something two-phase, I was thinking along the lines of the sketch below.
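(The names here, such as PartitionProducerPool, _sweepPending, and ExpireIdleProducers, are placeholders invented purely for the illustration; this is only a sketch of the two-phase idea, not the client's actual code.)

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Illustrative stand-in for the real transport type.
public class TransportProducer { }

public class PartitionProducerPool
{
    private readonly ConcurrentDictionary<string, TransportProducer> _producers = new();
    private readonly object _removalLock = new();
    private int _sweepPending;   // 0 = no expiration running; 1 = a sweep may remove items

    public TransportProducer GetProducer(string partitionId)
    {
        // Fast path: when no sweep is pending, the concurrent dictionary alone is enough
        // and no lock is taken.
        if (Volatile.Read(ref _sweepPending) == 0)
        {
            return _producers.GetOrAdd(partitionId, _ => new TransportProducer());
        }

        // Slow path: an expiration sweep may be removing items, so coordinate with it.
        lock (_removalLock)
        {
            return _producers.GetOrAdd(partitionId, _ => new TransportProducer());
        }
    }

    public void ExpireIdleProducers(Func<string, bool> isIdle)
    {
        // Flip the flag first so new callers fall onto the slow path while removals happen.
        // Note: a caller that read the flag just before it flipped may still be holding a
        // producer that gets removed here; that is the part needing refinement.
        Volatile.Write(ref _sweepPending, 1);

        try
        {
            lock (_removalLock)
            {
                foreach (var partitionId in _producers.Keys)
                {
                    if (isIdle(partitionId))
                    {
                        _producers.TryRemove(partitionId, out _);
                    }
                }
            }
        }
        finally
        {
            Volatile.Write(ref _sweepPending, 0);
        }
    }
}
```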
Obviously, that is just a rough sketch off the top of my head. It certainly needs more refinement, but the idea is that we avoid taking the lock path unless we think there's a pending removal. Since this isn't critical resource management, we don't have to be super-eager about cleaning; we can wait an extra tick or two.
I'd still rank this as a fallback to the cache if that doesn't work out.
Your turn for riffing on thoughts. 😄
Oh yeah, I’d definitely call that conclusive; you went right down to the metal and grabbed the transport producer directly. That’s about as low-level as we functionally can get. If we had a problem with opening multiple links to that same resource, there would be a hard error rejecting or force-closing one.
I'd like to spin out some further thoughts that I think could simplify things and help to unify the "what happens with a MemoryCache" and "what happens with a ConcurrentDictionary" scenarios.
Here’s the gist:
When the TransportProducerPool returns a producer, the caller provides a key that is logged as an active user of the item, and the "last used" time would be updated.
That does come with a potential trade-off. In an exception situation, it is possible that a producer would be removed from the pool but the last active user would not properly close it, leaving the producer in limbo. However, the resources used are bound to an idle timeout, so in the worst case the producer would live until the idle timeout closed the underlying AMQP link. Since we own the code that would be responsible for managing the pooled producer and closing things, I think that's a small exception case that I'm comfortable with.
I would absolutely not trust this code. It's just to illustrate, and it definitely requires deeper thought than I've given it yet.
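Roughly what I'm picturing is below; CheckOut/CheckInAsync, PoolItem, and the GUID key are names invented just for the illustration, not the real surface area.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Illustrative stand-in for the real transport type.
public class TransportProducer
{
    public Task CloseAsync() => Task.CompletedTask;
}

public class TransportProducerPool
{
    private readonly ConcurrentDictionary<string, PoolItem> _pool = new();

    // The caller gets a producer plus the pool item tracking it; the caller's key is
    // recorded as an active user and the "last used" time is refreshed.
    public (TransportProducer Producer, PoolItem Item) CheckOut(string partitionId, Guid callerKey)
    {
        var item = _pool.GetOrAdd(partitionId, _ => new PoolItem(new TransportProducer()));
        item.ActiveUsers[callerKey] = true;
        item.LastUsed = DateTimeOffset.UtcNow;
        return (item.Producer, item);
    }

    // The caller checks back in when done.  If expiration already evicted the item and this
    // was the last active user, the caller closes the producer on the way out.
    public async Task CheckInAsync(string partitionId, Guid callerKey, PoolItem item)
    {
        item.ActiveUsers.TryRemove(callerKey, out _);
        item.LastUsed = DateTimeOffset.UtcNow;

        var stillPooled = _pool.TryGetValue(partitionId, out var current)
            && ReferenceEquals(current, item);

        if (!stillPooled && item.ActiveUsers.IsEmpty)
        {
            await item.Producer.CloseAsync().ConfigureAwait(false);
        }
    }

    // Expiration evicts idle items.  Items with no active users are closed immediately;
    // otherwise, closing is left to the last user checking in.  If that user faults and
    // never checks in, the producer lives until the AMQP link's idle timeout cleans it up.
    public async Task ExpireIdleAsync(TimeSpan maxIdle)
    {
        foreach (var (partitionId, item) in _pool)
        {
            if ((DateTimeOffset.UtcNow - item.LastUsed >= maxIdle)
                && _pool.TryRemove(partitionId, out var removed)
                && removed.ActiveUsers.IsEmpty)
            {
                await removed.Producer.CloseAsync().ConfigureAwait(false);
            }
        }
    }

    public class PoolItem
    {
        public PoolItem(TransportProducer producer) => Producer = producer;
        public TransportProducer Producer { get; }
        public ConcurrentDictionary<Guid, bool> ActiveUsers { get; } = new();
        public DateTimeOffset LastUsed { get; set; } = DateTimeOffset.UtcNow;
    }
}
```

The ExpireIdleAsync path is where the trade-off above shows up: an item evicted while still in use depends on the last active user checking in to close it.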