caffeine: Refresh as a non-blocking write
Currently refreshAfterWrite
and LoadingCache.refresh(key)
are performed as blocking writes (computations). This was a simple to implement approach that does not obstruct reads, avoids duplicate in-flight calls, and postpones write expiration. However it has a few drawbacks that a fully non-blocking version would not,
- Blocks other writers to that entry (explicit or eviction).
- May also block independent writes due to
ConcurrentHashMap
locking on thebin
level
- May also block independent writes due to
- Requires an explicit executor task, hinting that the reload should use a same-thread executor
- May waste a thread if
asyncReload(key, executor)
does not use the provided executor
- May waste a thread if
A fully asynchronous refresh must handle conflicts and notify the removal listener appropriately. A conflict occurs when the refresh completes and the mapping has changed (different or absent value). This is detectable as a reload
computes a new value based on the key and old value. The refresh cannot know whether the change is benign or whether it would insert stale data.
In Guava the refresh clobbers or inserts the entry, but due to races dropping is the only safe default choice. That default is potentially wasteful, so the user should have the option to resolve the conflict instead. This can be done by adding a new default method to AsyncCacheLoader
,
// Placeholder name, suggestions welcome!
default boolean retainOnRefreshConflict(K key, V currentValue, V oldValue, V newValue) {
return false;
}
The removal listener must be notified to avoid resource leaks. In Guava the removal cause is replaced
, but that is intended as a user actions rather than an automatic one. As we need to signal when a reload replaced or dropped the value, to avoid confusion a refreshed
should be added to the enum.
The post-processing of a reload will be implemented by attaching a whenComplete
to the future returned by asyncReload
. This avoids the unnecessary thread case described above for AsyncLoadingCache
. It is worth noting that cancelling the refreshing future if the entry is removed was rejected. This is because unlike Future
, in CompletableFuture
the task is not interrupted to hint that it may stop and the removal listener would still need to be notified if the value materialized.
The final detail is whether an in-flight refresh should postpone expiration. This is currently the case for expireAfterWrite
(but not expireAfterAccess
) due to reusing the entry’s writeTime
to avoid an extra field to indicate that a refresh was triggered. A stricter expireAfterWrite
would require an additional per-entry field and is more prone to a redundant load. For now we’ll keep it as is, but be willing to change if requested.
The above is from brainstorming with Etienne Houle @ Stingray (@ehoule).
@abatkin, @yrfselrahc, @lowasser might have an opinions on this design
About this issue
- Original URL
- State: closed
- Created 8 years ago
- Reactions: 1
- Comments: 31 (18 by maintainers)
Commits related to this issue
- Refresh as a non-blocking write (fixes #56, fixes #69) Previously refresh was implemented as a computation performed on the executor. Like all writes it allowed concurrent reads, but blocked other wr... — committed to ben-manes/caffeine by ben-manes 8 years ago
- Refresh as a non-blocking write (fixes #56, fixes #69) Previously refresh was implemented as a computation performed on the executor. Like all writes it allowed concurrent reads, but blocked other wr... — committed to ben-manes/caffeine by ben-manes 8 years ago
- Refresh as a non-blocking write (fixes #56, fixes #69) Previously refresh was implemented as a computation performed on the executor. Like all writes it allowed concurrent reads, but blocked other wr... — committed to ben-manes/caffeine by ben-manes 8 years ago
- Refresh as a non-blocking write (fixes #56, fixes #69) Previously refresh was implemented as a computation performed on the executor. Like all writes it allowed concurrent reads, but blocked other wr... — committed to ben-manes/caffeine by ben-manes 8 years ago
- Refresh as a non-blocking write (fixes #56, fixes #69) Previously refresh was implemented as a computation performed on the executor. Like all writes it allowed concurrent reads, but blocked other wr... — committed to ben-manes/caffeine by ben-manes 8 years ago
It occurred to me that this same race occurs on a
loadAll
for a synchronous cache. That is because there is an inherent race becausesynchronized
objects cannot be locked / unlocked in aggregate. This is not the case for an asynchronous cache where we insert incomplete futures, bulk load, and then populate those futures.LoadingCache#getAll(keys)
is documented as allowing implementors to race with the behavior of,That is slightly too narrowly worded because the last load will clobber an existing entry for any reason, including an explicit insert. This allows clobbering to insert stale data.
It seems better to generalize this enhancement to resolve conflicts for any type of load. That means the method would be called
retainOnConflict
, the addition removal causeconflict
would be added, andgetAll
documentation updated, andgetAll
updated to resolve conflicts. The default behavior would be pessimistic by dropping the newly loaded value.