caffeine: Unintuitive behavior (bug?) with `put()`
Upon migrating from Guava cache to Caffeine in bazel we are seeing some flakiness in tests that heavily exercise weak/soft-valued caches. It appears that the behavior of put()
for an already present key in Caffeine diverges from Guava cache (and ConcurrentHashMap
) in that a concurrent thread calling getIfPresent
can get null
, as opposed to either the old value or the new value.
Note that this strange behavior occurs even if the put
call does not change the existing value. I believe it’s the result of non-atomically clearing the existing value reference before setting it.
The following reproduces the issue within a few iterations:
Cache<String, String> cache = Caffeine.newBuilder().weakValues().build();
String key = "key";
String val = "val";
cache.put("key", "val");
AtomicReference<Exception> e = new AtomicReference<>();
List<Thread> threads = new ArrayList<>();
for (int i = 0; i < 10; i++) {
int name = i;
Thread t =
new Thread(
() -> {
for (int j = 0; j < 1000; j++) {
if (Math.random() < .5) {
cache.put(key, val);
} else if (cache.getIfPresent(key) == null) {
e.compareAndSet(
null,
new IllegalStateException(
"Thread " + name + " observed null on iteration " + j));
break;
}
}
});
threads.add(t);
t.start();
}
for (Thread t : threads) {
t.join();
}
if (e.get() != null) {
throw e.get();
}
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 19 (19 by maintainers)
Commits related to this issue
- Fix racy read of null weak/soft old value when writing a new value (fixes #568) — committed to ben-manes/caffeine by ben-manes 3 years ago
- Improve robustness in racy scenarios (fixes #568) 1. When an entry is updated then a concurrent reader should either observe the old or new value. This operation replaces the j.l.Reference instance s... — committed to ben-manes/caffeine by ben-manes 3 years ago
- Improve robustness in racy scenarios (fixes #568) 1. When an entry is updated then a concurrent reader should observe either the old or new value. This operation replaces the j.l.Reference instance s... — committed to ben-manes/caffeine by ben-manes 3 years ago
- Improve robustness in racy scenarios (fixes #568) 1. When an entry is updated then a concurrent reader should observe either the old or new value. This operation replaces the j.l.Reference instance s... — committed to ben-manes/caffeine by ben-manes 3 years ago
- Improve robustness in racy scenarios (fixes #568) 1. When an entry is updated then a concurrent reader should observe either the old or new value. This operation replaces the j.l.Reference instance s... — committed to ben-manes/caffeine by ben-manes 3 years ago
- Improve robustness in racy scenarios (fixes #568) 1. When an entry is updated then a concurrent reader should observe either the old or new value. This operation replaces the j.l.Reference instance s... — committed to ben-manes/caffeine by ben-manes 3 years ago
- Improve robustness in racy scenarios (fixes #568) 1. When an entry is updated then a concurrent reader should observe either the old or new value. This operation replaces the j.l.Reference instance s... — committed to ben-manes/caffeine by ben-manes 3 years ago
- Improve robustness in racy scenarios (fixes #568) 1. When an entry is updated then a concurrent reader should observe either the old or new value. This operation replaces the j.l.Reference instance s... — committed to ben-manes/caffeine by ben-manes 3 years ago
- Update //third_party/caffeine to v2.9.2 Contains critical fixes for ben-manes/caffeine#568. Closes #13651. Signed-off-by: Philipp Wollermann <philwo@google.com> — committed to bazelbuild/bazel by justinhorvitz 3 years ago
I think that the issue as originally described is not actually affecting our tests, I just discovered it by accident. Instead, my understanding is that we’re encountering a race between cleanup of an evicted value and a call to
put
to update that value.put
updates the node’s value reference to a live value.evictEntry
proceeds to obtain a lock on the node, but does not check whether the node was updated with a new value. Thus we proceed to clean up a live entry.With that understanding, I modified
evictEntry
with the following check and the issue disappeared, even without any of the other changes we discussed:Does this explanation and solution make sense to you?
I still think we should fix the originally reported issue as well.