RoaringBitmap: Invalid cache reset in FastRankRoaringBitmap
Describe the bug
Method .select(rec) might return confusing results when .rank(rec) is called before first element is added to the FastRankRoaringBitmap.
The problem is that first call to rank will create highToCumulatedCardinality with zero elements in method org.roaringbitmap.FastRankRoaringBitmap#preComputeCardinalities. But then any write call to the FastRankRoaringBitmap that sould normally purge the cache will not pass this condition:
private void resetCache() {
// Reset the cache on any write operation
if (highToCumulatedCardinality != null && highToCumulatedCardinality.length >= 1) {
// We tag the first bucket to indicate the cache is dismissed
cumulatedCardinalitiesCacheIsValid = false;
}
}
The fix is to remove highToCumulatedCardinality.length >= 1 but I fear this has some reasoning I don’t see.
To Reproduce
Test case:
@Test
void testCacheIsNotReset() {
FastRankRoaringBitmap rankRoaringBitmap = new FastRankRoaringBitmap();
assertEquals(0, rankRoaringBitmap.rank(3));
rankRoaringBitmap.add(3);
rankRoaringBitmap.add(5);
assertEquals(3, rankRoaringBitmap.select(0));
assertEquals(5, rankRoaringBitmap.select(1));
}
RoaringBitmap version: 0.9.10
Java version: JDK11
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 16 (14 by maintainers)
Commits related to this issue
- Fix issues #498 #499 — committed to blacelle/RoaringBitmap by blacelle 3 years ago
- Fix issues #498 #499 (#500) * Fix issues #498 #499 * Fix style * Re-use variable — committed to RoaringBitmap/RoaringBitmap by blacelle 3 years ago
I could make a PR but I’d like to know first whether the
highToCumulatedCardinality.length >= 1doesn’t have some reason which I don’t see. The fix seems really straightforward.@lemire I don’t think that the
FastRankRoaringBitmapwas meant to be immutable since there is clear logic in write methods (add / remove) that resets the cache. And yes - I chose this implementation because in my use-case I’m going to callrankrepeatedly many times, so I wanted to use most performant implementation available.The workaround for the problem is also simple - I just don’t call rank on empty bitmap. This is the method that I use that mimics Java Arrays.binarySearch() upon RoaringBitmap implementation: