RoaringBitmap: Invalid cache reset in FastRankRoaringBitmap

Describe the bug

Method .select(rec) might return confusing results when .rank(rec) is called before first element is added to the FastRankRoaringBitmap.

The problem is that first call to rank will create highToCumulatedCardinality with zero elements in method org.roaringbitmap.FastRankRoaringBitmap#preComputeCardinalities. But then any write call to the FastRankRoaringBitmap that sould normally purge the cache will not pass this condition:

  private void resetCache() {
    // Reset the cache on any write operation
    if (highToCumulatedCardinality != null && highToCumulatedCardinality.length >= 1) {
      // We tag the first bucket to indicate the cache is dismissed
      cumulatedCardinalitiesCacheIsValid = false;
    }
  }

The fix is to remove highToCumulatedCardinality.length >= 1 but I fear this has some reasoning I don’t see.

To Reproduce

Test case:

@Test
void testCacheIsNotReset() {
	FastRankRoaringBitmap rankRoaringBitmap = new FastRankRoaringBitmap();
	assertEquals(0, rankRoaringBitmap.rank(3));
	rankRoaringBitmap.add(3);
	rankRoaringBitmap.add(5);
	assertEquals(3, rankRoaringBitmap.select(0));
	assertEquals(5, rankRoaringBitmap.select(1));
}

RoaringBitmap version: 0.9.10

Java version: JDK11

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 16 (14 by maintainers)

Commits related to this issue

Most upvoted comments

I could make a PR but I’d like to know first whether the highToCumulatedCardinality.length >= 1 doesn’t have some reason which I don’t see. The fix seems really straightforward.

@lemire I don’t think that the FastRankRoaringBitmap was meant to be immutable since there is clear logic in write methods (add / remove) that resets the cache. And yes - I chose this implementation because in my use-case I’m going to call rank repeatedly many times, so I wanted to use most performant implementation available.

The workaround for the problem is also simple - I just don’t call rank on empty bitmap. This is the method that I use that mimics Java Arrays.binarySearch() upon RoaringBitmap implementation:

/**
 *
 * @param roaringBitmap
 * @param recordId
 * @return
 */
static int indexOf(ImmutableBitmapDataProvider roaringBitmap, int recordId) {
	// this row prevents https://github.com/RoaringBitmap/RoaringBitmap/issues/498 ocurrence
	if (roaringBitmap.isEmpty()) {
		return -1;
	}
	final int rank = roaringBitmap.rank(recordId);
	final int index = rank - 1;
	final int nextRecordId = index >= 0 ? roaringBitmap.select(index) : -1;
	return nextRecordId == recordId ? index : -1 * (rank + 1);
}