theine-go: Eviction not working as expected

First of all, awesome work! Very promising.

That being said, I may have found a bug in the following situation:

Steps to reproduce

    package main

    import (
        "fmt"
        "time"

        "github.com/Yiling-J/theine-go"
    )

    func main() {
        var cacheSize int64 = 1000
        client, err := theine.NewBuilder[string, string](cacheSize).RemovalListener(func(key string, value string, reason theine.RemoveReason) {
            fmt.Println("removal", key, value, reason)
        }).Build()
        if err != nil {
            panic(err)
        }
        client.Set("test_key", "test_value", 500)
        fmt.Println(client.Get("test_key"))
        client.Set("test_2_key", "test_2_value", 500)
        fmt.Println(client.Get("test_2_key"))
        client.Set("test_3_key", "test_3_value", 600)
        fmt.Println(client.Get("test_3_key"))
        // wait for the asynchronous eviction to run
        time.Sleep(time.Second)
        fmt.Println(client.Get("test_3_key"))
    }

Expected

test_value true
test_2_value true
test_3_value true
removal test_key test_value 1
removal test_2_key test_2_value 1
test_3_value true

Actually returned

test_value true
test_2_value true
test_3_value true
removal test_3_key test_3_value 1
 false

This particular behavior seems to happen only when the sum of the keys’ costs equals cacheSize.

Most upvoted comments

I have an old spreadsheet of data from when I was originally investigating eviction policies. It can serve as a baseline to check that nothing seems amiss. I took W-TinyLFU (1%), which is non-adaptive.

Hit rate (%) by cache size:

Policy          100,000  200,000  300,000  400,000  500,000  600,000  700,000  800,000
Caffeine_new      11.85    22.98    33.01    42.51    50.91    58.71    64.82    70.18
Caffeine_old      11.72    23.11    33.19    43.00    51.46    59.10    65.39    70.67
W-TinyLfu         12.14    23.26    33.51    42.79    51.12    58.92    65.05    70.22

Against this baseline the numbers are within about 0.5%, slightly worse on average. I think that is close enough to be noise, so I didn’t worry about it. I would say there is not a strong argument for or against the block-based filter, so I decided to keep it for my own enjoyment and to share a neat optimization trick.

For a weighted cache I adjust the sketch’s size at runtime. That hasn’t been a problem, since the query is cheap and usually a no-op. It also keeps someone who creates an effectively unbounded cache from over-allocating, e.g. if the cache is embedded and all the user has is a config value to set. It does cause a little skew in simulations, so initialCapacity will pre-allocate the sketch for non-weighted traces.
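
A rough Go sketch of the idea, with invented names (this is not theine’s or Caffeine’s actual API): before recording accesses, a cheap capacity check grows the sketch if the cache’s maximum has outgrown it. The power-of-two sizing and the 10x sample factor are illustrative assumptions, not the libraries’ exact constants.

    type sketch struct {
        table      []uint64
        sampleSize int
    }

    // ensureCapacity is called with the cache's current maximum. It is
    // usually a no-op, so a weighted cache can start small and grow the
    // sketch lazily instead of pre-allocating for a bound that may never
    // be reached.
    func (s *sketch) ensureCapacity(maximum int) {
        size := nextPowerOfTwo(maximum)
        if len(s.table) >= size {
            return // already large enough: the common, cheap path
        }
        s.table = make([]uint64, size) // counters reset; history is re-learned
        s.sampleSize = 10 * size
    }

    func nextPowerOfTwo(n int) int {
        p := 1
        for p < n {
            p <<= 1
        }
        return p
    }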

By the way, I rewrote the sketch to be optimized for the CPU’s cache line. It wasn’t necessary and was just for fun.
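
For the curious, here is a toy Go version of that trick under my own assumptions about the layout (it is not the actual theine or Caffeine source): the counter table is viewed as 64-byte blocks of eight uint64 words, and all four 4-bit counters for a key are placed inside one block, so a frequency estimate touches a single cache line instead of four scattered ones.

    type blockSketch struct {
        table []uint64 // length is a multiple of 8; each run of 8 words is one 64-byte block
    }

    // frequency returns the count-min estimate: the minimum of the key's
    // four counters, all of which live in the same block.
    func (s *blockSketch) frequency(hash uint64) uint64 {
        numBlocks := uint64(len(s.table) / 8)
        base := (hash % numBlocks) * 8 // first word of the key's block
        min := uint64(15)              // counters saturate at 4 bits
        for i := uint64(0); i < 4; i++ {
            h := hash >> (16 + 8*i)       // different hash bits per counter
            word := base + (h & 7)        // one of the 8 words in the block
            offset := ((h >> 3) & 15) * 4 // one of 16 4-bit counters in the word
            count := (s.table[word] >> offset) & 0xF
            if count < min {
                min = count
            }
        }
        return min
    }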

Favoring recency can be a disadvantage in workloads that perform scans, such as database, search, or analytical queries. Restricting admission with > rather than allowing it with >= lets the cache be scan resistant in a loopy access pattern. A modestly sized cache will have enough space for an admission window that can capture recent arrivals (delaying the filtering), and the adaptive behavior lets the cache tune itself dynamically. We tend to favor optimizing for longer-running, modestly sized caches instead of tiny caches or short sequences, because those are usually easy to fix, less critical, or a strawman that doesn’t represent the application.
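
A minimal illustration of that strict comparison (the names here are mine, not the library’s API):

    // admit decides whether an arriving candidate may replace the eviction
    // victim; frequency is any estimator, e.g. a count-min sketch.
    func admit(frequency func(key uint64) uint64, candidate, victim uint64) bool {
        // With ">=" a long scan of equally cold keys would keep winning ties
        // and churn the working set; ">" makes one-pass scans bounce off.
        return frequency(candidate) > frequency(victim)
    }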

Caffeine’s policy choices were data driven, by simulating a diverse set of workloads. When it underperformed other eviction policies, we analyzed the cause, experimented, and found solutions. Currently the policy is either the leader or competitive across all of these workloads, but more are very welcome. A problem we run into is that research papers will sometimes claim a substantially better hit rate, but the authors cannot provide the data to let us reproduce and understand what they observed. That makes it difficult to justify improvement ideas, as it is probably more harmful to make well-intentioned but blind changes.

If you could provide traces of your target workloads then we can see the actual performance and how that behavior compares to optimal and other leading policies.