coredns: plugin/cache: Optional Cache with Non-Blocking Adds, Admission Policy, and LFU Eviction

What would you like to be added

  • Option to use a cache with the following features:
    • Non-blocking adds to the cache (not making the requester of the non-cached item wait for the write lock to be acquired)
    • Key collision protection (ensure that a key collision will not return an answer to the wrong question)
    • Admission policy (ensure that a single request for an item does not push other items out of the cache)
    • LFU eviction instead of random (keep frequently and recently requested items in the cache)
    • A TTL eviction option (ristretto ttlEvict) that evicts expired items from the cache when they expire (or when the max serve-stale duration is reached, if enabled)
  • Future
    • Eventually: make the new cache the default option if it is an improvement for most

Why is this needed

  • Possibly related issues
    • #3388 - Restart due to OOM
  • The internal cache hashes keys using FNV but saves nothing in the cache that would allow detection of a collision on reads
    • It appears to me that the cache can return an Answer for a non-matching Question
    • The Question is not stored in the cache at all and the key is a uint64 FNV hash, so it appears to be impossible to detect that asking the cache for microsoft.com is returning an answer that was retrieved for someevilhost.com (see the collision sketch after this list)
    • Returning Answers to the wrong Question is a rather serious flaw to allow in a DNS server, but perhaps there are mitigating circumstances that I missed
  • Eviction in the internal cache is random and all new items are admitted
    • This is noted in the docs and is not unexpected, but it is not desired
    • Eviction of heavily accessed items should be avoided (LFU should be used instead of random eviction)
    • Admission to the cache should not be automatic when the cache is full
      • Only items that are requested with some frequency should be considered for admission into the cache
      • Implementation of this is non-trivial
  • Writing to the internal cache makes the caller wait for the acquisition of the write lock
    • I may be mistaken, but it appears that neither .set nor .Add is called in a new goroutine
    • The caller must wait for the write lock to be obtained before these functions return
    • This effectively serializes all writes within a shard - this is primarily a problem when there are high numbers of unique records to be cached
    • This may be contributing to the large numbers of concurrent requests and OOM conditions being reported by some
    • Acquiring the write lock to write a single item, then releasing it, is expensive, particularly if other items are already waiting to be added to the same shard
      • Other cache libraries return to the caller immediately and obtain the write lock to write a batch of records at once (see the batched-write sketch after the benchmark output below)
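
To make the collision concern concrete, here is a minimal sketch of the verification the builtin cache lacks (my own illustration, not CoreDNS code): the Question is stored next to the cached answer so that a read can detect an FNV key collision and treat it as a miss instead of answering the wrong Question.

package main

import (
  "fmt"
  "hash/fnv"
  "sync"
)

// entry keeps the Question fields alongside the answer so reads can verify them.
type entry struct {
  qname  string
  qtype  uint16
  answer string // stand-in for the cached DNS message
}

type verifiedCache struct {
  mu    sync.RWMutex
  items map[uint64]entry
}

// key hashes the Question name and type with FNV, as the builtin cache does.
func key(qname string, qtype uint16) uint64 {
  h := fnv.New64()
  h.Write([]byte(qname))
  h.Write([]byte{byte(qtype >> 8), byte(qtype)})
  return h.Sum64()
}

func (c *verifiedCache) get(qname string, qtype uint16) (string, bool) {
  c.mu.RLock()
  defer c.mu.RUnlock()
  e, ok := c.items[key(qname, qtype)]
  if !ok {
    return "", false
  }
  // The check the builtin cache is missing: on a 64-bit hash collision the
  // stored Question will not match, and the lookup is treated as a miss.
  if e.qname != qname || e.qtype != qtype {
    return "", false
  }
  return e.answer, true
}

func main() {
  c := &verifiedCache{items: map[uint64]entry{}}
  c.items[key("example.com.", 1)] = entry{qname: "example.com.", qtype: 1, answer: "A 93.184.216.34"}
  fmt.Println(c.get("example.com.", 1))
}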

Benchmarks

cd plugin/cache && go test -run='' -bench=. -benchmem ./... 2>/dev/null

Baseline - Before any changes

Using additional benchmarks added to master: https://github.com/huntharo/coredns/tree/cache-benchmark

BenchmarkBuiltinReads-8                           712930              1547 ns/op             566 B/op         14 allocs/op
BenchmarkBuiltinParallelReads-8                  2479280               487 ns/op             566 B/op         14 allocs/op
BenchmarkBuiltinBalanced-8                        262920              4256 ns/op             815 B/op         19 allocs/op
BenchmarkBuiltinInserts-8                         323592              3568 ns/op             706 B/op         16 allocs/op
BenchmarkBuiltinParallelInserts-8                 131554              7640 ns/op            2472 B/op         58 allocs/op
BenchmarkBuiltinParallelInsertsRead-8             193125             29294 ns/op            1255 B/op         30 allocs/op

Branch

  • Balanced Reads/Writes
    • (NumCPUs - 2) goroutines
    • Demonstrates the serializing nature of the write lock on the existing cache
    • With even a 2% write-to-read ratio, the Ristretto cache outperforms the Builtin cache
    • The test is configured with a 20% write-to-read ratio by default
    • Builtin: 3,567 ns/op
    • Ristretto: 2,713 ns/op
  • Simple Reads
    • 1 goroutine
    • Demonstrates the trade-off of keeping some statistics regarding item usage
    • When 100% of operations are reads, the builtin cache is faster
    • The Balanced and “Reads During Inserts” tests give a better example of performance under mixed read/write conditions
    • Baseline: 1,784 ns/op
      • This is the same test with the Builtin cache on the master branch
      • Not entirely clear why the Builtin cache got faster with the changes
    • Builtin: 1,590 ns/op
    • Ristretto: 1,864 ns/op
  • Parallel Reads
    • (NumCPUs - 2) goroutines
    • Demonstrates the trade-off of keeping some statistics regarding item usage
    • When 100% of operations are reads, the builtin cache is faster
    • The Balanced and “Reads During Inserts” tests give a better example of performance under mixed read/write conditions
    • Builtin: 521 ns/op
    • Ristretto: 605 ns/op
  • Inserts
    • 1 goroutine
    • Demonstrates speed improvement due to batching of writes and the caller not waiting for the write lock to be acquired during an insert operation
    • Builtin: 3,715 ns/op
    • Ristretto: 2,585 ns/op
  • Parallel Inserts
    • (NumCPUs - 2) goroutines
    • Demonstrates decreased lock contention between parallel threads and requests on insert operations
    • Builtin: 7,684 ns/op
    • Ristretto: 3,829 ns/op
  • Parallel Reads During Inserts
    • (NumCPUs - 2) goroutines
    • Demonstrates the advantages of the cache admission policy
    • The read items take 5 seconds to repopulate if they are evicted
    • The Ristretto cache keeps the expensive / frequently requested items in the cache
    • The builtin cache evicts the expensive / frequently requested items
    • Builtin: 26,190 ns/op
    • Ristretto: 3,446 ns/op
go test -run='' -bench=. -benchmem ./... 2>/dev/null
goos: darwin
goarch: amd64
pkg: github.com/coredns/coredns/plugin/cache
BenchmarkBuiltinReads-8                           647811              1590 ns/op             598 B/op         15 allocs/op
BenchmarkRistrettoReads-8                         639974              1864 ns/op             616 B/op         15 allocs/op
BenchmarkBuiltinParallelReads-8                  2382621               521 ns/op             598 B/op         15 allocs/op
BenchmarkRistrettoParallelReads-8                2029929               605 ns/op             611 B/op         15 allocs/op
BenchmarkBuiltinBalanced-8                        285925              3567 ns/op             835 B/op         20 allocs/op
BenchmarkRistrettoBalanced-8                      387178              2713 ns/op             862 B/op         20 allocs/op
BenchmarkBuiltinInserts-8                         312468              3715 ns/op             743 B/op         17 allocs/op
BenchmarkRistrettoInserts-8                       459302              2585 ns/op             661 B/op         15 allocs/op
BenchmarkBuiltinParallelInserts-8                 130486              7684 ns/op            2543 B/op         60 allocs/op
BenchmarkRistrettoParallelInserts-8               307554              3829 ns/op            2242 B/op         53 allocs/op
BenchmarkBuiltinParallelInsertsRead-8             216180             26190 ns/op            1315 B/op         32 allocs/op
BenchmarkRistrettoParallelInsertsRead-8           336003              3446 ns/op            1284 B/op         30 allocs/op
PASS
ok      github.com/coredns/coredns/plugin/cache 49.860s
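
The Inserts and Parallel Inserts results above reflect the batched-write pattern referenced in the write-lock discussion. A minimal sketch of that pattern (my own illustration, not ristretto's actual implementation): adds go onto a buffered channel so the caller returns immediately, and a single background goroutine applies whole batches under one write-lock acquisition.

package main

import "sync"

type item struct {
  key uint64
  val interface{}
}

type batchedCache struct {
  mu    sync.RWMutex
  items map[uint64]interface{}
  adds  chan item // buffered so Add never touches the write lock
}

func newBatchedCache() *batchedCache {
  c := &batchedCache{
    items: make(map[uint64]interface{}),
    adds:  make(chan item, 1024),
  }
  go c.writer()
  return c
}

// Add enqueues the item and returns immediately; if the buffer is full the
// item is dropped rather than blocking the request path.
func (c *batchedCache) Add(key uint64, val interface{}) {
  select {
  case c.adds <- item{key, val}:
  default:
  }
}

// Get is an ordinary read-locked lookup.
func (c *batchedCache) Get(key uint64) (interface{}, bool) {
  c.mu.RLock()
  defer c.mu.RUnlock()
  v, ok := c.items[key]
  return v, ok
}

// writer blocks until at least one add is queued, then drains everything
// queued and applies the whole batch under a single write-lock acquisition.
func (c *batchedCache) writer() {
  for it := range c.adds {
    c.mu.Lock()
    c.items[it.key] = it.val
    for done := false; !done; {
      select {
      case more := <-c.adds:
        c.items[more.key] = more.val
      default:
        done = true
      }
    }
    c.mu.Unlock()
  }
}

func main() {
  c := newBatchedCache()
  c.Add(42, "cached answer")
  c.Get(42) // may miss briefly until the writer applies the batch
}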

Test Case - Eviction of Frequently Requested Items

  • This test case shows that requesting always-unique names will push frequently requested names out of the builtin cache (a simplified sketch of the admission decision behind this follows the test scripts below)
  • Let the slow loop run and populate the names first
  • Start the fast loop and see that the names for the slow loop are evicted (slow requests start taking 1,900 ms again)
  • Add the ristretto parameter to the cache {} block in corefile.local
    • The cache will then actually be able to add the slow lookups into the cache while the fast loop is running
    • It may take 2-3 loops, but you will see some of the slow items start responding very quickly, and eventually all of them will
coredns -conf corefile.local &
coredns -conf corefile.erratic &

./test-slow.sh

# Wait until the above slow names are populated (the loop will speed up)
# In another terminal:
./test-fast.sh

corefile.local

  • Listens on 1053
  • Forwards *.slow. requests to another instance
  • Replies locally to all other requests via erratic with a 5 ms delay
.:1053 {
  errors
  log

  # Forward *.slow requests to another instance of CoreDNS configured to respond
  # slowly with NOERROR
  forward slow. 127.0.0.1:1054

  cache {
    success 1000 10000 10000
    denial 1000 10000 10000
  }

  # Only requests that are not cached and not forwarded will get here
  erratic {
    delay 1 5ms
  }

  reload
}

corefile.erratic

  • Listens on 1054
  • Responds to any request after 1,900 ms with NOERROR and an Answer
.:1054 {
  errors
  erratic {
    delay 1 1900ms
  }
  reload
}

test-fast.sh

#!/usr/bin/env bash

set -euo pipefail

i=1

while true; do
  # Run 8 digs in parallel then wait for them
  dig $((i)).foo.com. -p 1053 @127.0.0.1 &
  dig $((i+1)).foo.com. -p 1053 @127.0.0.1 &
  dig $((i+2)).foo.com. -p 1053 @127.0.0.1 &
  dig $((i+3)).foo.com. -p 1053 @127.0.0.1 &
  dig $((i+4)).foo.com. -p 1053 @127.0.0.1 &
  dig $((i+5)).foo.com. -p 1053 @127.0.0.1 &
  dig $((i+6)).foo.com. -p 1053 @127.0.0.1 &
  dig $((i+7)).foo.com. -p 1053 @127.0.0.1 &
  wait
  i=$((i+8))
done

test-slow.sh

#!/usr/bin/env bash

set -euo pipefail

while true; do
  for i in {1..10}; do
    dig $i.foo.slow. -p 1053 @127.0.0.1
  done
done
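
For reference, the eviction behavior this test demonstrates comes down to a frequency-based admission decision. Below is a deliberately simplified, TinyLFU-style sketch of the idea (ristretto's real implementation uses a count-min sketch and sampled LFU; the names here are mine): a one-off unique name cannot displace a name that is requested continuously.

package main

import "fmt"

// admitter tracks request frequency per key; TinyLFU keeps these counts in a
// compact count-min sketch rather than a plain map.
type admitter struct {
  freq map[uint64]int
}

// admit records a request for the incoming key and only lets it displace the
// eviction victim if the incoming key has been requested more often.
func (a *admitter) admit(incoming, victim uint64) bool {
  a.freq[incoming]++
  return a.freq[incoming] > a.freq[victim]
}

func main() {
  a := &admitter{freq: map[uint64]int{}}
  slow := uint64(1) // stands in for "1.foo.slow.", requested over and over
  for i := 0; i < 10; i++ {
    a.admit(slow, 0) // warm up the frequency count for the slow name
  }
  // A unique name from the fast loop, seen once, is not admitted over the
  // frequently requested slow name: prints false.
  fmt.Println(a.admit(99, slow))
}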

Most upvoted comments

The simple solution here is to make it an external plugin, and let it get some soak time. From there we can decide whether to make it built-in as an alternative or replace the current plugin.

On Wed, Mar 4, 2020 at 3:04 AM Miek Gieben notifications@github.com wrote:

I fully understand caddy and the bits we use. dnstap is simple in itself and contained to a single plugin.

For something as important as caching, which is used by other plugins as well, I need to fully understand it. If it is so big that it must be an external package, that is basically a non-starter.

Caches that I’ve looked at before settling on doing it myself mostly didn’t perform or had other issues.

Hi @miekg -

If we agree to replace the cache plugin we should just do it. What I see in #3696 is that it just defaults to using some external package, which is an approach I don’t like: either we see/vet/review the entire cache code or we don’t merge.

I see other external packages that could change / go away / have bugs and whose code is not copied into the project, such as:

  • github.com/caddyserver/caddy
  • github.com/dnstap/golang-dnstap

My approach to de-risking the upstream dependency was to keep the internal cache package, so that the ristretto cache option can simply be no-op’d out if a problem is discovered with the cache or the project disappears.

I also don’t believe in “everything is shit, let’s replace”; too often the original PR author just quits and we’re left with code no one understands.

I don’t agree with the approach of throwing the existing solution out either. I like to ask people to fix the problem that they face and then determine if they need a more dramatic solution. My other issues/PRs fix problems with caching in general (without replacing the storage) and make inserts faster when using the builtin cache. I’m not advocating for using only a new cache backend; rather, I am advocating for making all forms of caching better, even if that makes the new cache backend less of an improvement than desired.

In fact, that is why the new cache is a config option. The idea is to let people use it if their use case benefits from it while not making coredns dependent on the existence and correctness of the ristretto cache project.

I rather like the idea of leaving the internal cache implementation in place, as it provides a clean fallback option if ever needed AND it is faster in read-only scenarios. Additionally, I want the builtin cache to remain the default for the foreseeable future.

As such: this needs to be spoon-fed in easy-to-understand PRs and include the complete cache implementation

I currently see a way to split it into 3 PRs:

  1. adding performance benchmarks and an option to cover multiple storage backends
    1. covered here: https://github.com/huntharo/coredns/tree/cache-benchmark
    2. I can submit a PR to add these tests now if desired
  2. adding the module interface to allow having different storage backends (a rough sketch of what this could look like follows this list)
  3. the introduction of the new storage backend
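
For item 2, a rough sketch of what a pluggable storage interface could look like (the names and signatures are hypothetical, not the actual proposed API):

package cache

// Store abstracts the cache storage so that the builtin sharded map and a
// ristretto-backed implementation can be swapped behind one interface.
type Store interface {
  // Add inserts or replaces an item; implementations are free to admit,
  // drop, or defer the write (enabling non-blocking adds).
  Add(key uint64, value interface{})
  // Get returns the item and whether it was present and valid.
  Get(key uint64) (interface{}, bool)
  // Len reports the number of cached items, for metrics.
  Len() int
}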

I’m trying to keep all of my other changes distinct from this PR so as not to complicate the review of adding this new cache storage.

It looks like there are several issues with the current cache described here. I can see the motivation.

If this cache library is generally better, I would prefer to simply replace the existing cache implementation rather than add a bunch of optional plugins.

But I would like to see some better benchmarks and validation to demonstrate the improvement.