zephyr: settings: using the settings subsystem as a key-value store/retrieve system is slow

Is your enhancement proposal related to a problem? Please describe.
I’m working on a project that writes a few settings on each boot. It turns out this causes a performance degradation: each boot takes about 70 ms longer than the previous one:

Reboot, Boot time
0,      2340 ms
1,      2467 ms
2,      2559 ms
3,      2668 ms
4,      2651 ms
5,      2770 ms
6,      2946 ms
7,      2910 ms
8,      3034 ms
9,      3047 ms
10,     3145 ms
11,     3182 ms
12,     3243 ms
13,     3336 ms
14,     3362 ms
15,     3470 ms
16,     3542 ms
17,     3626 ms
18,     3689 ms
19,     3735 ms
20,     3789 ms

Some application characteristics in the tested scenario:

  • Due to the nature of the solution (we don’t know all possible keys in advance), the application must use the settings_save_one/settings_load_subtree_direct APIs rather than settings handlers (a minimal usage sketch follows this list).
  • Number of keys in the settings store is about 30.
  • Number of keys that the app is trying to read on boot is about 100.
  • Number of keys written on boot: 5 (small 4-byte counters).
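For context, here is a minimal usage sketch of that access pattern. It is not taken from the project: the "app/boot_count" key and the helper names are hypothetical, and the header path may differ between Zephyr versions. settings_load_subtree_direct() hands each matching entry to a caller-provided callback, which copies the value out through the supplied read callback:

#include <stdint.h>
#include <errno.h>
#include <zephyr/settings/settings.h>

/* Direct-load callback: copy a 4-byte counter into the buffer passed via 'param'. */
static int boot_count_loader(const char *key, size_t len,
			     settings_read_cb read_cb, void *cb_arg, void *param)
{
	uint32_t *out = param;

	if (len != sizeof(*out)) {
		return -EINVAL;
	}
	return (read_cb(cb_arg, out, sizeof(*out)) < 0) ? -EIO : 0;
}

void update_boot_count(void)
{
	uint32_t boot_count = 0;

	settings_subsys_init();
	/* Walks the backend and invokes the callback for entries under the subtree. */
	settings_load_subtree_direct("app/boot_count", boot_count_loader, &boot_count);

	boot_count++;
	settings_save_one("app/boot_count", &boot_count, sizeof(boot_count));
}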

I think the current implementation of settings over NVS is inefficient; it tries to leverage the existing NVS API, which doesn’t allow for fast key lookup. Let me give you some statistics from the 20th boot of the application:

Number of settings loads:          98     (avg. 32056 us each)
Number of NVS reads (key lookup):  1692   (avg. 943 us each)
Number of NVS reads (value probe): 1692   (avg. 836 us each)
Number of value reads:             1563   (avg. 74 us each)
Number of NVS ATE reads:           735614

Thus, about 100 settings loads result in 735614 ATE reads (roughly 7500 ATE reads per load). This is a consequence of the current settings_load_subtree_direct implementation (pseudocode):

settings_load_subtree_direct(subtree, callback):
	max_key_id <- nvs_read(0x8000) // about 30 in our case

	for id in [0x8001, max_key_id]
		key <- nvs_read(id)            // settings key name stored under 'id'
		value <- nvs_read(id + 0x4000) // settings value stored under 'id + 0x4000'

		if not key and not value then
			continue

		if not key or not value then
			// delete the orphaned key/value and skip the entry
			nvs_delete(id)
			nvs_delete(id + 0x4000)
			continue

		// call settings_loader which validates that 'key' matches 'subtree' and runs
		// another nvs_read(id + 0x4000) using the user's buffer for data.
		settings_loader(key, subtree, callback)

Important observations:

  • The load time linearly depends on the number of keys.
  • We read both the key and the value even if the key doesn’t match the request.
  • Keys and values are stored in the same NVS file system and keys rarely change, so key entries are likely located deep in the NVS history, far from the most recently written ATEs; since the lookup starts from the most recent entries, we may expect a lot of ATE reads during the key lookup.

Describe the solution you’d like
We’ve come up with a few ideas, and not all of them are mutually exclusive (we could implement several of them):

1. Don’t read the value if the key doesn’t match the request. This is an obvious and simple optimization: move the filtering by the requested key/subtree as early as possible. The integrity check that verifies there are no keys without values (or values without keys) could be done once during settings initialization. We may expect about a 50% load-time reduction (not enough).

2. Cache/map from setting key to NVS ID. We can maintain an array, [hash(key) % CACHE_SIZE] -> NVS ID, to quickly find the NVS ID that likely contains the requested key (see the sketch after this list). Note that it’s just a hint: if the NVS ID does not contain the desired key, we fall back to the existing loop over all possible IDs. That way we avoid implementing a complex cache solution with eviction and substantial RAM overhead; for 100 keys we need to devote just 200 B of RAM. Note also that a cache miss allows us to quickly conclude that a given key does not exist.

3. nvs_walk. Instead of the cache, we could also extend the NVS API with an nvs_walk(min_id, max_id, callback) function that walks all ATEs in a given ID range and calls a user-provided callback for each entry. The callback could read the data associated with a given ATE and compare it with the requested key. The settings_nvs backend could then use nvs_walk to find the NVS ID of a given key in a single pass.

4. Cache from NVS ID to the most recent ATE. Rather than optimizing settings, we may think about optimizing NVS itself. Currently, reading an NVS value takes, on average, time linear in the number of ATEs. We could make it constant on average by implementing an idea similar to 2, but at the NVS layer: maintain an array, [NVS ID % CACHE_SIZE] -> most recent ATE position (also sketched after this list). As before, the value stored in the cache is used only as a hint, which avoids the need to resolve cache collisions/overruns; the implementation falls back to the linear lookup if the hint is incorrect.
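To make ideas 2 and 4 more concrete, here are two minimal sketches. All names (SETTINGS_NVS_CACHE_SIZE, name_id_hint, key_hash, cache_put/cache_get and the NVS-layer equivalents) are hypothetical, and the FNV-1a hash is only an example; the real settings_nvs backend would populate its cache during the initial scan and update it on every write.

#include <stdint.h>

#define SETTINGS_NVS_CACHE_SIZE 100	/* 100 entries * 2 B = 200 B of RAM */

static uint16_t name_id_hint[SETTINGS_NVS_CACHE_SIZE];	/* 0 means "no hint" */

static uint32_t key_hash(const char *key)
{
	/* FNV-1a, purely as an example */
	uint32_t h = 2166136261u;

	while (*key != '\0') {
		h ^= (uint8_t)*key++;
		h *= 16777619u;
	}
	return h;
}

/* Remember which NVS name ID holds 'key'; collisions simply overwrite the slot. */
static void cache_put(const char *key, uint16_t name_id)
{
	name_id_hint[key_hash(key) % SETTINGS_NVS_CACHE_SIZE] = name_id;
}

/* Return a candidate NVS name ID for 'key', or 0 if there is no hint.
 * The caller still has to read the name entry and compare it with 'key';
 * on a mismatch it falls back to the existing loop over all possible IDs. */
static uint16_t cache_get(const char *key)
{
	return name_id_hint[key_hash(key) % SETTINGS_NVS_CACHE_SIZE];
}

Idea 4 follows the same pattern one layer down: the (again hypothetical) array lives inside NVS and maps an NVS ID to the flash position of its most recent ATE.

#define NVS_LOOKUP_CACHE_SIZE 64	/* illustrative size */

static uint32_t ate_addr_hint[NVS_LOOKUP_CACHE_SIZE];	/* 0 means "no hint" */

/* Called whenever NVS writes an ATE for 'id' at flash offset 'ate_addr'. */
static void nvs_cache_update(uint16_t id, uint32_t ate_addr)
{
	ate_addr_hint[id % NVS_LOOKUP_CACHE_SIZE] = ate_addr;
}

/* nvs_read() would try this position first and fall back to the normal
 * backwards ATE walk if the ATE found there does not carry 'id'. */
static uint32_t nvs_cache_lookup(uint16_t id)
{
	return ate_addr_hint[id % NVS_LOOKUP_CACHE_SIZE];
}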

Any other ideas?

Describe alternatives you’ve considered
Change the current scheme for allocating NVS IDs for setting keys to use a hash function rather than the “give me the first free ID” approach. That could yield a direct hit instead of looping over all possible IDs. The solution, however, wouldn’t help existing products with already written settings.
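As a rough illustration of that alternative (hypothetical names again, reusing the illustrative key_hash() from the sketch above, and assuming the ID layout from the pseudocode: name IDs start right after the 0x8000 counter, values live at name ID + 0x4000):

#define NAME_ID_START 0x8001
#define NAME_ID_COUNT 0x3FFF	/* keeps name IDs below the +0x4000 value offset */

/* First candidate NVS name ID for a new key, derived from the key hash instead
 * of "first free ID". On a collision the backend would probe forward (wrapping
 * around within the name ID range) until a free ID is found, and lookups would
 * start probing from the same hashed slot. */
static uint16_t first_candidate_name_id(const char *key)
{
	return NAME_ID_START + (uint16_t)(key_hash(key) % NAME_ID_COUNT);
}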

Additional context
N/A

https://github.com/zephyrproject-rtos/zephyr/issues/45591
