cosmos-sdk: CacheKV has no size constraint, which can cause a memory leak
Summary of Bug
We run an archive node whose memory usage rises constantly and never falls. It rises especially sharply when we make RPC calls to it for historical state.
We took a heap profile, looked into the code, and found a cachekv layer that holds pointers to the underlying IAVL nodes. The cachekv store keeps these pointers in a map, and nothing limits how many IAVL nodes that map can hold.
On a pruned node, IAVL nodes keep being pruned as the block height grows. On an archive node, IAVL nodes are never pruned, so as the chain grows, the IAVL nodes loaded into memory (by block sync, RPC calls, etc.) stay there permanently. They are never GCed because cachekv also holds pointers to them permanently.
Version
v0.8.2
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 19 (4 by maintainers)
Yes, the cache is cleared on Write() (https://github.com/cosmos/cosmos-sdk/blob/main/store/cachekv/store.go#L141-L153).
Normally, the CacheKV should be reset at the commit event?
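As a rough illustration of what "cleared on Write()" means, the Go sketch below models a write-back cache that flushes its buffered entries to a parent and then replaces its map. It is a simplified assumption-laden model, not the linked store.go: `toyStore`, `newToyStore`, and `CachedEntries` are hypothetical names, and the parent is a plain map rather than a real KVStore.

```go
package main

import "fmt"

// toyStore is a hypothetical write-back cache over a plain map parent.
type toyStore struct {
	parent map[string]string
	cache  map[string]string // entries accumulated since the last Write
}

func newToyStore(parent map[string]string) *toyStore {
	return &toyStore{parent: parent, cache: make(map[string]string)}
}

// Set buffers a write in the cache without touching the parent.
func (s *toyStore) Set(key, value string) { s.cache[key] = value }

// Get serves from the cache first and also caches values read from the parent.
func (s *toyStore) Get(key string) (string, bool) {
	if v, ok := s.cache[key]; ok {
		return v, true
	}
	v, ok := s.parent[key]
	if ok {
		s.cache[key] = v
	}
	return v, ok
}

// Write flushes cached entries to the parent, then drops the cache map.
func (s *toyStore) Write() {
	for k, v := range s.cache {
		s.parent[k] = v
	}
	s.cache = make(map[string]string) // old map becomes garbage-collectable
}

// CachedEntries reports the current cache size.
func (s *toyStore) CachedEntries() int { return len(s.cache) }

func main() {
	s := newToyStore(map[string]string{"a": "1"})
	s.Set("b", "2")
	s.Get("a")
	fmt.Println("before Write:", s.CachedEntries()) // prints "before Write: 2"
	s.Write()
	fmt.Println("after Write:", s.CachedEntries()) // prints "after Write: 0"
}
```

In this model the cache shrinks only when Write() runs. Whether the query path that serves historical state ever reaches such a reset is not established in this thread, so a store that is read from but never written through would keep growing.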
Thanks for opening the issue. We will look into this.
Cc @alexanderbez