k3s: Embedded etcd server does not account for exceeding database space
Describe the bug: If you run k3s long enough with the embedded etcd datastore, you will probably see this:
Flag --insecure-port has been deprecated, This flag has no effect now and will be removed in v1.24.
{"level":"warn","ts":"2021-12-19T17:34:48.509+0800","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc01a9ac000/#initially=[https://127.0.0.1:2379]","attempt":0,"error":"rpc error: code = ResourceExhausted desc = etcdserver: mvcc: database space exceeded"}
panic: etcdserver: mvcc: database space exceeded
goroutine 417 [running]:
github.com/rancher/k3s/pkg/cluster.(*Cluster).Start.func1(0xc0336094a0, 0x5959b78, 0xc000936900, 0xc00159cc80)
/go/src/github.com/rancher/k3s/pkg/cluster/cluster.go:103 +0x1e5
created by github.com/rancher/k3s/pkg/cluster.(*Cluster).Start
/go/src/github.com/rancher/k3s/pkg/cluster/cluster.go:98 +0x6bf
Steps To Reproduce: Find a k3s server whose etcd store has grown large enough to exceed the database space quota.
Expected behavior: k3s should automatically compact the etcd store and continue as usual; if it cannot, it should start an emergency etcd server that allows operators to do some rescue work.
Actual behavior: It just crashes, so I can't even run a manual compaction myself.
Backporting
- Needs backporting to older releases
About this issue
- State: closed
- Created 3 years ago
- Comments: 45 (19 by maintainers)
Reproduced the issue in k3s with version v1.23.4+k3s1
Verified the fix in k3s with v1.23.5-rc1+k3s1
Please advise on this observation, @brandond.