k3s: Unable to restart HA cluster after etcd size limit reached
Please see logs and additional context in https://github.com/k3s-io/k3s/issues/4787#issuecomment-1071901452
__Originally posted by @ssmall in https://github.com/k3s-io/k3s/issues/4787#issuecomment-1077880193__
To summarize the history and remaining issue, since the previous issue seems to have gotten conflated with a couple different problems:
- I have a 3-master, 4-worker HA k3s cluster that went down and began failing to start up on 12 Feb with the error `panic: etcdserver: mvcc: database space exceeded` (https://github.com/k3s-io/k3s/issues/4787#issuecomment-1037652836)
- Following the advice of @brandond I added `--etcd-arg=quota-backend-bytes=$((8*1024*1024*1024))` to the startup args for my master nodes (https://github.com/k3s-io/k3s/issues/4787#issuecomment-1039493992)
- I also tried pointing a stand-alone etcd at the db directory, however that did not resolve the issue with k3s startup (https://github.com/k3s-io/k3s/issues/4787#issuecomment-1066218285)
- Tried a couple fixes provided by @brandond and got a bit farther, however now as soon as I bring up the second master node and the two nodes start talking to each other, it’s back to the original error (https://github.com/k3s-io/k3s/issues/4787#issuecomment-1071083112)
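
For anyone reading along, here is a minimal sketch of what the shell arithmetic in the `quota-backend-bytes` flag above expands to. The variable names (`quota_gib`, `quota_bytes`, `etcd_arg`) are made up for illustration and are not part of k3s or etcd. The snippet only prints the resulting argument and does not start or touch a cluster.

```shell
#!/bin/sh
# Illustrative only: variable names are hypothetical, not k3s or etcd settings.

# Quota size in GiB, matching the 8 in $((8*1024*1024*1024)) from the flag above.
quota_gib=8

# Convert GiB to bytes with POSIX shell arithmetic.
quota_bytes=$((quota_gib * 1024 * 1024 * 1024))

# Build the argument string as it would appear in the server startup args.
etcd_arg="--etcd-arg=quota-backend-bytes=${quota_bytes}"

# Print the expanded flag so the literal byte count can be checked.
echo "$etcd_arg"
```

Running this should print the flag with the byte count `8589934592`, which is the value the shell substitutes before k3s ever sees the argument.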
About this issue
- State: closed
- Created 2 years ago
- Comments: 27 (12 by maintainers)
Aha! Adding the `--secrets-encryption` flag back was indeed the final missing piece. The cluster appears to be restored to working order now. Thanks for all your patience and responsiveness. It is great to have everything back as it was.