kubernetes: compact failed: etcdserver: mvcc: database space exceeded
What happened:
After running the e2e network suite with the test `Should answer DNS query for maximum number of services per cluster`, which creates and deletes 100K services, the etcd database exceeds its space quota, and all apiserver operations fail because of that.
2020-08-15T19:53:24.101951921Z stderr F I0815 19:53:24.101375 1 httplog.go:89] “HTTP” verb=“PATCH” URI=“/api/v1/nodes/kind-worker2/status?timeout=10s” latency=“6.037477ms” userAgent=“kubelet/v1.20.0 (linux/amd64) kubernetes/529c134” srcIP=“172.18.0.3:54310” resp=500 statusStack=“\ngoroutine 690276 [running]:\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog.(*respLogger).recordStatus(0xc011c5c3f0, 0x1f4)\n\tstaging/src/k8s.io/apiserver/pkg/server/httplog/httplog.go:237 +0xcf\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog.(*respLogger).WriteHeader(0xc011c5c3f0, 0x1f4)\n\tstaging/src/k8s.io/apiserver/pkg/server/httplog/httplog.go:216 +0x35\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.(*baseTimeoutWriter).WriteHeader(0xc01612a960, 0x1f4)\n\tstaging/src/k8s.io/apiserver/pkg/server/filters/timeout.go:228 +0xb2\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/metrics.(*ResponseWriterDelegator).WriteHeader(0xc012bea480, 0x1f4)\n\tstaging/src/k8s.io/apiserver/pkg/endpoints/metrics/metrics.go:503 +0x45\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters.(*deferredResponseWriter).Write(0xc0020c2af0, 0xc001fdcf00, 0x59, 0x60, 0x7f534a58d7a0, 0xc00d899d60, 0x41)\n\tstaging/src/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters/writers.go:202 +0x1f7\nk8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/serializer/protobuf.(*Serializer).doEncode(0xc000524140, 0x53257a0, 0xc00d899d60, 0x5314760, 0xc0020c2af0, 0x0, 0x4a0d23f)\n\tstaging/src/k8s.io/apimachinery/pkg/runtime/serializer/protobuf/protobuf.go:210 +0x5e5\nk8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/serializer/protobuf.(*Serializer).Encode(0xc000524140, 0x53257a0, 0xc00d899d60, 0x5314760, 0xc0020c2af0, 0x3b88579, 0x6)\n\tstaging/src/k8s.io/apimachinery/pkg/runtime/serializer/protobuf/protobuf.go:167 +0x147\nk8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/serializer/versioning.(*codec).doEncode(0xc00d899e00, 0x53257a0, 0xc00d899d60, 0x5314760, 0xc0020c2af0, 0x0, 0x0)\n\tstaging/src/k8s.io/apimachinery/pkg/runtime/serializer/versioning/versioning.go:228 +0x396\nk8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/serializer/versioning.(*codec).Encode(0xc00d899e00, 0x53257a0, 0xc00d899d60, 0x5314760, 0xc0020c2af0, 0xc000524140, 0x532e060)\n\tstaging/src/k8s.io/apimachinery/pkg/runtime/serializer/versioning/versioning.go:184 +0x170\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters.SerializeObject(0x4a85389, 0x23, 0x7f534b0beda0, 0xc00d899e00, 0x5377b20, 0xc012ab4c88, 0xc01611bc00, 0x1f4, 0x53257a0, 0xc00d899d60)\n\tstaging/src/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters/writers.go:96 +0x12c\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters.WriteObjectNegotiated(0x537aee0, 0xc000316480, 0x537b220, 0x74e7988, 0x0, 0x0, 0x4a0d23f, 0x2, 0x5377b20, 0xc012ab4c88, …)\n\tstaging/src/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters/writers.go:251 +0x572\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters.ErrorNegotiated(0x5317900, 0xc01621b5e0, 0x537aee0, 0xc000316480, 0x0, 0x0, 0x4a0d23f, 0x2, 0x5377b20, 0xc012ab4c88, …)\n\tstaging/src/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters/writers.go:270 
+0x16f\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers.(*RequestScope).err(…)\n\tstaging/src/k8s.io/apiserver/pkg/endpoints/handlers/rest.go:89\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers.PatchResource.func1(0x5377b20, 0xc012ab4c88, 0xc01611bc00)\n\tstaging/src/k8s.io/apiserver/pkg/endpoints/handlers/patch.go:225 +0x1fff\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints.restfulPatchResource.func1(0xc012bea3f0, 0xc011c5c4d0)\n\tstaging/src/k8s.io/apiserver/pkg/endpoints/installer.go:1193 +0x99\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/metrics.InstrumentRouteFunc.func1(0xc012bea3f0, 0xc011c5c4d0)\n\tstaging/src/k8s.io/apiserver/pkg/endpoints/metrics/metrics.go:384 +0x282\nk8s.io/kubernetes/vendor/github.com/emicklei/go-restful.(*Container).dispatch(0xc0015a0090, 0x7f534b492a80, 0xc012ab4c78, 0xc01611bc00)\n\tvendor/github.com/emicklei/go-restful/container.go:288 +0xa84\nk8s.io/kubernetes/vendor/github.com/emicklei/go-restful.(*Container).Dispatch(…)\n\tvendor/github.com/emicklei/go-restful/container.go:199\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server.director.ServeHTTP(0x4a24690, 0xe, 0xc0015a0090, 0xc000bf8000, 0x7f534b492a80, 0xc012ab4c78, 0xc01611bc00)\n\tstaging/src/k8s.io/apiserver/pkg/server/handler.go:146 +0x539\nk8s.io/kubernetes/vendor/k8s.io/kube-aggregator/pkg/apiserver.(*proxyHandler).ServeHTTP(0xc00156f630, 0x7f534b492a80, 0xc012ab4c78, 0xc01611bc00)\n\tstaging/src/k8s.io/kube-aggregator/pkg/apiserver/handler_proxy.go:121 +0x183\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/mux.(*pathHandler).ServeHTTP(0xc00a9161c0, 0x7f534b492a80, 0xc012ab4c78, 0xc01611bc00)\n\tstaging/src/k8s.io/apiserver/pkg/server/mux/pathrecorder.go:248 +0x3db\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/mux.(*PathRecorderMux).ServeHTTP(0xc002db2e70, 0x7f534b492a80, 0xc012ab4c78, 0xc01611bc00)\n\tstaging/src/k8s.io/apiserver/pkg/server/mux/pathrecorder.go:234 +0x8c\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server.director.ServeHTTP(0x4a27b63, 0xf, 0xc004b3a630, 0xc002db2e70, 0x7f534b492a80, 0xc012ab4c78, 0xc01611bc00)\n\tstaging/src/k8s.io/apiserver/pkg/server/handler.go:154 +0x74d\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithAuthorization.func1(0x7f534b492a80, 0xc012ab4c78, 0xc01611bc00)\n\tstaging/src/k8s.io/apiserver/pkg/endpoints/filters/authorization.go:64 +0x563\nnet/http.HandlerFunc.ServeHTTP(0xc00515abc0, 0x7f534b492a80, 0xc012ab4c78, 0xc01611bc00)\n\tGOROOT/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.WithMaxInFlightLimit.func2(0x7f534b492a80, 0xc012ab4c78, 0xc01611bc00)\n\tstaging/src/k8s.io/apiserver/pkg/server/filters/maxinflight.go:175 +0x4cf\nnet/http.HandlerFunc.ServeHTTP(0xc0071dc5a0, 0x7f534b492a80, 0xc012ab4c78, 0xc01611bc00)\n\tGOROOT/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithImpersonation.func1(0x7f534b492a80, 0xc012ab4c78, 0xc01611bc00)\n\tstaging/src/k8s.io/apiserver/pkg/endpoints/filters/impersonation.go:50 +0x203d\nnet/http.HandlerFunc.ServeHTTP(0xc00515ac00, 0x7f534b492a80, 0xc012ab4c78, 0xc01611bc00)\n\tGOROOT/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithAuthentication.func1(0x7f534b492a80, 0xc012ab4c78, 0xc01611bb00)\n\tstaging/src/k8s.io/apiserver/pkg/endpoints/filters/authentication.go:70 +0x672\nnet/http.HandlerFunc.ServeHTTP(0xc007619720, 0x7f534b492a80, 0xc012ab4c78, 
0xc01611bb00)\n\tGOROOT/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP.func1(0xc013207080, 0xc0072d8620, 0x537e3e0, 0xc012ab4c78, 0xc01611bb00)\n\tstaging/src/k8s.io/apiserver/pkg/server/filters/timeout.go:113 +0xb8\ncreated by k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP\n\tstaging/src/k8s.io/apiserver/pkg/server/filters/timeout.go:99 +0x1cc\n” addedInfo=“\nlogging error output: "k8s\x00\n\f\n\x02v1\x12\x06Status\x12A\n\x06\n\x00\x12\x00\x1a\x00\x12\aFailure\x1a)etcdserver: mvcc: database space exceeded\"\x000\xf4\x03\x1a\x00\"\x00"\n”
When the apiserver tries to run the periodic compaction of etcd to free space, it fails too, so it's not able to recover the situation:
compact.go:124] etcd: endpoint ([https://127.0.0.1:2379]) compact failed: etcdserver: mvcc: database space exceeded
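For context, the compaction the apiserver performs here is revision-based. A rough sketch of such a periodic compactor using the etcd v3.4 Go client (the function, probe key, and interval are illustrative, not the apiserver's actual code):

```go
package etcdmaint

import (
	"context"
	"log"
	"time"

	"go.etcd.io/etcd/clientv3" // etcd v3.4 client import path
)

// periodicCompact sketches revision-based compaction, roughly what the
// apiserver's compactor triggers on an interval. Illustrative only.
func periodicCompact(cli *clientv3.Client, interval time.Duration) {
	for range time.Tick(interval) {
		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
		// Read any key just to learn the current store revision.
		resp, err := cli.Get(ctx, "compact-probe")
		if err == nil {
			// Compact discards MVCC history older than this revision.
			_, err = cli.Compact(ctx, resp.Header.Revision)
		}
		cancel()
		if err != nil {
			// This is the failure in the log above: with the NOSPACE
			// alarm armed, even the compaction request is rejected.
			log.Printf("etcd compact failed: %v", err)
		}
	}
}
```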
etcd keeps restarting:
2020-08-15T19:51:55.915839764Z stderr F 2020-08-15 19:51:55.915605 W | etcdserver/api/etcdhttp: /health error due to an alarm memberID:5989846029734297070 alarm:NOSPACE 2020-08-15T19:52:33.618778603Z stderr F 2020-08-15 19:52:33.618609 N | pkg/osutil: received terminated signal, shutting down…
I think that is because of the health check: the liveness probe fails while the NOSPACE alarm is active, so the kubelet restarts etcd.
Aug 15 19:41:47 kind-control-plane kubelet[709]: E0815 19:41:47.852175 709 event.go:264] Server rejected event ‘&v1.Event{TypeMeta:v1.TypeMeta{Kind:“”, APIVersion:“”}, ObjectMeta:v1.ObjectMeta{Name:“etcd-kind-control-plane.162b887017b19a8c”, GenerateName:“”, Namespace:“kube-system”, SelfLink:“”, UID:“”, ResourceVersion:“”, Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:“”, ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:“Pod”, Namespace:“kube-system”, Name:“etcd-kind-control-plane”, UID:“6758eea9919bd7866a6380c7b846e972”, APIVersion:“v1”, ResourceVersion:“”, FieldPath:“spec.containers{etcd}”}, Reason:“Unhealthy”, Message:“Liveness probe failed: HTTP probe failed with statuscode: 503”, Source:v1.EventSource{Component:“kubelet”, Host:“kind-control-plane”}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbfc62c7ef28e8c8c, ext:1693775391433, loc:(*time.Location)(0x6c94ae0)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbfc62c7ef28e8c8c, ext:1693775391433, loc:(*time.Location)(0x6c94ae0)}}, Count:1, Type:“Warning”, EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:“”, Related:(*v1.ObjectReference)(nil), ReportingController:“”, ReportingInstance:“”}’: ‘etcdserver: mvcc: database space exceeded’ (will not retry!) Aug 15 19:41:50 kind-control-plane kubelet[709]: E0815 19:41:50.485083 709 controller.go:178] failed to update node lease, error: etcdserver: mvcc: database space exceeded Aug 15 19:41:50 kind-control-plane kubelet[709]: E0815 19:41:50.488343 709 controller.go:178] failed to update node lease, error: etcdserver: mvcc: database space exceeded Aug 15 19:41:50 kind-control-plane kubelet[709]: E0815 19:41:50.505142 709 controller.go:178] failed to update node lease, error: etcdserver: mvcc: database space exceeded
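The alarm the probe is tripping over can be confirmed through etcd's maintenance API, the equivalent of `etcdctl alarm list`. A minimal sketch with the etcd v3.4 Go client (the endpoint is illustrative and TLS setup is omitted for brevity):

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"go.etcd.io/etcd/clientv3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"https://127.0.0.1:2379"}, // illustrative; TLS config omitted
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// An armed NOSPACE alarm makes etcd reject all writes (and fail
	// /health) until it is disarmed.
	resp, err := cli.AlarmList(ctx)
	if err != nil {
		log.Fatal(err)
	}
	for _, a := range resp.Alarms {
		fmt.Printf("member %x: alarm %v\n", a.MemberID, a.Alarm)
	}
}
```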
What you expected to happen:
The cluster should be able to recover, compacting etcd and freeing the necessary space (if possible).
How to reproduce it (as minimally and precisely as possible):
It's happening in most of the master jobs here: https://k8s-testgrid.appspot.com/sig-network-kind#sig-network-kind
The log snippets in the description are from this occurrence: https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-kind-network/1294711996654030849 (apiserver, etcd, and kubelet logs)
Anything else we need to know?:
After compacting etcd manually, you are able to restore the cluster to a working state.
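For anyone hitting this: the manual fix is the standard etcd maintenance sequence, compact, then defragment, then disarm the alarm (what `etcdctl compact`, `etcdctl defrag`, and `etcdctl alarm disarm` do). A rough sketch with the etcd v3.4 Go client; the function and probe key are illustrative, not code from this repo:

```go
package etcdmaint

import (
	"context"

	"go.etcd.io/etcd/clientv3"
)

// recoverSpace sketches the manual recovery: compact away old MVCC
// revisions, defragment every member so the freed pages are released,
// then disarm the NOSPACE alarm so writes are accepted again.
func recoverSpace(ctx context.Context, cli *clientv3.Client, endpoints []string) error {
	resp, err := cli.Get(ctx, "probe") // any key; only Header.Revision matters
	if err != nil {
		return err
	}
	if _, err := cli.Compact(ctx, resp.Header.Revision, clientv3.WithCompactPhysical()); err != nil {
		return err
	}
	for _, ep := range endpoints {
		// Defragment must be sent to each endpoint individually.
		if _, err := cli.Defragment(ctx, ep); err != nil {
			return err
		}
	}
	// An empty AlarmMember disarms all alarms, like `etcdctl alarm disarm`.
	_, err = cli.AlarmDisarm(ctx, &clientv3.AlarmMember{})
	return err
}
```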
There are several things here:
- Is it possible for the health check to not act on the `alarm:NOSPACE` alarm by restarting etcd? Restarting etcd makes the situation worse, and it may cause a periodic apiserver compaction to be missed.
- Can the apiserver react to the error `etcdserver: mvcc: database space exceeded` and try a compaction right away, instead of waiting for the periodic compaction?
- Why does the compact operation that should solve the problem itself fail with the error `database space exceeded`?
About this issue
- State: closed
- Created 4 years ago
- Comments: 50 (41 by maintainers)
Commits related to this issue
- Disable Feature:PerformanceDNS Its testing could quickly break the cluster [1][2] due to etcdserver: mvcc: database space exceeded [1] https://github.com/kubernetes/test-infra/pull/18956 [2] https:/... — committed to collivier/CNTT by collivier 3 years ago
- Disable Feature:PerformanceDNS (#2580) Its testing could quickly break the cluster [1][2] due to etcdserver: mvcc: database space exceeded [1] https://github.com/kubernetes/test-infra/pull/18956 ... — committed to anuket-project/anuket-specifications by collivier 3 years ago
- Disable Feature:PerformanceDNS (#2580) Its testing could quickly break the cluster [1][2] due to etcdserver: mvcc: database space exceeded [1] https://github.com/kubernetes/test-infra/pull/18956 [2]... — committed to anuket-project/anuket-specifications by collivier 3 years ago
- Disable Feature:PerformanceDNS (#2580) Its testing could quickly break the cluster [1][2] due to etcdserver: mvcc: database space exceeded [1] https://github.com/kubernetes/test-infra/pull/18956 ... — committed to collivier/cntt-rc2 by collivier 3 years ago
- Disable Feature:PerformanceDNS (#2580) Its testing could quickly break the cluster [1][2] due to etcdserver: mvcc: database space exceeded [1] https://github.com/kubernetes/test-infra/pull/18956 [2]... — committed to collivier/cntt-rc2 by collivier 3 years ago
- Disable Feature:PerformanceDNS (#2580) Its testing could quickly break the cluster [1][2] due to etcdserver: mvcc: database space exceeded [1] https://github.com/kubernetes/test-infra/pull/18956 [2]... — committed to collivier/cntt-ra2 by collivier 3 years ago
- Disable Feature:PerformanceDNS (#2580) Its testing could quickly break the cluster [1][2] due to etcdserver: mvcc: database space exceeded [1] https://github.com/kubernetes/test-infra/pull/18956 ... — committed to anuket-project/RA2 by collivier 3 years ago
I agree that restarting etcd does not help in this case. And I suspect that, in general, it's better not to restart etcd due to health-check failures, but I don't have the data to prove that, so I'm hesitant to suggest disabling the liveness probe. I'd be very interested in any data anyone is able to gather about this.
This might be best to discuss at etcd.io/etcd rather than kubernetes/kubernetes, but etcd compaction doesn't actually reduce the storage file size; it just frees pages within the file for reuse. Defragmentation is the etcd operation that reduces the storage file size. My preference would be to teach etcd to perform defragmentation automatically rather than waiting for an administrator to request it. I think etcd is much better positioned than the apiserver to decide when to defragment, so I'd rather invest in improving etcd.
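To make that concrete: etcd has no built-in automatic defragmentation today, so one way to approximate the idea is a client-side loop keyed off the reported database size. A sketch under that assumption; the threshold, interval, and function name are arbitrary illustrative choices:

```go
package etcdmaint

import (
	"context"
	"log"
	"time"

	"go.etcd.io/etcd/clientv3"
)

// autoDefrag approximates, client-side, the automatic defragmentation
// proposed above (etcd itself has no such loop). Illustrative only.
func autoDefrag(cli *clientv3.Client, endpoint string, maxBytes int64) {
	for range time.Tick(10 * time.Minute) {
		ctx, cancel := context.WithTimeout(context.Background(), time.Minute)
		st, err := cli.Status(ctx, endpoint)
		// Status.DbSize is the on-disk file size; compaction alone never
		// shrinks it, it only frees pages inside the file for reuse.
		if err == nil && st.DbSize > maxBytes {
			// Note: defragmentation blocks the member while it runs.
			_, err = cli.Defragment(ctx, endpoint)
		}
		cancel()
		if err != nil {
			log.Printf("defrag check on %s failed: %v", endpoint, err)
		}
	}
}
```

The catch is that defragmentation blocks the member while it runs, which is exactly the kind of trade-off etcd itself is best placed to manage.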
See above about the difference between compaction and defragmentation.
Unless dynamic service-range configuration is supported, I am pretty sure it would be a breaking change, resulting in out-of-range services for users upgrading to a new KCM manifest that did not specify an explicit range at cluster creation. The change may require special-casing around upgrade.
OK - that explains it. Setting a /12 for services sounds like a lot: a /12 leaves 32 − 12 = 20 host bits, i.e. 2^20 (about a million) possible service IPs. I doubt we would even be able to support 2^20 services (we don't test more than O(10k), so roughly two orders of magnitude less).
@aojea - can you please rerun the experiment setting, say, a /16 or a /18 for the service CIDR and grab the data then, just to fully confirm it?
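If it helps with the rerun: since these jobs run on kind, the service range can be set in the kind cluster config. A minimal example (the subnet value is illustrative):

```yaml
# kind cluster configuration; serviceSubnet value is illustrative.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  serviceSubnet: "10.96.0.0/16"
```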