kube-state-metrics breaking release aka 2.0
We have accumulated a number of deprecated metrics and odd behaviors that I believe may justify a 2.0 release. I’d like to take this issue to discuss whether people think this is a good idea and collect what we would potentially like to break should we do a breaking release.
Off the top of my head, breaking changes I would like to make:
- remove deprecated metrics https://github.com/kubernetes/kube-state-metrics/issues/975
- rename black-/whitelist to allow/deny-list https://github.com/kubernetes/kube-state-metrics/issues/975
- use same ports in all cases (currently the flag defaults to 80/81, but the dockerfile specifies 8080 and 8081) https://github.com/kubernetes/kube-state-metrics/issues/976
- rename hpa metrics to use the full `horizontalpodautoscaler` nomenclature, to match the rest of the exposed metrics https://github.com/kubernetes/kube-state-metrics/issues/977
- rename the `--namespace` flag to `--namespaces` https://github.com/kubernetes/kube-state-metrics/issues/978
- remove non-identifying labels from pod metrics (e.g. node labels)
- ~consider renaming kube-state-metrics to kubernetes-exporter~ we decided against this one
- `kube_secret_metadata_resource_version`, `kube_configmap_metadata_resource_version` and `kube_ingress_metadata_resource_version` expose the resource version as a string in their label sets. This value can change often and would therefore create huge cardinality. It should be a number, or not exist at all. https://github.com/kubernetes/kube-state-metrics/pull/997
- rename use of “collector” in user-facing things (flags, help text, etc.) to “resources”, as the collector type pattern does not exist anymore. https://github.com/kubernetes/kube-state-metrics/issues/980
- rename storage class labels reclaimPolicy to reclaim_policy and volumeBindingMode to volume_binding_mode. https://github.com/kubernetes/kube-state-metrics/pull/1107
- require `kube_*_labels` metrics to have explicit lists of labels to expose, passed in via a flag, to prevent unnecessarily high cardinality https://github.com/kubernetes/kube-state-metrics/issues/1047
- rework resource metrics to adhere as closely as possible to the Prometheus “sums of series should make sense” rule. https://github.com/kubernetes/kube-state-metrics/pull/1168
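To make the resource-version cardinality point concrete: Prometheus creates a distinct time series for every unique label set, so a label whose value changes on every object update (like `resourceVersion`) multiplies series count by the number of updates rather than the number of objects. A minimal Python sketch with hypothetical cluster numbers (the function and counts are illustrative, not from kube-state-metrics itself):

```python
# Each unique label set produces a distinct time series in Prometheus.
# Stable labels (name/namespace) yield one series per object; a
# resource_version label yields one series per *update* instead.

def series_count(objects: int, updates_per_object: int,
                 include_resource_version: bool) -> int:
    """Count distinct label sets for a kube_*_metadata_resource_version-style metric."""
    if include_resource_version:
        # every update mints a new resourceVersion, i.e. a new series
        return objects * updates_per_object
    return objects

# Hypothetical cluster: 500 ConfigMaps, each updated 200 times within retention.
stable = series_count(500, 200, include_resource_version=False)
exploded = series_count(500, 200, include_resource_version=True)
print(stable, exploded)  # 500 vs 100000 series
```

A 200x blow-up from a single label is exactly the kind of growth the linked PR aims to eliminate.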
I would see a breaking release landing at least three months out, as I would like to validate the performance optimizations independently first. Further thoughts?
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Comments: 48 (42 by maintainers)
We finished and the release is cut 🎉 Thank you all!!
@bboreham you can also just blacklist those metrics for now until they are removed? `--metric-blacklist="kube_configmap_metadata_resource_version"` for example should work. 😃

Came here to +1 removal of high-cardinality `kube_configmap_metadata_resource_version`. This one metric occupies 3% of all the data in our service. I’m intrigued: what does anyone use this metric for, in its current form?
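For readers applying the workaround above: a sketch of the full invocation, and what it would look like after the allow/deny-list rename planned for 2.0 (the v2 flag name `--metric-denylist` follows that rename; verify against the release you run):

```shell
# v1.x: suppress the high-cardinality metrics until they are removed
kube-state-metrics \
  --metric-blacklist="kube_configmap_metadata_resource_version,kube_secret_metadata_resource_version"

# 2.0, after the black-/whitelist -> allow/deny-list rename:
kube-state-metrics \
  --metric-denylist="kube_configmap_metadata_resource_version,kube_secret_metadata_resource_version"
```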
I would add to the list: renaming all the leftover user-facing occurrences of `collectors` to `resources`, as we recently removed the `collectors` package. That would also mean renaming collector in options to `resource`. Overall the `--resources=pods` flag would become more self-descriptive.

/remove-lifecycle stale
One more thing that came up during kubecon: Before we do the v2 release we probably want to do another round of scalability tests. I believe Google volunteered to do this.
Sharding isn’t breaking so I feel it can be added in a backward compatible way in 1.x or 2.x. It’s fairly close to being ready I would say though so I’d like to see it go into a 1.x release.
Sorry, I should have been clearer about why I think we should change ports. Binding to anything lower than 1024 requires root on Linux (or at least the CAP_NET_BIND_SERVICE capability). That’s why the default ports of kube-state-metrics should be higher than that. And beyond that, whatever we use should be consistent between the pure binary and the container.
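To show what a consistent, unprivileged setup could look like, here is a sketch of a Deployment container fragment where the flag values and the exposed container ports agree on values above 1024 (the `--port`/`--telemetry-port` flags are kube-state-metrics', the image tag is illustrative):

```yaml
# container spec fragment: flags and exposed ports agree on
# unprivileged values (>1024), so the process needs no root
containers:
  - name: kube-state-metrics
    image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.0.0  # example tag
    args:
      - --port=8080            # main /metrics endpoint
      - --telemetry-port=8081  # self-metrics endpoint
    ports:
      - name: http-metrics
        containerPort: 8080
      - name: telemetry
        containerPort: 8081
```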