k8s-config-connector: cnrm-resource-stats-recorder OOMing
Describe the bug
cnrm-resource-stats-recorder is crashlooping due to OOM on one of our GKE clusters. We are using the Workload Identity based setup and are not using the GKE add-on yet, as it is still in beta.
The memory limit was set to 64Mi in the version of CC we have deployed. I tried bumping it to 256Mi, and it seems to get stuck in the part of the code that lists all CRDs.
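For anyone wanting to confirm that the restarts really are OOM kills, and to raise the limit in place, a minimal sketch follows. It assumes the recorder runs as a Deployment named `cnrm-resource-stats-recorder` in the `cnrm-system` namespace; the workload kind and name are assumptions and may differ between CC versions, so check them against `kubectl -n cnrm-system get deploy,sts` first. The helper function names are made up for illustration.

```shell
NS=cnrm-system
WORKLOAD=deployment/cnrm-resource-stats-recorder   # assumed kind and name

# Print each pod in the namespace with the reason its containers last
# terminated (for example "OOMKilled"); the reason is empty if there was none.
last_termination_reasons() {
  kubectl -n "$NS" get pods \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}'
}

# Set the memory limit on every container of the workload to the given value.
bump_memory_limit() {
  kubectl -n "$NS" set resources "$WORKLOAD" --limits="memory=$1"
}
```

Usage would be `last_termination_reasons` followed by `bump_memory_limit 256Mi`. Re-applying the install YAML later would presumably reset the limit, so this is only useful as a temporary experiment.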
Recording the stats of Config Connector resources
2020/07/13 11:32:01 Registering OpenCensus views
2020/07/13 11:32:01 Registering the Prometheus exporter
2020/07/13 11:32:01 Recording the process start time
I0713 11:32:03.965616 1 request.go:621] Throttling request took 1.070573036s, request: GET:https://10.12.16.1:443/apis/container.cnrm.cloud.google.com/v1beta1?timeout=32s
2020/07/13 11:32:05 recording the build info
2020/07/13 11:32:05 listing all CRDs managed by Config Connector
2020/07/13 11:32:50 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:33:43 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:34:12 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:34:42 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:35:08 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:35:20 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:35:46 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:36:26 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:36:41 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:36:51 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:37:18 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:37:45 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:38:15 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:38:44 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:39:02 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:39:24 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:39:46 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:40:24 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:40:35 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:41:02 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:41:51 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:42:00 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:42:11 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:42:51 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:43:07 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:43:34 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:44:08 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:44:26 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:44:49 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:45:23 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:45:36 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:45:50 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:46:24 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:46:57 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:47:22 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:47:31 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
2020/07/13 11:47:56 http: superfluous response.WriteHeader call from cnrm.googlesource.com/cnrm/vendor/github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:344)
The above continued to be the only output for at least 30 minutes; I haven’t checked for longer yet.
We were not on the latest version of CC, as mentioned below, so I diffed the install YAML for our version against the latest version and could see that the container image version had been bumped (from 97b6128 to e032470). I tried changing the image to that version (I didn’t update the CC version annotations, in case that has some impact), and it didn’t help.
One thing that would be helpful for this sort of problem is access to the source for cnrm-resource-stats-recorder (and the other parts of CC), but as far as I can tell it isn’t publicly available. If I am wrong, could you please point me to the repos?
In another cluster of ours, the logs show the CRD listing step completing within about 30 seconds.
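The image comparison described above can be sketched roughly as follows. This is not the exact procedure used in the report. It assumes each install manifest is a local YAML file in which every image reference sits on a single `image:` line, and the function names are hypothetical.

```shell
# Print the unique container image references in a manifest, one per line.
list_images() {
  grep -E '^[[:space:]]*(-[[:space:]]+)?image:[[:space:]]' "$1" \
    | sed -E 's/^[[:space:]]*(-[[:space:]]+)?image:[[:space:]]*//' \
    | tr -d "\"'" \
    | sort -u
}

# Show image references that appear in only one of two manifests:
# lines starting with "<" are only in the first, ">" only in the second.
diff_images() {
  old_list=$(mktemp) && new_list=$(mktemp) || return 2
  list_images "$1" > "$old_list"
  list_images "$2" > "$new_list"
  diff "$old_list" "$new_list" | grep -E '^[<>]' || true
  rm -f "$old_list" "$new_list"
}
```

Calling `diff_images old-install.yaml new-install.yaml` (file names are placeholders) prints nothing when both manifests reference the same set of images, which makes a tag bump such as 97b6128 to e032470 easy to spot.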
ConfigConnector Version
kubectl get ns cnrm-system -o jsonpath='{.metadata.annotations.cnrm\.cloud\.google\.com/version}'
1.9.1
I am happy to try an upgrade, especially if I can get an answer on #238 and know that it can be done with no impact on a running cluster.
About this issue
- State: closed
- Created 4 years ago
- Reactions: 2
- Comments: 19 (9 by maintainers)
Hi @snuggie12, note that we are doubling the CPU for the stats-recorder pod in this week’s release, as in our testing that has prevented the issue.