helm-controller: High Memory Usage after helm-controller v0.12.0 upgrade

I updated to helm-controller v0.12.1 and started using ReconcileStrategy Revision for all my local Helm charts. Since then, helm-controller restarts every time I push a commit to the GitRepository source, because it uses too much memory and Kubernetes kills it (OOMKilled). Because the controller is killed, some Helm releases get stuck in the upgrade process and must be rolled back manually (https://github.com/helm/helm/issues/8987).
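
For anyone trying to confirm the same symptom, here is a minimal sketch (not part of Flux) that scans a pod list in the JSON shape produced by `kubectl get pods -o json` and reports containers whose last termination reason was OOMKilled. The function name, the pod name, and the sample data are hypothetical. The fields read (`status.containerStatuses[].lastState.terminated.reason` and `restartCount`) are standard Kubernetes pod status fields.

```python
def oom_killed_containers(pod_list):
    """Return (pod, container, restart_count) tuples for containers
    whose most recent termination was an OOM kill."""
    hits = []
    for pod in pod_list.get("items", []):
        pod_name = pod.get("metadata", {}).get("name", "<unknown>")
        for cs in pod.get("status", {}).get("containerStatuses", []):
            # lastState is empty if the container has never been restarted
            terminated = cs.get("lastState", {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled":
                hits.append((pod_name, cs.get("name"), cs.get("restartCount", 0)))
    return hits


# Hypothetical sample data, shaped like `kubectl get pods -o json` output
sample = {
    "items": [
        {
            "metadata": {"name": "helm-controller-abc"},
            "status": {
                "containerStatuses": [
                    {
                        "name": "manager",
                        "restartCount": 3,
                        "lastState": {
                            "terminated": {"reason": "OOMKilled", "exitCode": 137}
                        },
                    }
                ]
            },
        }
    ]
}

print(oom_killed_containers(sample))
```

In practice the input would come from something like `json.load` on the output of `kubectl -n flux-system get pods -o json`. The `flux-system` namespace is an assumption based on the default Flux installation.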

About this issue

State: closed
Created 3 years ago
Reactions: 1
Comments: 39 (21 by maintainers)

Commits related to this issue

Update Helm to v3.7.2 This commit updates Helm to 3.7.2, in an attempt to get to a v3.7.x release range _without_ any memory issues (see #345), which should have been addressed in this release. The ... — committed to fluxcd/helm-controller by hiddeco 3 years ago

Most upvoted comments

We’ve pushed a release candidate for #352, here is the image: ghcr.io/fluxcd/helm-controller:rc-4fe7a7c8

Please take it for a spin and let us know if it fixes the issue.

stefanprodan on Nov 3, 2021

I’ve spent the day digging around to find the root cause of the sudden increase in memory usage. Here is what I’ve found:

in this commit kubectl switched to k8s.io/kube-openapi (released in k8s.io/kubectl v0.22.1)
in this commit Helm switched to k8s.io/kubectl v0.22.1 (released in helm.sh/helm/v3 v3.7.0)
the k8s.io/kube-openapi pkg has a bug where it consumes large amounts of memory https://github.com/kubernetes/kubernetes/issues/101755
there is a fix underway https://github.com/kubernetes/kube-openapi/pull/251

We can’t do much in Flux: we have to wait for that PR to get merged, then for a Kubernetes release, then for a Helm release that uses that Kubernetes release, and finally update Helm in Flux to fix the OOM issues.

I propose we revert Helm to v3.6.3 for a couple of months until the kube-openapi fixes end up in Helm.

stefanprodan on Nov 3, 2021

We’re seeing a 75% drop in helm-controller memory (peaks and base level) since picking up this version 👍

poblish on Nov 12, 2021

Helm released a patch yesterday which likely addresses this issue

However, with the holiday period arriving pretty soon, I am hesitant to release this, as I will be on leave for 3 weeks. If someone has specific needs for the v3.7.x release range, I can provide an RC.

hiddeco on Dec 9, 2021

Thanks all for testing. This should now also be solved by updating the helm-controller Deployment image to v0.12.2.

CLI release for flux bootstrap, etc. will arrive later today.

hiddeco on Nov 11, 2021

Awesome, and thanks a lot for helping out. This seems to indicate that we can at least temporarily work around the upstream problems by forcing the replacement of that specific Helm dependency, without having to stop receiving new Helm updates.

hiddeco on Nov 5, 2021

@stefanprodan I have deployed helm-controller with the image provided. I will monitor it for a couple of hours and look for restarts due to OOM kills.

flux: v0.21.0
helm-controller: rc-725fd784
image-automation-controller: v0.16.0
image-reflector-controller: v0.13.0
kustomize-controller: v0.16.0
notification-controller: v0.18.1
source-controller: v0.17.1

glen-uc on Nov 4, 2021

No, I think your observations are correct based on other reports on Slack.

I did a quick dive into it with the limited time I had available, but the helm-controller didn’t really change much besides Helm, kustomize and controller-runtime updates. It would be useful if someone could pinpoint the resource behavior change to an exact helm-controller version, which would help identify the issue.

I am at present working on Helm improvements for the source-controller in the area of Helm repository index, dependency, and chart build memory consumption. Once that’s done, I have time (and am planning) to look in much greater detail at the current shape of the helm-controller (as part of https://github.com/fluxcd/helm-controller/milestone/1).

hiddeco on Nov 2, 2021