helm: The `helm pull` command does not remove temporary index files from the repository cache location
`helm pull` writes a temporary index file to the repository cache and never removes it. Since this cached index file is never reused, I think Helm should remove it automatically.
Scenario:
CD tools such as ArgoCD run numerous `helm pull` commands, which causes the cache directory to grow. Depending on your ephemeral storage constraints, this results in pods being evicted when they exceed their storage limits.
Output of helm version:
$ helm version
version.BuildInfo{Version:"v3.7.1", GitCommit:"1d11fcb5d3f3bf00dbe6fe31b8412839a96b3dc4", GitTreeState:"clean", GoVersion:"go1.16.9"}
Output of kubectl version: n/a
Cloud Provider/Platform (AKS, GKE, Minikube etc.): n/a
Check repository cache location:
$ helm repo
...
--repository-cache string path to the file containing cached repository indexes (default "~/.cache/helm/repository")
...
Clean up cache folder and check it is empty:
$ ls -ah ~/.cache/helm/repository/
. ..
Run a repo update:
$ helm repo update
...
Update Complete. *Happy Helming!*
Check our cache dir, all seems as expected:
$ ls -ah ~/.cache/helm/repository/
. .. stable-charts.txt stable-index.yaml
Let's pull a chart (using bitnami/kafka as an example):
$ helm pull --version 14.9.1 --repo https://charts.bitnami.com/bitnami kafka
$ ls -ah
. .. kafka-14.9.1.tgz
Check our cache dir:
$ ls -ah ~/.cache/helm/repository/
. .. 'deZbVF1bUWI1H+i+PNkPYPlhuxo=-charts.txt' 'deZbVF1bUWI1H+i+PNkPYPlhuxo=-index.yaml' stable-charts.txt stable-index.yaml
Repeating the same helm pull command creates new cache files every time.
About this issue
- State: closed
- Created 2 years ago
- Comments: 15 (4 by maintainers)
This question has been answered. The behavior cannot be changed as it would violate hip-0004. My recommendation is that you either script your cleanup or use an ephemeral volume for your HELM_REPOSITORY_CACHE.
I would love to have one of Helm’s maintainers provide their feedback here.
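The two workarounds recommended above (scripting the cleanup, or pointing HELM_REPOSITORY_CACHE at a throwaway location) could be sketched roughly as follows. This is only a sketch under assumptions: the function names are hypothetical, and the file patterns `*=-index.yaml` and `*=-charts.txt` are inferred from the directory listing in this issue rather than from any documented Helm naming scheme. It also relies on `find` supporting `-maxdepth` and `-mmin`, which GNU and BSD `find` do.

```shell
#!/bin/sh

# Hypothetical helper: delete leftover temporary index files from the cache.
# Assumption: temporary files end in '=-index.yaml' or '=-charts.txt', as in
# the listing above; files of named repos such as 'stable' do not match.
#   $1: cache directory (defaults to HELM_REPOSITORY_CACHE or Helm's default)
#   $2: minimum age in minutes before a file is deleted (default 60), so that
#       files a concurrently running helm pull may still need are left alone
clean_helm_tmp_cache() {
  cache_dir="${1:-${HELM_REPOSITORY_CACHE:-$HOME/.cache/helm/repository}}"
  max_age_min="${2:-60}"
  [ -d "$cache_dir" ] || return 0
  find "$cache_dir" -maxdepth 1 -type f \
    \( -name '*=-index.yaml' -o -name '*=-charts.txt' \) \
    -mmin "+$max_age_min" -exec rm -f {} +
}

# Hypothetical wrapper: run a single helm pull against a throwaway cache
# directory and remove that directory afterwards, whatever the outcome.
helm_pull_isolated() {
  tmp_cache=$(mktemp -d) || return 1
  HELM_REPOSITORY_CACHE="$tmp_cache" helm pull "$@"
  status=$?
  rm -rf "$tmp_cache"
  return "$status"
}
```

For example, `clean_helm_tmp_cache ~/.cache/helm/repository 30` could run from a cron job or sidecar, while `helm_pull_isolated --version 14.9.1 --repo https://charts.bitnami.com/bitnami kafka` avoids touching the shared cache at all. The age threshold is a judgment call; choose a value longer than your slowest pull.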
I was looking at this as it seemed a bit strange. I was thinking of attempting a fix by generating the temp name from a hash of the repository URL (rather than the existing random string). Additionally, if the file is less than, say, 5 minutes old, Helm could skip downloading the index file and just use the existing cached version.
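To make the proposal above concrete, here is a rough shell sketch of the two pieces: a deterministic cache name derived from the repository URL, and a freshness check. It illustrates the idea only and does not reflect how Helm behaves. Both function names are hypothetical, the choice of SHA-256 is an assumption (the comment only says "a hash"), and `sha256sum` is assumed to be available (on macOS, `shasum -a 256` would be the equivalent).

```shell
#!/bin/sh

# Hypothetical: derive a stable cache file prefix from the repository URL,
# so repeated pulls from the same repository map to the same file name.
cache_name_for_repo() {
  printf '%s' "$1" | sha256sum | cut -d ' ' -f 1
}

# Hypothetical: succeed only if the file exists and was modified less than
# $2 minutes ago (default 5), meaning a re-download could be skipped.
index_is_fresh() {
  [ -f "$1" ] && [ -n "$(find "$1" -mmin "-${2:-5}")" ]
}
```

A caller would build the index path as `"$cache_dir/$(cache_name_for_repo "$url")-index.yaml"` and download only when `index_is_fresh` fails for that path. With a deterministic name, repeated pulls would overwrite one file per repository URL instead of adding a new pair of files each time.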
If these files are not used for any subsequent operations, leaving them around just wastes disk space for every single Helm user, even more so in automated CI/CD setups where this can have monetary cost implications. If they truly are not used after the pull, the code needs to be improved so that they get removed.
Edit: This is clearly not a support question: the behaviour has been documented, along with an extremely simple reproduction sequence. It also should not be allowed to go stale automatically, so someone should adjust the GitHub labels appropriately so that it doesn’t get purged automatically.