autoscaler: emptyDir/Memory volume shouldn't prevent pod from eviction
Which component are you using?: cluster-autoscaler
What version of the component are you using?: Cannot find out. It’s on GKE 1.18.15-gke.1501
What k8s version are you using (kubectl version)?:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.5", GitCommit:"6b1d87acf3c8253c123756b9e61dac642678305f", GitTreeState:"archive", BuildDate:"2021-03-19T21:36:49Z", GoVersion:"go1.16.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.15-gke.1501", GitCommit:"de67bb0d58413ba2ba9b64810ab438a9734a2ab9", GitTreeState:"clean", BuildDate:"2021-02-26T23:41:52Z", GoVersion:"go1.13.15b4", Compiler:"gc", Platform:"linux/amd64"}
What environment is this in?: GKE
What did you expect to happen?: cluster autoscaler to evict pods with emptyDir/Memory volumes
What happened instead?: pods are not evicted, since emptyDir/Memory is considered local storage
How to reproduce it (as minimally and precisely as possible): Create extra workloads so that at least two nodes are spawned. Create a deployment whose pod has an emptyDir volume with the Memory medium (ideally spread across both nodes). Then destroy the extra workloads and observe that the nodes are not scaled down, with log entries containing “no.scale.down.node.pod.has.local.storage” pointing to the deployment. A minimal manifest is sketched below.
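A manifest along these lines should reproduce it; the deployment name and image are illustrative, not taken from the original report:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: emptydir-memory-demo          # hypothetical name
spec:
  replicas: 2                         # so the pods can land on both nodes
  selector:
    matchLabels:
      app: emptydir-memory-demo
  template:
    metadata:
      labels:
        app: emptydir-memory-demo
    spec:
      containers:
        - name: app
          image: registry.k8s.io/pause:3.9   # any lightweight image works
          volumeMounts:
            - name: cache
              mountPath: /cache
      volumes:
        - name: cache
          emptyDir:
            medium: Memory            # tmpfs-backed; this is what trips the local-storage check
```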
Anything else we need to know?:
There was a similar request in 2019, https://github.com/kubernetes/autoscaler/issues/2048, and it was closed with a recommended workaround: add an annotation to such pods (a sketch follows the list below). It was noted that emptyDir/Memory volumes are considered local storage intentionally, though the reason is not really clear. Let me share some thoughts on the topic:
- In terms of data persistence, emptyDir/Memory is no different from in-pod memory allocation: both are RAM that disappears when the pod dies. Yet a pod allocating memory is not marked as unsafe for eviction, so why do it for emptyDir/Memory?
- From an implementation point of view, the change seems pretty simple and straightforward (unless I’ve missed something, yeah);
- Having an extra annotation adds maintenance complexity. If a pod is modified to include or exclude real local storage, the annotation has to be updated. Consider pods coming from third-party Helm packages: all of them would have to be inspected and annotated where necessary just to get CA to work as intended;
- Newcomers will see that CA just “doesn’t work”, and it will take time to figure out what is going on. On the other hand, cloud providers need extra docs describing this behaviour (e.g. https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-autoscaler-visibility#cluster-not-scalingdown). This just increases overall complexity.
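For reference, the workaround from #2048 is the cluster-autoscaler safe-to-evict annotation on the pod template. A minimal sketch, with a hypothetical deployment name:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cache-app                     # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cache-app
  template:
    metadata:
      labels:
        app: cache-app
      annotations:
        # Tells cluster-autoscaler it may evict this pod despite the
        # emptyDir volume. It must be kept in sync by hand whenever the
        # pod gains or loses real local storage, which is exactly the
        # maintenance burden described above.
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
    spec:
      containers:
        - name: app
          image: registry.k8s.io/pause:3.9
          volumeMounts:
            - name: cache
              mountPath: /cache
      volumes:
        - name: cache
          emptyDir:
            medium: Memory
```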
Based on this, I wonder whether anything has changed since 2019 and whether the handling of emptyDir/Memory volumes can be updated so that it no longer prevents pod eviction?
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 28
- Comments: 28 (10 by maintainers)
We currently have the same problem. We use a memory emptyDir for caching, and it prevents pods from getting moved. In that case, though, it’s completely safe to do.
/reopen
Just realized we were running into this.
emptyDir is considered ephemeral and regarded as temporary storage, nothing important: just cache, logs and the like. Why should the autoscaler care about this volume in the first place? It is specifically declared as an ephemeral emptyDir by the app owners; if the owners cared about persistence, they would use persistent storage like a PVC anyway. It would make sense, however, if we were using something like https://kubernetes.io/docs/concepts/storage/volumes/#hostpath.

I would also like to request a slightly more specific example in this documentation: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-types-of-pods-can-prevent-ca-from-removing-a-node
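To illustrate the contrast (names and the path here are hypothetical): as I understand the current check, both volumes in the following pod-spec fragment count as “local storage”, even though only the hostPath one actually touches the node’s disk:

```yaml
volumes:
  - name: scratch                 # RAM-backed tmpfs, gone when the pod dies
    emptyDir:
      medium: Memory
  - name: node-data               # genuinely tied to this node's filesystem
    hostPath:
      path: /var/lib/my-app       # hypothetical path
```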
Could someone define what kind of resources fall under local storage?