autoscaler: CA Scale Down Fails because of Daemonset utilization

Hello,

Is there a way to disregard DaemonSets (or certain pods) when calculating a node's utilization for scale-down?

I have tried adding the annotation: cluster-autoscaler.kubernetes.io/safe-to-evict: "true"

but the cluster autoscaler still does not remove nodes whose utilization is high because of DaemonSets:

<node name> is not suitable for removal - utilization too big (0.631092)

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 5
  • Comments: 17 (8 by maintainers)

Most upvoted comments

I’ve raised https://github.com/kubernetes/autoscaler/pull/1407 which will add a flag to ignore DaemonSets when performing the resource utilization calculations. Wasn’t sure how to test it, but happy for pointers on what tests to add.

The proposal looks very reasonable to me. And easy to implement. With that said I am not sure if we will have time to address that. PRs are very welcome.

We are currently seeing this same issue. We have a large number of nodes that run only DaemonSet pods; they persist in the cluster and are not terminated by CA. These DaemonSets provide metrics and system functionality, and are not user-scheduled pods.

The solution should be either a flag to ignore DaemonSet pods in the utilisation calculation or an annotation that can be placed on DaemonSet pods to ignore them.

So, would subtracting the utilization of DaemonSet-originating pods be a viable solution?
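A minimal sketch of what that subtraction could look like, assuming utilization is the larger of the CPU and memory request fractions of the node's allocatable resources. The `Pod` record, its field names, and `node_utilization` are hypothetical and only illustrate the idea; this is not the autoscaler's actual code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Pod:
    # Hypothetical, simplified pod record used only for this illustration.
    name: str
    cpu_request_millis: int
    memory_request_mib: int
    owner_kind: str = ""  # e.g. "DaemonSet" or "ReplicaSet"


def node_utilization(
    pods: List[Pod],
    allocatable_cpu_millis: int,
    allocatable_memory_mib: int,
    ignore_daemonsets: bool = False,
) -> float:
    """Return max(cpu fraction, memory fraction) of requested resources.

    When ignore_daemonsets is True, pods owned by a DaemonSet are left
    out of the sums (assumed behaviour for the proposal in this thread).
    """
    counted = [
        p for p in pods
        if not (ignore_daemonsets and p.owner_kind == "DaemonSet")
    ]
    cpu = sum(p.cpu_request_millis for p in counted) / allocatable_cpu_millis
    mem = sum(p.memory_request_mib for p in counted) / allocatable_memory_mib
    return max(cpu, mem)


# Made-up numbers: two DaemonSet pods plus one small user pod on a small node.
pods = [
    Pod("logging-agent", 300, 500, "DaemonSet"),
    Pod("mesh-proxy", 300, 200, "DaemonSet"),
    Pod("user-app", 100, 100, "ReplicaSet"),
]

with_ds = node_utilization(pods, 1000, 2048)
without_ds = node_utilization(pods, 1000, 2048, ignore_daemonsets=True)
```

With these made-up requests, the node sits above a 50% threshold when DaemonSet pods are counted and well below it when they are ignored, which is the situation described in this thread.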

Seeing a similar thing. This largely affects our development staging clusters where scale-to-zero is an appealing way to make sure we at least provision the various MIGs that back the different flavors of node groups we use, while ensuring we don’t continuously run 16 node clusters with one userland pod.

With smaller nodes, e.g. n1-standard-1, our logging and service mesh pods will bring utilization over 50%. These pods only exist to provide a common substrate, and the node would not otherwise be utilized or necessary if not for these DaemonSet pods.

I think in general, the heuristic mentioned by @WebSpider is a good one. Generally DaemonSet pods exist to provide this kind of baseline functionality for any node in the cluster and are typically not user-scheduled workloads.

That’s quite a backwards-incompatible change. Perhaps a new flag to ignore DaemonSet-induced node utilization is the way to go? I can’t think of a case right now that justifies the added complexity, but a more granular solution could be a pod annotation that explicitly tells the autoscaler to ignore utilization induced by a given pod.
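To illustrate the more granular annotation idea, here is a rough sketch. The annotation key `example.com/ignore-utilization` and the dictionary layout are invented for this example; no such annotation is confirmed to exist in the autoscaler.

```python
from typing import Dict, List

# Hypothetical annotation key, made up for this sketch.
IGNORE_ANNOTATION = "example.com/ignore-utilization"


def cpu_utilization(pods: List[Dict], allocatable_cpu_millis: int) -> float:
    """Fraction of allocatable CPU requested by pods that are not opted out.

    A pod is skipped when it carries the (hypothetical) annotation with
    the value "true"; every other pod counts towards utilization.
    """
    requested = 0
    for pod in pods:
        annotations = pod.get("annotations", {})
        if annotations.get(IGNORE_ANNOTATION) == "true":
            continue  # explicitly excluded from the utilization sum
        requested += pod["cpu_request_millis"]
    return requested / allocatable_cpu_millis


# Made-up pods: the annotated one is ignored, the other two are counted.
example_pods = [
    {"name": "metrics-agent", "cpu_request_millis": 400,
     "annotations": {IGNORE_ANNOTATION: "true"}},
    {"name": "node-helper", "cpu_request_millis": 150, "annotations": {}},
    {"name": "user-app", "cpu_request_millis": 100},
]

utilization = cpu_utilization(example_pods, 1000)
```

Unlike a cluster-wide flag, this leaves the default behaviour unchanged and lets operators opt individual pods out, at the cost of having to annotate each workload.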

Yes, I think that would solve our issue.