autoscaler: Cluster autoscaler version 1.16.0 doesn't notice pending pods

We’ve been using cluster autoscaler version 1.15.0 patched with #2008 for some time in AWS, to good effect. Today we attempted to put the new version 1.16.0 into service using all the same configuration, and found that it no longer seems to notice pending pods.

The cluster autoscaler starts fine, and the logs don’t indicate anything failing. It runs its periodic “main loop” and the “Regenerating instance to ASG map for ASGs” step regularly, again without any obvious problems. However, when we create pods that should prompt the cluster autoscaler to increase a suitable ASG’s size, its logs show no evidence that it notices these pods. In prior versions, we would see messages to the following effect:

  • Pod ns/name is unschedulable
  • Pod name can't be scheduled on node, predicate failed: PodFitsResources predicate mismatch, reason: Insufficient nvidia.com/gpu
  • Event(v1.ObjectReference{Kind:"Pod", Namespace:"ns", Name:"name", UID:"ae32b9cc-081e-4323-ba33-7810457a0ddf", APIVersion:"v1", ResourceVersion:"58735432", FieldPath:""}): type: 'Normal' reason: 'TriggeredScaleUp' pod triggered scale-up: [{asg 0->13 (max: 125)}]

Instead, the new cluster autoscaler exhibits no reaction to these pending pods.
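
For reference, the pods in question look roughly like the following; the names and image here are illustrative, but the nvidia.com/gpu request is what leaves them pending until a GPU-backed ASG scales up:

  apiVersion: v1
  kind: Pod
  metadata:
    name: name            # illustrative; matches the "ns/name" form in the log lines above
    namespace: ns
  spec:
    containers:
      - name: worker                              # illustrative container name
        image: example.com/gpu-workload:latest    # illustrative image
        resources:
          limits:
            nvidia.com/gpu: 1   # no existing node has a free GPU, so the pod stays Pending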

Here are the flag values reported at start time:

flags.go:52] FLAG: --add-dir-header="false"
flags.go:52] FLAG: --address=":8085"
flags.go:52] FLAG: --alsologtostderr="false"
flags.go:52] FLAG: --balance-similar-node-groups="false"
flags.go:52] FLAG: --cloud-config=""
flags.go:52] FLAG: --cloud-provider="aws"
flags.go:52] FLAG: --cloud-provider-gce-lb-src-cidrs="130.211.0.0/22,209.85.152.0/22,209.85.204.0/22,35.191.0.0/16"
flags.go:52] FLAG: --cluster-name=""
flags.go:52] FLAG: --cores-total="0:320000"
flags.go:52] FLAG: --estimator="binpacking"
flags.go:52] FLAG: --expander="least-waste"
flags.go:52] FLAG: --expendable-pods-priority-cutoff="-10"
flags.go:52] FLAG: --filter-out-schedulable-pods-uses-packing="true"
flags.go:52] FLAG: --gpu-total="[]"
flags.go:52] FLAG: --ignore-daemonsets-utilization="false"
flags.go:52] FLAG: --ignore-mirror-pods-utilization="false"
flags.go:52] FLAG: --ignore-taint="[]"
flags.go:52] FLAG: --kubeconfig=""
flags.go:52] FLAG: --kubernetes=""
flags.go:52] FLAG: --leader-elect="true"
flags.go:52] FLAG: --leader-elect-lease-duration="15s"
flags.go:52] FLAG: --leader-elect-renew-deadline="10s"
flags.go:52] FLAG: --leader-elect-resource-lock="endpoints"
flags.go:52] FLAG: --leader-elect-resource-name=""
flags.go:52] FLAG: --leader-elect-resource-namespace=""
flags.go:52] FLAG: --leader-elect-retry-period="2s"
flags.go:52] FLAG: --log-backtrace-at=":0"
flags.go:52] FLAG: --log-dir=""
flags.go:52] FLAG: --log-file=""
flags.go:52] FLAG: --log-file-max-size="1800"
flags.go:52] FLAG: --logtostderr="true"
flags.go:52] FLAG: --max-autoprovisioned-node-group-count="15"
flags.go:52] FLAG: --max-bulk-soft-taint-count="10"
flags.go:52] FLAG: --max-bulk-soft-taint-time="3s"
flags.go:52] FLAG: --max-empty-bulk-delete="10"
flags.go:52] FLAG: --max-failing-time="15m0s"
flags.go:52] FLAG: --max-graceful-termination-sec="600"
flags.go:52] FLAG: --max-inactivity="10m0s"
flags.go:52] FLAG: --max-node-provision-time="3m0s"
flags.go:52] FLAG: --max-nodes-total="0"
flags.go:52] FLAG: --max-total-unready-percentage="45"
flags.go:52] FLAG: --memory-total="0:6400000"
flags.go:52] FLAG: --min-replica-count="0"
flags.go:52] FLAG: --namespace="our-system"
flags.go:52] FLAG: --new-pod-scale-up-delay="0s"
flags.go:52] FLAG: --node-autoprovisioning-enabled="false"
flags.go:52] FLAG: --node-deletion-delay-timeout="2m0s"
flags.go:52] FLAG: --node-group-auto-discovery="[asg:tag=kubernetes.io/cluster-autoscaler/enabled,kubernetes.io/cluster/redacted]"
flags.go:52] FLAG: --nodes="[]"
flags.go:52] FLAG: --ok-total-unready-count="3"
flags.go:52] FLAG: --regional="false"
flags.go:52] FLAG: --scale-down-candidates-pool-min-count="50"
flags.go:52] FLAG: --scale-down-candidates-pool-ratio="0.1"
flags.go:52] FLAG: --scale-down-delay-after-add="3m0s"
flags.go:52] FLAG: --scale-down-delay-after-delete="0s"
flags.go:52] FLAG: --scale-down-delay-after-failure="3m0s"
flags.go:52] FLAG: --scale-down-enabled="true"
flags.go:52] FLAG: --scale-down-gpu-utilization-threshold="0.5"
flags.go:52] FLAG: --scale-down-non-empty-candidates-count="50"
flags.go:52] FLAG: --scale-down-unneeded-time="13m0s"
flags.go:52] FLAG: --scale-down-unready-time="7m0s"
flags.go:52] FLAG: --scale-down-utilization-threshold="0.5"
flags.go:52] FLAG: --scale-up-from-zero="true"
flags.go:52] FLAG: --scan-interval="10s"
flags.go:52] FLAG: --skip-headers="false"
flags.go:52] FLAG: --skip-log-headers="false"
flags.go:52] FLAG: --skip-nodes-with-local-storage="false"
flags.go:52] FLAG: --skip-nodes-with-system-pods="true"
flags.go:52] FLAG: --stderrthreshold="0"
flags.go:52] FLAG: --test.bench=""
flags.go:52] FLAG: --test.benchmem="false"
flags.go:52] FLAG: --test.benchtime="1s"
flags.go:52] FLAG: --test.blockprofile=""
flags.go:52] FLAG: --test.blockprofilerate="1"
flags.go:52] FLAG: --test.count="1"
flags.go:52] FLAG: --test.coverprofile=""
flags.go:52] FLAG: --test.cpu=""
flags.go:52] FLAG: --test.cpuprofile=""
flags.go:52] FLAG: --test.failfast="false"
flags.go:52] FLAG: --test.list=""
flags.go:52] FLAG: --test.memprofile=""
flags.go:52] FLAG: --test.memprofilerate="0"
flags.go:52] FLAG: --test.mutexprofile=""
flags.go:52] FLAG: --test.mutexprofilefraction="1"
flags.go:52] FLAG: --test.outputdir=""
flags.go:52] FLAG: --test.parallel="8"
flags.go:52] FLAG: --test.run=""
flags.go:52] FLAG: --test.short="false"
flags.go:52] FLAG: --test.testlogfile=""
flags.go:52] FLAG: --test.timeout="0s"
flags.go:52] FLAG: --test.trace=""
flags.go:52] FLAG: --test.v="false"
flags.go:52] FLAG: --unremovable-node-recheck-timeout="5m0s"
flags.go:52] FLAG: --v="4"
flags.go:52] FLAG: --vmodule=""
flags.go:52] FLAG: --write-status-configmap="true"
main.go:363] Cluster Autoscaler 1.16.0
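
For completeness, here is roughly how those flags are passed in our Deployment; the container name, image reference, and selection of args shown are illustrative, a sketch rather than the exact manifest, but the values mirror the dump above:

  containers:
    - name: cluster-autoscaler                      # illustrative name
      image: k8s.gcr.io/cluster-autoscaler:v1.16.0  # illustrative image reference
      command:
        - ./cluster-autoscaler
        - --v=4
        - --cloud-provider=aws
        - --namespace=our-system
        - --expander=least-waste
        - --scan-interval=10s
        - --scale-down-unneeded-time=13m
        - --scale-down-unready-time=7m
        - --scale-down-delay-after-add=3m
        - --node-group-auto-discovery=asg:tag=kubernetes.io/cluster-autoscaler/enabled,kubernetes.io/cluster/redacted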

I didn’t find any other issues reporting this problem. Given that this release has been out for seven days now, I’d expect someone else to have run into it by now.

Reverting to our previous patched container image worked fine, but we’d like to move forward. Is there some new configuration that we need to adjust in order to restore the previous behavior of the cluster autoscaler?

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 26 (14 by maintainers)

Most upvoted comments

Hello @seh @losipiuk, I am again facing an issue with the liveness probe while using the image k8s.gcr.io/autoscaling/cluster-autoscaler:v1.16.7. Thanks for checking it again.

Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  22m                   default-scheduler  Successfully assigned aws/cluster-autoscaler-aws-cluster-autoscaler-8486df49c9-bs9bm to ip-10-0-175-211.eu-west-1.compute.internal
  Warning  Unhealthy  21m                   kubelet            Liveness probe failed: Get http://10.0.154.48:8085/health-check: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
  Normal   Started    20m (x4 over 22m)     kubelet            Started container aws-cluster-autoscaler
  Normal   Pulled     19m (x5 over 22m)     kubelet            Container image "k8s.gcr.io/autoscaling/cluster-autoscaler:v1.16.7" already present on machine
  Normal   Created    19m (x5 over 22m)     kubelet            Created container aws-cluster-autoscaler
  Warning  BackOff    2m17s (x92 over 21m)  kubelet            Back-off restarting failed container
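
For context, the probe failing here is an HTTP GET against the autoscaler's health endpoint on the metrics address (port 8085, per the --address flag above). A minimal sketch of that probe, assuming the usual Helm chart layout, with the timing values being illustrative:

  livenessProbe:
    httpGet:
      path: /health-check    # path and port taken from the event message above
      port: 8085
    initialDelaySeconds: 30  # illustrative; allows leader election and the first main loop to finish
    periodSeconds: 10
    timeoutSeconds: 5        # illustrative; the failure above is a client timeout (kubelet default is 1s)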