autoscaler: AWS: Failed to look for node info on ASG Launch Template
I have several ASGs with a current size of 0. All of the ASGs were created with a Launch Template. I'm using gcr.io/google-containers/cluster-autoscaler:v1.3.7 with the command:
- ./cluster-autoscaler
- --cloud-provider=aws
- --namespace=kube-system
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,kubernetes.io/cluster/eks-tf-mi-playground-1-cluster-01
- --expander=least-waste
- --logtostderr=true
- --stderrthreshold=info
- --v=4
Somehow the autoscaler is treating them as if they were created from a Launch Configuration, but they were not. How can I tell the autoscaler to look for a Launch Template on all ASGs instead of a Launch Configuration?
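To confirm which launch mechanism an ASG actually uses, you can inspect it with the AWS CLI. This is a minimal sketch, assuming the AWS CLI is configured with credentials that can read the Auto Scaling groups; the ASG name is one from the logs above, so substitute your own:

```shell
# Exactly one of LaunchTemplate or LaunchConfiguration should be
# non-null for each group; if LaunchConfiguration is null, the ASG
# is backed by a Launch Template.
aws autoscaling describe-auto-scaling-groups \
  --auto-scaling-group-names eks-worker-tf-eks-01-m5-2xl-2 \
  --query 'AutoScalingGroups[].{Name:AutoScalingGroupName,LaunchTemplate:LaunchTemplate,LaunchConfiguration:LaunchConfigurationName}'
```

If this shows a Launch Template but the autoscaler still issues an empty `launchConfigurationNames` request (as in the `auto_scaling.go:48` log line below), the problem is in how the autoscaler builds the template node, not in the ASG configuration.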
These are some of the logs:
I0306 13:16:06.427091 1 auto_scaling.go:48] Failed LaunchConfiguration info request for : ValidationError: 1 validation error detected: Value '[]' at 'launchConfigurationNames' failed to satisfy constraint: Member must satisfy constraint: [Member must have length less than or equal to 1600, Member must have length greater than or equal to 1, Member must satisfy regular expression pattern: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\r\n\t]*]
status code: 400, request id: ffed3ce5-4011-11e9-a015-cb400eaf9394
E0306 13:16:06.427123 1 utils.go:280] Unable to build proper template node for eks-worker-tf-eks-01-m5-2xl-2: ValidationError: 1 validation error detected: Value '[]' at 'launchConfigurationNames' failed to satisfy constraint: Member must satisfy constraint: [Member must have length less than or equal to 1600, Member must have length greater than or equal to 1, Member must satisfy regular expression pattern: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\r\n\t]*]
status code: 400, request id: ffed3ce5-4011-11e9-a015-cb400eaf9394
I0306 13:16:06.513070 1 auto_scaling_groups.go:316] Regenerating instance to ASG map for ASGs: [eks-worker-tf-eks-01-m5-2xl-0 eks-worker-tf-eks-01-m5-2xl-1 eks-worker-tf-eks-01-m5-2xl-2 eks-worker-tf-eks-01-m5-4xl-0 eks-worker-tf-eks-01-m5-4xl-1 eks-worker-tf-eks-01-m5-4xl-2 eks-worker-tf-eks-01-m5-l-0 eks-worker-tf-eks-01-m5-l-1 eks-worker-tf-eks-01-m5-l-2 eks-worker-tf-eks-01-p2-xl-0 eks-worker-tf-eks-01-p2-xl-1 eks-worker-tf-eks-01-p2-xl-2]
I0306 13:16:06.626631 1 aws_manager.go:148] Refreshed ASG list, next refresh after 2019-03-06 13:16:16.626624752 +0000 UTC m=+715.550545349
I0306 13:16:06.626757 1 utils.go:541] No pod using affinity / antiaffinity found in cluster, disabling affinity predicate for this loop
I0306 13:16:06.626771 1 static_autoscaler.go:260] Filtering out schedulables
I0306 13:16:06.627040 1 static_autoscaler.go:270] No schedulable pods
I0306 13:16:06.627055 1 scale_up.go:249] Pod ml-dev/fuad-neural-training-job-pznxl is unschedulable
I0306 13:16:06.627059 1 scale_up.go:249] Pod ml-dev/ali-changeme-0 is unschedulable
E0306 13:16:06.627091 1 static_autoscaler.go:293] Failed to scale up: Could not compute total resources: No node info for: eks-worker-tf-eks-01-m5-2xl-0
I0306 13:16:08.094772 1 leaderelection.go:209] successfully renewed lease kube-system/cluster-autoscaler
I0306 13:16:10.104128 1 leaderelection.go:209] successfully renewed lease kube-system/cluster-autoscaler
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 7
- Comments: 26 (14 by maintainers)
I'm running into this issue in one cluster with an instance group that has 0 nodes; as a workaround I set its minimum to 1 instance. What's the progress on a more permanent solution here?
(k8s 1.12 with cluster-autoscaler 1.12.6)
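The workaround above can be applied from the AWS CLI. This is a hedged sketch, assuming credentials with `autoscaling:UpdateAutoScalingGroup` permission; the group name is illustrative, taken from the ASG list in the logs:

```shell
# Keep at least one instance in the group so the autoscaler can
# derive node info from a running node instead of failing on the
# (empty) launch configuration lookup. Replace the name with yours.
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name eks-worker-tf-eks-01-p2-xl-0 \
  --min-size 1
```

Note that this trades the cost of one always-on instance per group for working scale-up, so it is a stopgap rather than a fix for scale-from-zero.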