volcano: PodGroup doesn't trigger scale-up in Kubernetes when using Cluster Autoscaler

What happened: A PodGroup that does not fit within the cluster's current resource capacity does not trigger a scale-up of the node pool, even though Cluster Autoscaler is enabled in Kubernetes.

What you expected to happen: I expect Cluster Autoscaler to detect the increased workload and trigger a scale-up.

How to reproduce it (as minimally and precisely as possible): Apply a PodGroup whose resource requirements exceed the current capacity of the node pool.
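For anyone trying to reproduce this, here is a minimal sketch of such a PodGroup, generated as JSON so it can be applied directly. The name `oversized-group`, the `default` queue, and the resource figures are hypothetical placeholders, not values from my cluster; pick numbers larger than what your node pool currently offers.

```python
import json


def build_podgroup(name, min_member, cpu, memory, queue="default"):
    """Return a Volcano PodGroup manifest as a plain dict."""
    return {
        "apiVersion": "scheduling.volcano.sh/v1beta1",
        "kind": "PodGroup",
        "metadata": {"name": name},
        "spec": {
            # Number of pods that must be schedulable together.
            "minMember": min_member,
            "queue": queue,
            # Hypothetical totals chosen to exceed the node pool's capacity.
            "minResources": {"cpu": cpu, "memory": memory},
        },
    }


# Placeholder values: adjust so they do not fit the current node pool.
manifest = build_podgroup("oversized-group", 4, "64", "256Gi")
print(json.dumps(manifest, indent=2))
```

Assuming the script is saved as `podgroup.py` (a hypothetical filename), the output can be piped into `kubectl apply -f -`, since kubectl accepts JSON manifests as well as YAML.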

Anything else we need to know?: Microsoft has decided to use Volcano in its Azure ML platform for scheduling training jobs in Kubernetes, so this will be an issue for a lot of people in the future.

Environment:

Kubernetes version (use kubectl version): v1.22.15

About this issue

State: open
Created 2 years ago
Reactions: 6
Comments: 24 (10 by maintainers)

Most upvoted comments

Can PR #2602 solve the current problem?

wangyang0616 on Dec 16, 2022