volcano: PodGroup doesn't trigger scale-up in Kubernetes when using Cluster Autoscaler
What happened: A PodGroup that does not fit within the cluster's current resource capacity does not trigger a scale-up of the node pool, even though Cluster Autoscaler is enabled in Kubernetes.
What you expected to happen: I expect the Kubernetes Cluster Autoscaler to detect the increased workload and then trigger a scale-up.
How to reproduce it (as minimally and precisely as possible): Apply a PodGroup that does not fit within the current resource capacity of the node pool.
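A minimal sketch of what such a PodGroup might look like, for anyone trying to reproduce this. The name, queue, and resource figures are illustrative assumptions and are not taken from the original report; choose `minResources` values larger than what your node pool currently has free.

```yaml
# Hypothetical reproduction manifest: name and sizes are assumptions,
# not values from the original report.
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
metadata:
  name: oversized-group      # hypothetical name
  namespace: default
spec:
  queue: default             # assumes the default Volcano queue exists
  minMember: 4               # gang size: all 4 members must be schedulable together
  minResources:              # set these above the node pool's current free capacity
    cpu: "32"
    memory: 64Gi
```

After applying it with `kubectl apply -f`, `kubectl get podgroup oversized-group -o yaml` should show the group's phase, which can be compared against the Cluster Autoscaler's behavior.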
Anything else we need to know?: Microsoft has decided to use Volcano in its Azure ML platform for scheduling training jobs in Kubernetes, so this will be an issue for a lot of people in the future.
Environment:
- Kubernetes version (use `kubectl version`): v1.22.15
About this issue
- State: open
- Created 2 years ago
- Reactions: 6
- Comments: 24 (10 by maintainers)
Can PR #2602 solve the current problem?