volcano: Fair sharing not working
What happened: My cluster has 11 CPUs in total. I’m trying to create 2 queues (excluding the default queue) with weight 5 each. Queue manifests:
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: test
spec:
  weight: 5
---
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: test1
spec:
  weight: 5
Queue list:
Name     Weight  State  Inqueue  Pending  Running  Unknown
default  1       Open   0        0        0        0
test     5       Open   0        0        0        0
test1    5       Open   0        0        0        0
I created 3 jobs in the test queue with the following CPU requests: job1 -> 5 CPU, job2 -> 5 CPU, job3 -> 1 CPU.
All 3 jobs are now running and using the full cluster.
I then created a new job in the test1 queue requesting 2 CPU. I expected one job to be evicted from the test queue so that the job in test1 could run, but the job in test1 stays in the Inqueue state.
Name     Weight  State  Inqueue  Pending  Running  Unknown
default  1       Open   0        0        0        0
test     5       Open   0        0        3        0
test1    5       Open   1        0        0        0
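To make the expectation concrete, here is a minimal sketch of weighted fair sharing applied to the numbers above. It is not the proportion plugin's actual code: the `queue` type and `fairShare` function are hypothetical, only CPU is considered, and the assumption is that each queue's share is proportional to its weight, capped at what the queue requests, with the leftover redistributed to queues that still want more.

```go
package main

import "fmt"

// queue is a hypothetical, simplified stand-in for a Volcano queue (CPU only).
type queue struct {
	name     string
	weight   float64
	request  float64 // total CPU requested by the queue's jobs
	deserved float64 // fair share computed by fairShare
}

// fairShare splits total CPU across queues in proportion to weight.
// A queue never receives more than it requests; whatever it does not
// need is redistributed among the queues that are still below their request.
func fairShare(total float64, queues []*queue) {
	remaining := total
	for remaining > 1e-9 {
		// Sum the weights of queues that still want more CPU.
		var weightSum float64
		for _, q := range queues {
			if q.deserved < q.request {
				weightSum += q.weight
			}
		}
		if weightSum == 0 {
			return // every request is satisfied
		}
		var granted float64
		for _, q := range queues {
			if q.deserved >= q.request {
				continue
			}
			share := remaining * q.weight / weightSum
			if q.deserved+share > q.request {
				share = q.request - q.deserved // cap at the request
			}
			q.deserved += share
			granted += share
		}
		if granted < 1e-9 {
			return // no progress possible, avoid looping forever
		}
		remaining -= granted
	}
}

func main() {
	// Numbers from the report: 11 CPU total, test requests 5+5+1, test1 requests 2.
	queues := []*queue{
		{name: "default", weight: 1, request: 0},
		{name: "test", weight: 5, request: 11},
		{name: "test1", weight: 5, request: 2},
	}
	fairShare(11, queues)
	for _, q := range queues {
		fmt.Printf("%s: deserved %.1f CPU\n", q.name, q.deserved)
	}
}
```

Under these assumptions test deserves 9 CPU and test1 deserves 2 CPU, so test, which is using all 11, is 2 CPU over its share. That matches the expectation that something in test should give way, provided the scheduler is actually allowed to reclaim it.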
Configuration,
actions: "enqueue, allocate, backfill"
tiers:
- plugins:
  - name: priority
  - name: gang
  - name: conformance
- plugins:
  - name: drf
  - name: predicates
  - name: proportion
  - name: nodeorder
  - name: binpack
What you expected to happen: I expected one job to be evicted from the test queue so that the job in the test1 queue could run. Instead, the job in test1 stays in the Inqueue state.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
- Volcano Version: v1.3.0
- Kubernetes version (use kubectl version):
- Cloud provider or hardware configuration:
- OS (e.g. from /etc/os-release):
- Kernel (e.g. uname -a):
- Install tools:
- Others:
About this issue
- State: closed
- Created 3 years ago
- Comments: 25 (20 by maintainers)
I will take a look at that. My intuition is that there are still some potential bugs in the proportion plugin…
Reclaim only works when several conditions are met. You can check them in AddReclaimableFn, for example:
preemptable := job.MinAvailable == 0 || job.MinAvailable <= job.ReadyTaskNum()-1
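A small sketch of how that condition behaves may help. The `jobInfo` type below is a hypothetical, trimmed-down stand-in for the scheduler's job type, and the job shapes are assumptions, since the job specs were not posted in this issue. Separately, note that the posted configuration lists only the enqueue, allocate and backfill actions; as far as I know, cross-queue reclaim runs only when the reclaim action is also listed, so that is worth checking first.

```go
package main

import "fmt"

// jobInfo is a hypothetical stand-in exposing only what the quoted check uses.
type jobInfo struct {
	MinAvailable int32 // gang scheduling minimum for the job
	readyTasks   int32 // number of tasks currently ready
}

// ReadyTaskNum mirrors the method name used in the quoted line.
func (j jobInfo) ReadyTaskNum() int32 { return j.readyTasks }

// preemptable reproduces the quoted condition: a task may be taken from a job
// only if the job still has at least MinAvailable ready tasks afterwards.
func preemptable(job jobInfo) bool {
	return job.MinAvailable == 0 || job.MinAvailable <= job.ReadyTaskNum()-1
}

func main() {
	// Assumed shapes, for illustration only.
	singlePod := jobInfo{MinAvailable: 1, readyTasks: 1} // 1 <= 0 is false
	noGang := jobInfo{MinAvailable: 0, readyTasks: 1}    // MinAvailable == 0
	spare := jobInfo{MinAvailable: 2, readyTasks: 3}     // 2 <= 2 is true

	fmt.Println(preemptable(singlePod)) // false
	fmt.Println(preemptable(noGang))    // true
	fmt.Println(preemptable(spare))     // true
}
```

If each of the three jobs in the test queue is a single pod with minAvailable set to 1 (an assumption), none of them passes this check, which would explain why nothing is evicted even though the test queue is over its share.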