kubernetes: kube-scheduler's performance looks bad

What happened: I have a 5000-node cluster with 55,000+ pods already running; 25,000 of them are managed by 1,000 deployments. I then ran tests to measure scheduler performance and found that throughput stayed at roughly 15 pods/sec, while the scheduler consumed at most 2 of the machine's 32 cores. Throughput does not increase when I run rolling updates against more deployments; instead, large numbers of pods stay pending.
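To measure throughput, one option is to count bindings per second by watching for pods whose spec.nodeName transitions from empty to set. Below is a minimal, hypothetical sketch using client-go; the kubeconfig path and the informer-based counting approach are my assumptions, not from the original report:

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the default kubeconfig location (assumption).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Count a pod as newly scheduled when spec.nodeName flips from
	// empty to non-empty, which happens exactly once per binding.
	var scheduled int64
	factory := informers.NewSharedInformerFactory(client, 0)
	factory.Core().V1().Pods().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(oldObj, newObj interface{}) {
			oldPod, newPod := oldObj.(*corev1.Pod), newObj.(*corev1.Pod)
			if oldPod.Spec.NodeName == "" && newPod.Spec.NodeName != "" {
				atomic.AddInt64(&scheduled, 1)
			}
		},
	})

	stop := make(chan struct{})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)

	// Report the observed scheduling rate once per second.
	for range time.Tick(time.Second) {
		fmt.Printf("scheduled %d pods/sec\n", atomic.SwapInt64(&scheduled, 0))
	}
}
```

During the rolling updates, the scheduler also printed many traces like the following: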

I0509 09:33:38.051328       1 trace.go:76] Trace[1260452793]: "Scheduling test-namespace/test-deploy-7d6f9d6f97-294mh" (started: 2019-05-09 09:33:37.921340778 +0000 UTC m=+100851.375269575) (total time: 129.959068ms):
Trace[1260452793]: [2.552365ms] [2.552365ms] Computing predicates
Trace[1260452793]: [46.367012ms] [43.814647ms] Prioritizing
Trace[1260452793]: [129.932795ms] [83.565783ms] Selecting host
Trace[1260452793]: [129.959068ms] [26.273µs] END
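In these traces, each step line shows [cumulative time] [time spent in that step], so in the sample above computing predicates is cheap (~2.6ms) while prioritizing (~43.8ms) and selecting the host (~83.6ms) account for most of the ~130ms total. Below is a rough sketch of how such a trace is produced, assuming the k8s.io/utils/trace package that the scheduler uses; the phase stubs and sleep durations are illustrative only:

```go
package main

import (
	"time"

	utiltrace "k8s.io/utils/trace"
)

// Stubs standing in for the scheduler's filter, score, and select phases.
func computePredicates() { time.Sleep(3 * time.Millisecond) }
func prioritize()        { time.Sleep(44 * time.Millisecond) }
func selectHost()        { time.Sleep(84 * time.Millisecond) }

// schedulePod mirrors the shape of the scheduler's per-pod scheduling call:
// each trace.Step records the latency of the phase that just finished, and
// LogIfLong dumps the whole trace when the total exceeds the threshold,
// producing log lines like the ones above.
func schedulePod(namespace, name string) {
	trace := utiltrace.New("Scheduling " + namespace + "/" + name)
	defer trace.LogIfLong(100 * time.Millisecond)

	computePredicates()
	trace.Step("Computing predicates")

	prioritize()
	trace.Step("Prioritizing")

	selectHost()
	trace.Step("Selecting host")
}

func main() {
	schedulePod("test-namespace", "test-deploy-7d6f9d6f97-294mh")
}
```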


What you expected to happen:

  1. The scheduler should consume more than 2 of the 32 cores.
  2. Throughput (QPS) should be much higher, maybe around 100 pods/sec (I don't know the exact value to expect).

How to reproduce it (as minimally and precisely as possible): Run rolling updates against many deployments concurrently; see the sketch below.
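A hypothetical way to drive many rolling updates at once, assuming the v1.13-era client-go Patch signature (newer client-go adds context.Context and metav1.PatchOptions arguments) and hypothetical deployment names test-deploy-0 through test-deploy-999 in test-namespace:

```go
package main

import (
	"fmt"
	"sync"
	"time"

	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Bumping a pod-template annotation creates a new ReplicaSet and
	// therefore triggers a rolling update without changing the image.
	patch := fmt.Sprintf(
		`{"spec":{"template":{"metadata":{"annotations":{"restarted-at":"%d"}}}}}`,
		time.Now().UnixNano())

	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			name := fmt.Sprintf("test-deploy-%d", i) // hypothetical deployment names
			_, err := client.AppsV1().Deployments("test-namespace").Patch(
				name, types.StrategicMergePatchType, []byte(patch))
			if err != nil {
				fmt.Printf("patch %s failed: %v\n", name, err)
			}
		}(i)
	}
	wg.Wait()
}
```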

Anything else we need to know?: All pods have 3 labels.

Environment:

  • Kubernetes version (use kubectl version): v1.13.4
  • Cloud provider or hardware configuration: 32 core & 128 gb memory & 5000 node cluster
  • OS (e.g: cat /etc/os-release): v3.3.11
  • Kernel (e.g. uname -a): 4.18.8-1.el7.elrepo.x86_64
  • Install tools: kubeadm & ansible
  • Network plugin and version (if this is a network-related bug):
  • Others:

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 34 (22 by maintainers)

Most upvoted comments