kubernetes: [flaky] SchedulingThroughput - SchedulingThroughput error: scheduler throughput: actual throughput 81.200000 lower than threshold 90.000000
Which jobs are flaking:
- ci-kubernetes-e2e-gci-gce-scalability
- ci-kubernetes-e2e-gci-gce-scalability-networkpolicies
- pull-kubernetes-e2e-gce-100-performance
Which test(s) are flaking:
testing/density/config.yaml
SchedulingThroughput error: scheduler throughput: actual throughput 81.800000 lower than threshold 90.000000
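For context, the 90 pods/s threshold in that error comes from the test's clusterloader2 config. Below is a minimal sketch of how the SchedulingThroughput measurement is typically wired up; the step names and the threshold parameter are assumptions here, and the authoritative definition lives in testing/density/config.yaml in kubernetes/perf-tests:
```yaml
# Hedged sketch only: layout follows the usual clusterloader2 measurement pattern;
# the real values are in testing/density/config.yaml (kubernetes/perf-tests).
steps:
- name: Starting measurements          # assumed step name
  measurements:
  - Identifier: SchedulingThroughput
    Method: SchedulingThroughput
    Params:
      action: start                    # begin sampling scheduler throughput
# ... pod-creation phases run here ...
- name: Collecting measurements        # assumed step name
  measurements:
  - Identifier: SchedulingThroughput
    Method: SchedulingThroughput
    Params:
      action: gather                   # compute throughput and compare to the threshold
      threshold: 90                    # assumed parameter; matches the 90 pods/s in the error above
```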
Testgrid link:
- https://testgrid.k8s.io/google-gce#gce-cos-master-scalability-100&width=5
- https://testgrid.k8s.io/presubmits-kubernetes-blocking#pull-kubernetes-e2e-gce-100-performance&width=5
Anything else we need to know:
Looking just at periodic jobs (not presubmit), the number of failures has been steadily rising since May:
- 3: https://storage.googleapis.com/k8s-gubernator/triage/index.html?date=2020-05-30&text=scheduler throughput
- 6: https://storage.googleapis.com/k8s-gubernator/triage/index.html?date=2020-06-06&text=scheduler throughput
- 11: https://storage.googleapis.com/k8s-gubernator/triage/index.html?date=2020-06-13&text=scheduler throughput
- 16: https://storage.googleapis.com/k8s-gubernator/triage/index.html?date=2020-06-20&text=scheduler throughput
- 19: https://storage.googleapis.com/k8s-gubernator/triage/index.html?date=2020-06-27&text=scheduler throughput
- 26: https://storage.googleapis.com/k8s-gubernator/triage/index.html?date=2020-07-04&text=scheduler throughput
- 34: https://storage.googleapis.com/k8s-gubernator/triage/index.html?date=2020-07-11&text=scheduler throughput
- 65: https://storage.googleapis.com/k8s-gubernator/triage/index.html?date=2020-07-18&text=scheduler throughput
/sig scalability
/sig scheduling
About this issue
- State: closed
- Created 4 years ago
- Comments: 35 (35 by maintainers)
With https://github.com/kubernetes/test-infra/pull/18464 and https://github.com/kubernetes/test-infra/pull/18463, this should now be fixed.
Speaking of that, I just noticed today that the job doesn’t request CPU for the test pod, so it can easily be starved. We noticed the test pod was getting assigned the default request of 250m in https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/93307/pull-kubernetes-e2e-gce-100-performance/1285712793332355072/ (see podinfo.json in the artifacts).
Shouldn’t the perf test be requesting a specific amount of CPU for its test pod (and maybe even limiting to that amount to make results over time comparable)?
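For illustration, a minimal sketch of what an explicit CPU request (and matching limit) could look like in the job's pod spec; the container name, image, and sizes below are hypothetical, not the job's actual configuration:
```yaml
spec:
  containers:
  - name: test                                        # hypothetical container name
    image: gcr.io/k8s-testimages/kubekins-e2e:latest  # hypothetical image reference
    resources:
      requests:
        cpu: "4"        # explicit request instead of the 250m default the pod gets today
        memory: 6Gi
      limits:
        cpu: "4"        # matching limit keeps the CPU available to the test constant across runs
        memory: 6Gi
```
Setting requests equal to limits also puts the pod in the Guaranteed QoS class, which makes it less likely to be starved or evicted under node pressure.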
Not that I’m aware of. If I’m reading http://perf-dash.k8s.io/#/?jobname=gce-100Nodes-master&metriccategoryname=APIServer&metricname=DensityResponsiveness_PrometheusSimple&Resource=pods&Scope=namespace&Subresource=binding&Verb=POST correctly, the latency of POST calls to pods/binding looks pretty stable over time.