kubernetes: Umbrella issue: slow/timing-out unit tests
Running make test KUBE_RACE=-race locally, I found several packages with very slow unit tests, and some timed out entirely.
The default per-package timeout for make test is 120 seconds (which is already much longer than I would expect).
The following packages had tests that ran longer than 30 seconds on my workstation. Running tests on CI machines regularly takes 2-3x as long. Anything longer than 60 seconds should be prioritized.
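As a rough way to reproduce a list like the one below, the following Go sketch reads test output on stdin and flags packages against the 30 second and 60 second thresholds mentioned above. It assumes the output contains plain `go test` summary lines of the form `ok <package> <seconds>s` or `FAIL <package> <seconds>s`; the type and function names are hypothetical and not part of the Kubernetes tooling.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
	"time"
)

// Thresholds taken from the issue text: list anything slower than 30s,
// and treat anything slower than 60s as a priority.
const (
	reportThreshold     = 30 * time.Second
	prioritizeThreshold = 60 * time.Second
)

// slowPackage is a hypothetical record for one slow package.
type slowPackage struct {
	Name       string
	Elapsed    time.Duration
	Prioritize bool
}

// parseLine inspects one line of test output. It returns false for lines
// that are not package summaries (including "(cached)" results) and for
// packages that finished within reportThreshold.
func parseLine(line string) (slowPackage, bool) {
	fields := strings.Fields(line)
	if len(fields) < 3 || (fields[0] != "ok" && fields[0] != "FAIL") {
		return slowPackage{}, false
	}
	elapsed, err := time.ParseDuration(fields[2])
	if err != nil {
		return slowPackage{}, false
	}
	if elapsed <= reportThreshold {
		return slowPackage{}, false
	}
	return slowPackage{
		Name:       fields[1],
		Elapsed:    elapsed,
		Prioritize: elapsed > prioritizeThreshold,
	}, true
}

func main() {
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		p, ok := parseLine(scanner.Text())
		if !ok {
			continue
		}
		label := "slow"
		if p.Prioritize {
			label = "prioritize"
		}
		fmt.Printf("%-10s %9.3fs  %s\n", label, p.Elapsed.Seconds(), p.Name)
	}
}
```

Piping the captured test output into this program would print one line per package over 30 seconds. Timings vary by machine, so the result is only a starting point for triage.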
cluster-lifecycle
- k8s.io/kubernetes/cmd/kubeadm/app/cmd 86.056s (TestRunRenewCommands looks to be the slowest test) - https://github.com/kubernetes/kubernetes/pull/98664
- k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init 56.161s - https://github.com/kubernetes/kubernetes/pull/98664
- k8s.io/kubernetes/cmd/kubeadm/app/phases/certs 120.433s (timed out, TestCreatePKIAssetsWithSparseCerts, TestUsingExternalCA, TestCreateCertificateFilesMethods are the slowest) - https://github.com/kubernetes/kubernetes/pull/98664
- k8s.io/kubernetes/cmd/kubeadm/app/util/pkiutil 38.180s - https://github.com/kubernetes/kubernetes/pull/98682
- k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade 89.649s - https://github.com/kubernetes/kubernetes/pull/98664
apps
- k8s.io/kubernetes/pkg/controller/cronjob 61.696s - https://github.com/kubernetes/kubernetes/pull/98691
networking
- k8s.io/kubernetes/pkg/controller/endpointslice 35.619s - #98793
- k8s.io/kubernetes/pkg/controller/nodeipam 112.441s - https://github.com/kubernetes/kubernetes/pull/98756
node
- k8s.io/kubernetes/pkg/controller/nodelifecycle/scheduler 52.778s - https://github.com/kubernetes/kubernetes/issues/98495, https://github.com/kubernetes/kubernetes/pull/98595
storage
- k8s.io/kubernetes/pkg/controller/volume/persistentvolume 41.215s - https://github.com/kubernetes/kubernetes/pull/98792
- k8s.io/kubernetes/pkg/controller/volume/scheduling 109.515s - https://github.com/kubernetes/kubernetes/pull/98912
- k8s.io/kubernetes/pkg/kubelet/volumemanager/reconciler 120.434s (timed out) - https://github.com/kubernetes/kubernetes/issues/91834, https://github.com/kubernetes/kubernetes/pull/97955 or https://github.com/kubernetes/kubernetes/pull/98915, #99174
- k8s.io/kubernetes/pkg/volume/csi 62.449s - https://github.com/kubernetes/kubernetes/pull/98762
- k8s.io/kubernetes/pkg/volume/util/operationexecutor 56.046s - https://github.com/kubernetes/kubernetes/pull/98760
api-machinery
- k8s.io/kubernetes/vendor/k8s.io/apiextensions-apiserver/pkg/apiserver 109.590s - https://github.com/kubernetes/kubernetes/pull/98694
- k8s.io/kubernetes/vendor/k8s.io/apiextensions-apiserver/pkg/controller/openapi/v2 56.254s - https://github.com/kubernetes/kubernetes/pull/98694
- k8s.io/kubernetes/vendor/k8s.io/client-go/util/connrotation 74.794s - https://github.com/kubernetes/kubernetes/pull/98496#issue-comment-box
- k8s.io/kubernetes/pkg/controlplane 240s (timed out)
/sig cluster-lifecycle apps network node storage api-machinery
/triage accepted
/help
About this issue
- State: closed
- Created 3 years ago
- Reactions: 8
- Comments: 25 (25 by maintainers)
wow… I… didn’t remember that was excluding itself that way. Opened https://github.com/kubernetes/kubernetes/pull/99782 to fix or remove unit tests that don’t work in race mode.
k8s.io/kubernetes/pkg/controller/endpointslice 35s -> 13s #98793
k8s.io/kubernetes/pkg/controller/volume/persistentvolume 41s -> 19s #98792
k8s.io/kubernetes/pkg/kubelet/volumemanager/reconciler 120s (timeout) -> 40s #98915
k8s.io/kubernetes/pkg/volume/csi 62s -> 30s #98762
These test cases use global variables, so changing them to run in parallel may cause panics. I shortened the waiting time instead. https://github.com/kubernetes/kubernetes/pull/98762#issuecomment-773754653
k8s.io/kubernetes/pkg/volume/util/operationexecutor 56s -> 15s #98760
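To illustrate the csi note above about global variables, here is a minimal Go sketch of one common way to shorten waits in tests: the polling interval lives in a package-level variable that a test overrides and then restores. This is not the code from #98762; `pollInterval`, `waitFor`, and `withShortInterval` are hypothetical names. It also shows why such tests should not run in parallel: they all mutate the same global.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// pollInterval is a hypothetical package-level setting. Because it is
// global, tests that overwrite it must not run concurrently.
var pollInterval = 1 * time.Second

// waitFor polls cond every pollInterval until it returns true or the
// timeout elapses.
func waitFor(cond func() bool, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return nil
		}
		if time.Now().After(deadline) {
			return errors.New("timed out waiting for condition")
		}
		time.Sleep(pollInterval)
	}
}

// withShortInterval overrides pollInterval and returns a function that
// restores the previous value.
func withShortInterval(d time.Duration) (restore func()) {
	old := pollInterval
	pollInterval = d
	return func() { pollInterval = old }
}

func main() {
	restore := withShortInterval(5 * time.Millisecond)
	defer restore()

	calls := 0
	start := time.Now()
	// The condition becomes true on the third poll, so only two short
	// sleeps happen instead of two one-second sleeps.
	err := waitFor(func() bool { calls++; return calls >= 3 }, time.Second)
	fmt.Println("err:", err, "polls:", calls, "elapsed:", time.Since(start))
}
```

In a real test the override would typically be paired with `defer restore()` so the default interval is back in place for the next test.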
I’m working on the ‘kubeadm cert’ package. @neolit123
k8s.io/kubernetes/cmd/kubeadm/app/phases/certs 120.433s (timed out) -> #98517
multi-minute-long unit tests are typically a symptom of one of the following:
we should at least look at these packages to see if those are the cause
if we’re down to 1-2 problematic packages, I think I’ll close this in favor of specific targeted issues. If you have pointers to >100 second unit test runs of packages, please open an issue for the package and tag the appropriate sig
Unit tests timed out in my PR's test run:
I also found another 3 slow test cases.
k8s.io/kubernetes/pkg/controller/volume/scheduling 109s -> 24s #98912