kubernetes: Integration test suite failing

I am seeing test failures when running the integration tests with test-dockerized.sh and the added flag “-p 1” to disable parallelism. The following failures are observed on the 1.18.4 branch:

=== Failed
=== FAIL: test/integration/apiserver/apply  (0.00s)
I0629 02:44:47.759999   23295 etcd.go:81] etcd already running at http://127.0.0.1:2379
FAIL	k8s.io/kubernetes/test/integration/apiserver/apply	106.723s

=== FAIL: test/integration/auth  (0.00s)
I0629 02:47:31.644814   23520 etcd.go:81] etcd already running at http://127.0.0.1:2379
FAIL	k8s.io/kubernetes/test/integration/auth	73.154s

=== FAIL: test/integration/daemonset  (0.00s)
I0629 02:53:17.668164   23978 etcd.go:81] etcd already running at http://127.0.0.1:2379
FAIL	k8s.io/kubernetes/test/integration/daemonset	300.637s

=== FAIL: test/integration/master TestReconcilerMasterLeaseMultiCombined (20.49s)

=== FAIL: test/integration/replicaset  (0.00s)
I0629 03:13:51.149436   25639 etcd.go:81] etcd already running at http://127.0.0.1:2379
FAIL	k8s.io/kubernetes/test/integration/replicaset	77.660s

=== FAIL: test/integration/scheduler  (0.00s)
I0629 03:16:34.260033   25865 etcd.go:81] etcd already running at http://127.0.0.1:2379
FAIL	k8s.io/kubernetes/test/integration/scheduler	84.373s

=== FAIL: test/integration/volume  (0.00s)
I0629 03:20:53.646581   26343 etcd.go:81] etcd already running at http://127.0.0.1:2379
FAIL	k8s.io/kubernetes/test/integration/volume	122.375s

However, if I run each of these modules individually, only one test case in each module fails, with the following error:

I0701 03:46:24.334858    2750 endpoint.go:68] ccResolverWrapper: sending new addresses to cc: [{http://127.0.0.1:2379  <nil> 0 <nil>}]
I0701 03:46:43.342529    2750 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"
E0701 03:46:43.342551    2750 master_utils.go:197] error in bringing up the master: error building core storage: context deadline exceeded
W0701 03:46:43.342618    2750 clientconn.go:1208] grpc: addrConn.createTransport failed to connect to {http://127.0.0.1:2379  <nil> 0 <nil>}: didn't receive server preface in time. Reconnecting...
F0701 03:46:43.342680    2750 master_utils.go:199] error in bringing up the master: error building core storage: context deadline exceeded

If that test case is then run on its own, it passes.

How can I get the test suite to pass? My specs are listed below.

Specs:
Arch: x86_64
OS: Ubuntu 18.04
CPUs: 4
go version: go version go1.14.4 linux/amd64

Attaching log file for reference: INTEL1.18-p12.log

About this issue

State: closed
Created 4 years ago
Comments: 17 (17 by maintainers)

Most upvoted comments

@RobertKielty this issue is not reflecting the failure we are seeing on https://testgrid.k8s.io/sig-release-master-blocking#integration-master. Please see https://github.com/kubernetes/test-infra/pull/18196#issuecomment-658000292. It may be good to open another issue to track that. I plan on submitting a fix today.

/close

hasheddan on Jul 15, 2020

@roycaihw I just tried running the following and the test cases passed successfully.

sed -i "s/--timeout=120/--timeout=300/" $GOPATH/src/k8s.io/kubernetes/hack/make-rules/test.sh
make test-cmd
make test-integration KUBE_TEST_ARGS="-p 1"

I guess this issue can be closed.
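
The timeout change above can also be written as a small helper that fails loudly if the substitution did not take effect (for example, because the default in test.sh is no longer 120). This is only a sketch: `bump_timeout` is a hypothetical name, not something provided by the Kubernetes hack scripts, and it assumes GNU sed (`sed -i` without a suffix argument), as on Ubuntu.

```shell
# bump_timeout FILE OLD NEW  (hypothetical helper, not part of the Kubernetes repo)
# Replaces "--timeout=OLD" with "--timeout=NEW" in FILE, then checks that the
# new value is actually present. Returns non-zero if sed fails or if the file
# does not contain "--timeout=NEW" afterwards (e.g. the pattern never matched).
bump_timeout() {
  local file="$1" old="$2" new="$3"
  sed -i "s/--timeout=${old}/--timeout=${new}/" "$file" &&
    grep -q -- "--timeout=${new}" "$file"
}

# Usage, mirroring the commands above:
#   bump_timeout "$GOPATH/src/k8s.io/kubernetes/hack/make-rules/test.sh" 120 300 \
#     && make test-integration KUBE_TEST_ARGS="-p 1"
```

Chaining the make invocation behind the helper means the tests are not started with the old timeout if the edit silently did nothing.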

rajaskakodkar on Jul 14, 2020

Thank you @roycaihw for triaging this.

RobertKielty on Jul 14, 2020

@hasheddan @RobertKielty It seems that the failure of https://testgrid.k8s.io/sig-release-master-blocking#integration-master was caused by https://github.com/kubernetes/test-infra/pull/18196#issuecomment-657975786. It looks like that PR added some tests to the prowjob, which aren’t supposed to be run in parallel, and therefore hit the “etcd already running” error.

I think the sig-release-master-blocking#integration-master failure is unrelated to this issue, since that failure was caused by a prowjob config change introduced two days ago, while this issue is about running integration tests manually.
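
For manual runs, one way to see whether something is already listening on the etcd client address from the log line (127.0.0.1:2379) before starting the tests is a plain TCP probe. The following is a sketch under stated assumptions: `port_open` is a hypothetical helper, it relies on bash's `/dev/tcp` pseudo-device (so it needs bash, not plain sh), and it only tells you that some process accepts connections on the port, not that the process is etcd.

```shell
#!/usr/bin/env bash

# port_open HOST PORT  (hypothetical helper, bash only)
# Returns 0 if a TCP connection to HOST:PORT can be opened, non-zero otherwise.
# The connection is opened in a subshell, so the file descriptor is closed
# again as soon as the check finishes.
port_open() {
  local host="$1" port="$2"
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Usage:
#   if port_open 127.0.0.1 2379; then
#     echo "something is already listening on 127.0.0.1:2379"
#   fi
```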

roycaihw on Jul 14, 2020

sig api-machinery

based on:

etcd already running at http://127.0.0.1:2379

but feel free to adjust.

neolit123 on Jul 2, 2020