kubernetes: [Failing Test] [sig-network] Services should only allow access from service loadbalancer source ranges [Slow] is failing in ci-kubernetes-e2e-gce-scale-correctness

Which jobs are failing:

  • ci-kubernetes-e2e-gce-scale-correctness

Which test(s) are failing: [sig-network] Services should only allow access from service loadbalancer source ranges [Slow]

Since when has it been failing: It has been failing since 8/28. Here is the k/k diff for additional context: https://github.com/kubernetes/kubernetes/compare/169876553...f0be44792

Testgrid link: https://testgrid.k8s.io/sig-release-master-informing#gce-master-scale-correctness

Reason for failure: There appear to have been two instances of the same test (at least in the latest run). One of them failed with the following error message:

/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:152
Aug 30 13:08:39.388: Couldn't delete ns: "services-1078": namespace services-1078 was not deleted with limit: timed out waiting for the condition, namespaced content other than pods remain (&errors.errorString{s:"namespace services-1078 was not deleted with limit: timed out waiting for the condition, namespaced content other than pods remain"})
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:336

The other one with:

/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:152
Aug 30 13:45:28.619: Couldn't delete ns: "services-1101": namespace services-1101 was not deleted with limit: timed out waiting for the condition, namespaced content other than pods remain (&errors.errorString{s:"namespace services-1101 was not deleted with limit: timed out waiting for the condition, namespaced content other than pods remain"})
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:336

/milestone v1.16
/priority critical-urgent
/kind failing-test
/cc @kubernetes/sig-scalability-test-failures
/cc @kubernetes/sig-node-test-failures
/cc @wojtek-t @shyamjvs

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 19 (18 by maintainers)

Most upvoted comments

Yep, this is related to https://github.com/kubernetes/kubernetes/pull/81691. With the service finalizer enabled by default, deleting the namespace (including its services) now takes longer: it blocks until all the underlying load balancer resources are cleaned up. Resource cleanup latency is expected to be higher on a large cluster (such as in this test).

I will soon send a PR to override DefaultNamespaceDeletionTimeout with something more reasonable in this case.

@oxddr @lachie83 Acked, I’m looking into the failure and will provide an update soon.