kubernetes: Deployment Integration Test Goroutine Limit Exceeded

What happened: Once the number of deployment integration tests grows past a certain threshold, running them locally with bazel fails with the error "race: limit on 8192 simultaneously alive goroutines is exceeded, dying". The error does not happen when the number of tests is small.

What you expected to happen: The integration tests should not keep so many goroutines (more than 8192) alive at once.

How to reproduce it (as minimally and precisely as possible): (1) Duplicate each deployment test under the test/integration/deployment directory twice, appending a digit identifier to each copy (2) bazel build //test/integration/deployment/... (3) bazel test //test/integration/deployment/...

Anything else we need to know?: The error also happens for the replicaset integration tests. It may be related to how the integration test environment is set up.

/kind bug
/sig apps

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 19 (13 by maintainers)

Most upvoted comments

The goroutines sampled below seem to be part of the client sitting between the REST Store and etcd. Perhaps it would help to incorporate calls to DestroyFunc somewhere in the integration framework?

https://github.com/kubernetes/kubernetes/blob/77c8b6eadfed09c471e5d3ce20bc3480189cde2d/staging/src/k8s.io/apiserver/pkg/registry/generic/registry/store.go#L173-L174
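
For illustration, a minimal sketch of that idea (the helper and where it would be called from are hypothetical; only the Store.DestroyFunc field comes from the linked code):

    package teardown

    import (
        genericregistry "k8s.io/apiserver/pkg/registry/generic/registry"
    )

    // destroyStore is a hypothetical teardown hook: invoking the Store's
    // DestroyFunc releases the underlying storage (the etcd client),
    // stopping its watch and lease goroutines instead of leaking them
    // into the next test.
    func destroyStore(store *genericregistry.Store) {
        if store.DestroyFunc != nil {
            store.DestroyFunc()
        }
    }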

@kubernetes/sig-api-machinery-bugs Is it expected that an integration test would exceed 8192 goroutines (mostly started in apiserver code) if it starts a number of apiservers? That seems excessive to me, but if it’s normal we should probably limit the concurrency of integration tests. If it’s not normal, it seems like we are leaking goroutines.

Some examples of what those 8192 goroutines are doing:

k8s.io/client-go/tools/cache.(*Reflector).watchHandler(0xc420f625a0, 0xa8a2700, 0xc4210f5800, 0xc423939ba0, 0xc4210f55c0, 0xc420f485a0, 0x0, 0x0)
        vendor/k8s.io/client-go/tools/cache/reflector.go:366 +0x16f2
k8s.io/client-go/tools/cache.(*Reflector).ListAndWatch(0xc420f625a0, 0xc420f485a0, 0x0, 0x0)
        vendor/k8s.io/client-go/tools/cache/reflector.go:332 +0x1560
k8s.io/apiserver/pkg/storage.(*Cacher).startCaching(0xc4209121c0, 0xc420f485a0)
        vendor/k8s.io/apiserver/pkg/storage/cacher.go:276 +0x1a4
k8s.io/apiserver/pkg/storage.NewCacherFromConfig.func1.1()
        vendor/k8s.io/apiserver/pkg/storage/cacher.go:245 +0x80
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc42003e7a8)
        vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x70
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc423939fa8, 0x3b9aca00, 0x0, 0xc42003e701, 0xc420f485a0)
        vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134 +0xce
k8s.io/apimachinery/pkg/util/wait.Until(0xc42003e7a8, 0x3b9aca00, 0xc420f485a0)
        vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x5b
k8s.io/apiserver/pkg/storage.NewCacherFromConfig.func1(0xc4209121c0, 0xc420f485a0)
        vendor/k8s.io/apiserver/pkg/storage/cacher.go:248 +0xe3
created by k8s.io/apiserver/pkg/storage.NewCacherFromConfig
        vendor/k8s.io/apiserver/pkg/storage/cacher.go:249 +0xfa7

k8s.io/apiserver/pkg/storage.(*Cacher).dispatchEvents(0xc4209121c0)
        vendor/k8s.io/apiserver/pkg/storage/cacher.go:595 +0x24a
created by k8s.io/apiserver/pkg/storage.NewCacherFromConfig
        vendor/k8s.io/apiserver/pkg/storage/cacher.go:237 +0xf54

github.com/coreos/etcd/clientv3.(*lessor).deadlineLoop(0xc4206948c0)
        vendor/github.com/coreos/etcd/clientv3/lease.go:434 +0x2fd
created by github.com/coreos/etcd/clientv3.NewLease
        vendor/github.com/coreos/etcd/clientv3/lease.go:156 +0x4da
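
All of these follow the same shape: a goroutine created by NewCacherFromConfig (or by the etcd lease client) that loops until its stop channel closes, so any apiserver or cacher that is started but never shut down keeps its goroutines alive for the rest of the test binary. A minimal sketch of that pattern, with the loop body as a placeholder:

    package main

    import (
        "time"

        "k8s.io/apimachinery/pkg/util/wait"
    )

    func main() {
        stopCh := make(chan struct{})

        // wait.Until re-runs f every period until stopCh is closed; this
        // is the loop that keeps startCaching alive in the traces above.
        go wait.Until(func() {
            // placeholder for startCaching-style list-and-watch work
        }, time.Second, stopCh)

        time.Sleep(3 * time.Second) // stand-in for the test body

        // Without this close, the goroutine survives into the next test
        // and counts against the race detector's 8192-goroutine limit.
        close(stopCh)
    }

A quick way to distinguish a leak from legitimate concurrency would be to log runtime.NumGoroutine() before and after each apiserver's setup and teardown; a count that keeps climbing across tests points at a leak.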