kubernetes: 1.10.x upgrade causes api-server to consume a lot of resources and eventually OOM

/kind bug
/sig api-machinery

What happened:

After upgrading from 1.9 (either 1.9.3 or 1.9.6) to 1.10 (either 1.10.0 or 1.10.1), the API server runs for a few hours, then starts throttling CPU at 100% and consuming more and more memory. The only error logs are:

E0413 12:45:48.549420       1 authentication.go:63] Unable to authenticate the request due to an error: [invalid bearer token, [invalid bearer token, Token has been invalidated]]
E0413 12:45:48.549680       1 errors.go:90] no context found for request

Other symptoms are:

  • slow api server responses
  • nodes becoming NotReady
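
To get a rough idea of how often the two errors quoted above occur, the API server log could be tallied per minute and per source location. The following is only a sketch: the regular expression is inferred from the two log lines in this report, and `tally_errors` and `SAMPLE` are hypothetical names introduced for illustration.

```python
import re
from collections import Counter

# Header layout inferred from the two log lines in this report:
# severity letter, MMDD, HH:MM:SS.micros, PID, then "file.go:line]".
LOG_LINE = re.compile(
    r"^(?P<sev>[IWEF])(?P<mmdd>\d{4})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\.\d+\s+"
    r"\d+\s+(?P<src>[^\s\]]+:\d+)\]\s?(?P<msg>.*)$"
)


def tally_errors(lines):
    """Count error-severity lines per (HH:MM, source location)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        # Skip lines that do not parse or are not severity "E".
        if match is None or match.group("sev") != "E":
            continue
        minute = match.group("time")[:5]  # keep only HH:MM
        counts[(minute, match.group("src"))] += 1
    return counts


# The two error lines quoted in this report, used as sample input.
SAMPLE = [
    "E0413 12:45:48.549420       1 authentication.go:63] Unable to authenticate "
    "the request due to an error: [invalid bearer token, [invalid bearer token, "
    "Token has been invalidated]]",
    "E0413 12:45:48.549680       1 errors.go:90] no context found for request",
]

if __name__ == "__main__":
    for (minute, src), n in sorted(tally_errors(SAMPLE).items()):
        print(minute, src, n)
```

A steadily rising count for one source location in the hours after the upgrade would suggest which error tracks the CPU and memory growth, though it would not by itself identify the cause.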

What you expected to happen:

Things to work.

How to reproduce it (as minimally and precisely as possible):

Upgrade hyperkube from 1.9 to 1.10.

Anything else we need to know?:

Environment:

etcd: v3.3.2

Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean", BuildDate:"2018-02-07T12:22:21Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.1+coreos.0", GitCommit:"baafb306bb191971a84cb1796420d093de7e6014", GitTreeState:"clean", BuildDate:"2018-04-12T21:14:39Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

AWS: 3 etcd instances, 3 masters in an ASG, 3 workers in an ASG

NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1688.5.3
VERSION_ID=1688.5.3
BUILD_ID=2018-04-03-0547
PRETTY_NAME="Container Linux by CoreOS 1688.5.3 (Rhyolite)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"
Linux ip-10-66-23-26 4.14.32-coreos #1 SMP Tue Apr 3 05:21:26 UTC 2018 x86_64 Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz GenuineIntel GNU/Linux

[Screenshot: 2018-04-13-134312_2450x269_scrot]

Our api-server pod definition: https://github.com/utilitywarehouse/tf_kube_ignition/blob/master/resources/kube-apiserver.yaml

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 4
  • Comments: 24 (20 by maintainers)

Most upvoted comments

#64153 is merged and will be in 1.10.4, which is planned for 6/6.