kubernetes: Kubelet CPU perf regression 1.20->1.21 (10-15%)

/cc @ehashman /cc @bobbypage

What happened:

The 1.21 kubelet shows a CPU performance degradation in our tests. The same regression is also reported at https://bugzilla.redhat.com/show_bug.cgi?id=1953102.

The root cause seems to be runc. It should be fixed by https://github.com/kubernetes/kubernetes/pull/101888.

This issue is to discuss backporting the fix to the 1.21 release.

What you expected to happen:

Kubelet doesn’t show significant change in CPU between 1.20 and 1.21.

How to reproduce it (as minimally and precisely as possible):

No special setup is required. Just observe kubelet CPU usage on 1.20 and on 1.21 and compare.
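As a rough illustration of the comparison, here is a minimal sketch, assuming the kubelet's cumulative CPU time (in seconds) can be sampled twice on each version, for example from a process-level CPU counter. The function names and the sample values are hypothetical and are not measurements from this issue.

```python
def cpu_cores_used(cpu_seconds_start, cpu_seconds_end, wall_seconds):
    """Average number of CPU cores consumed over a sampling window.

    Inputs are two readings of a cumulative CPU-seconds counter and the
    wall-clock time (in seconds) that elapsed between them.
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return (cpu_seconds_end - cpu_seconds_start) / wall_seconds


def relative_change_percent(baseline, candidate):
    """Percentage change of candidate relative to baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (candidate - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    usage_1_20 = cpu_cores_used(1000.0, 1060.0, 600.0)
    usage_1_21 = cpu_cores_used(2000.0, 2069.0, 600.0)
    change = relative_change_percent(usage_1_20, usage_1_21)
    print(f"kubelet CPU change 1.20 -> 1.21: {change:.1f}%")
```

Both versions should be sampled under the same load and over the same window length; otherwise the percentage is not comparable.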

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): 1.21

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 27 (26 by maintainers)

Most upvoted comments

Only a few data points are available at this point, but there seems to be a degradation between 1.20 and 1.21.

@odinuge My performance analysis (of runc’s libcontainer/cgroups/fs GetStats) is at https://github.com/opencontainers/runc/pull/2921. I haven’t done any testing with Kubernetes and/or cAdvisor; I am only relying on the info gathered in https://bugzilla.redhat.com/show_bug.cgi?id=1953102.

How can the performance regression in runc between rc92 (in Kubernetes 1.20) and rc93 (in Kubernetes 1.21) be resolved in a way that doesn’t require bumping to rc94?

This would require a fork of runc with the backport of https://github.com/opencontainers/runc/pull/2921.

I think it makes more sense to fix the rc94 test failures (I am looking into them).