kyverno: [Bug] Kyverno reports-controller can overload large clusters

Kyverno Version

1.6.0

Description

Kyverno’s admission report controller can flood the Kubernetes control plane with expensive API requests. The offending code appears to be in https://github.com/kyverno/kyverno/blob/f74c80f57bc31c19bc5871f15a62f518e234957e/pkg/controllers/report/admission/controller.go

for n := range ns {
	if n == "" {
		cadmrs, err := c.client.KyvernoV1alpha2().ClusterAdmissionReports().List(ctx, metav1.ListOptions{LabelSelector: selector.String()})
		// ...
	} else {
		admrs, err := c.client.KyvernoV1alpha2().AdmissionReports(n).List(ctx, metav1.ListOptions{LabelSelector: selector.String()})
		// ...
	}
}

This generates heavy load on etcd:

  • the resourceVersion parameter [1] is not set, which forces the apiserver to issue a full (quorum) read against etcd on every call
  • the number of queries is multiplied by the number of namespaces, which translates directly into etcd load (the apiserver cannot use its cache to serve the results cheaply).

It can be addressed by changing the above List() calls to pass metav1.ListOptions{LabelSelector: selector.String(), ResourceVersion: "0"}. With resourceVersion "0" the apiserver is free to pick the revision it serves, so it can answer from its watch cache (the data may be slightly stale) instead of reading from etcd, which makes these queries substantially cheaper.
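As a rough illustration of what changes on the wire, the sketch below builds the list request path with and without the resourceVersion query parameter. It uses only the Go standard library rather than client-go, so `listURL` is a hypothetical helper, the label selector is a made-up example, and the `kyverno.io/v1alpha2` paths are an assumption inferred from the `KyvernoV1alpha2()` client in the excerpt above.

```go
package main

import (
	"fmt"
	"net/url"
)

// listURL is a hypothetical helper that builds the REST path and query string
// for listing admission reports. An empty ns selects the cluster-scoped
// resource, mirroring the n == "" branch in the excerpt. The group/version
// path (kyverno.io/v1alpha2) is an assumption, not taken from the issue.
func listURL(ns, labelSelector, resourceVersion string) string {
	path := "/apis/kyverno.io/v1alpha2/clusteradmissionreports"
	if ns != "" {
		path = "/apis/kyverno.io/v1alpha2/namespaces/" + url.PathEscape(ns) + "/admissionreports"
	}
	q := url.Values{}
	if labelSelector != "" {
		q.Set("labelSelector", labelSelector)
	}
	// Leaving resourceVersion unset asks for the most recent data, which the
	// apiserver has to read from etcd. Setting it to "0" allows the apiserver
	// to answer from its cache at whatever revision it has available.
	if resourceVersion != "" {
		q.Set("resourceVersion", resourceVersion)
	}
	if enc := q.Encode(); enc != "" {
		return path + "?" + enc
	}
	return path
}

func main() {
	selector := "app=demo" // hypothetical label selector

	// Request as issued today: no resourceVersion.
	fmt.Println(listURL("default", selector, ""))

	// Request with the proposed change: resourceVersion=0.
	fmt.Println(listURL("default", selector, "0"))
}
```

In client-go terms this corresponds to the ResourceVersion: "0" field on metav1.ListOptions proposed above; the trade-off is that results served from the cache may lag slightly behind etcd.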

[1] https://kubernetes.io/docs/reference/using-api/api-concepts/#the-resourceversion-parameter

Slack discussion

No response

Troubleshooting

  • I have read and followed the documentation AND the troubleshooting guide.
  • I have searched other issues in this repository and mine is not recorded.

About this issue

  • State: closed
  • Created a year ago
  • Comments: 21 (12 by maintainers)

Most upvoted comments

@PAM-PE the fix is not included in beta.1. I’m going to cut beta.2; I’m waiting for two PRs to merge, and then I will push the tag.

@eddycharly when upgrading the Helm chart from 3.0.5 to 3.1.0-beta.1, the admission and reports controller pods are failing with CrashLoopBackOff. Are you already aware of this, or should I open an issue? (Log attached: output.LOG)

@wojtek-t @PAM-PE did you see https://github.com/kyverno/kyverno/blob/main/docs/dev/troubleshooting/reports.md ?

The reports aggregation is changing in 1.11: we are now creating one report per resource, which will cut the storage requirements a lot.

v1.11.0-beta.1 should be released today.