kyverno: [Bug] High CPU usage by reports controller

Kyverno Version

1.10.2

Description

Hi! On some of my clusters, the reports controller persistently uses over 2 vCPUs. The background scan interval is set to 12h. I compared two clusters of the same size and with the same Kyverno config: one shows a persistent usage of 2.3 vCPU, the other 0.2 vCPU. The volume of logs generated by the two clusters differs significantly:

  • Low CPU cluster generates ~992 logs/min
  • High CPU cluster generates ~1976 logs/min

Comparing the logs, it seems to me that the reports controller on the high-CPU cluster emits many more entries of the form: background-scan-controller/worker "msg"="reconciling ..." alternating with: background-scan-controller/worker "msg"="done"
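One way to quantify this difference is to count the worker messages in each log file. The following is a minimal sketch, not part of the original report: it assumes each log line contains the logger name and a "msg"="..." field exactly as quoted above, and the function name count_worker_messages is hypothetical.

```python
import re
from collections import Counter

# Matches the "msg"="..." field as quoted in the report.
# The rest of the log line layout is unknown here and is not relied upon.
MSG_RE = re.compile(r'"msg"="([^"]*)"')
WORKER = "background-scan-controller/worker"


def count_worker_messages(lines):
    """Count background scan worker messages, grouped by the first word of msg."""
    counts = Counter()
    for line in lines:
        if WORKER not in line:
            continue  # ignore other controllers
        match = MSG_RE.search(line)
        if match:
            # Keep only the first word so all "reconciling ..." entries group together.
            counts[match.group(1).split(" ", 1)[0]] += 1
    return counts
```

Running it over high-cpu.log and low-cpu.log (for example with `count_worker_messages(open("high-cpu.log"))`) and comparing the "reconciling" and "done" counts for the same time window would show whether the extra log volume really comes from the background scan worker.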

This persists over an extended period of time; it is definitely not a matter of spikes. I wonder whether the reports controller on the high-CPU cluster got stuck in background scanning mode and is running the scan continuously.

Note: Deleting the pod and letting the Kyverno deployment recreate it brings CPU usage back down to a normal level (after the initial CPU spike). On some clusters the issue reoccurred 2 days after the restart, while on others the restart helped and the issue hasn’t reoccurred yet.

Logs from the high-CPU and low-CPU clusters are attached: high-cpu.log, low-cpu.log

Slack discussion

No response

Troubleshooting

  • I have read and followed the documentation AND the troubleshooting guide.
  • I have searched other issues in this repository and mine is not recorded.

About this issue

  • State: closed
  • Created 10 months ago
  • Reactions: 3
  • Comments: 21 (9 by maintainers)

Most upvoted comments

@realshuting - please wait, I meant the endless loop of validating webhook configurations! 😃 Allow me some time to test 1.11.0, as I haven’t upgraded yet!

Can confirm same issue:

Screenshot from 2023-10-02 10-02-07

Running strace on a single one of the threads shows what looks like a repeating pattern: a read returning EAGAIN, then a tgkill, then it repeats with a write and another read returning EAGAIN (socket empty?).
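To check whether such a pattern really dominates, the captured strace output can be tallied per syscall and error code. This is a rough sketch under assumptions: it expects the usual strace line shape (name(args) = return, optionally followed by an errno name), and the function name tally_syscalls is hypothetical.

```python
import re
from collections import Counter

# Usual strace line shape: name(args) = ret [ERRNO (description)]
# An optional leading PID, as printed with `strace -f`, is tolerated.
LINE_RE = re.compile(
    r'^(?:\[pid\s+\d+\]\s+|\d+\s+)?'   # optional PID prefix
    r'(\w+)\(.*\)\s+=\s+'              # syscall name and arguments
    r'(-?\d+|\?)'                      # return value
    r'(?:\s+(E[A-Z0-9]+))?'            # optional errno name, e.g. EAGAIN
)


def tally_syscalls(lines):
    """Count (syscall, errno) pairs; errno is None when the call succeeded."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            name, _ret, errno = match.groups()
            counts[(name, errno)] += 1
    return counts
```

If the `("read", "EAGAIN")` and `("tgkill", None)` entries far outnumber everything else in `tally_syscalls(open("strace.out")).most_common(5)`, that would support the impression of a tight loop; the file name strace.out is only a placeholder.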
