kyverno: [Bug] Deploying kyverno-policies chart with custom values fails

Kyverno Version

1.10.1

Description

We are trying to apply Kyverno policies using the Helm chart, but the HelmRelease is failing with the following error:

message: "Helm install failed: 17 errors occurred:\n\t* Internal error occurred: failed calling webhook \"mutate-policy.kyverno.svc\": failed to call webhook: Post \"https://kyverno-svc.security.svc:443/policymutate?timeout=10s\": no endpoints available for service \"kyverno-svc\"\n\t* Internal error occurred

Creating the Kyverno policies directly through manifests works fine, however.
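For example, a minimal ClusterPolicy applied directly with kubectl goes through without errors. This is a simplified sketch of the disallow-privilege-escalation check, not the exact upstream policy:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privilege-escalation
spec:
  validationFailureAction: audit
  background: true
  rules:
    - name: privilege-escalation
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Privilege escalation is disallowed; allowPrivilegeEscalation must be false."
        # Simplified pattern: the upstream policy also anchors initContainers
        # and ephemeralContainers.
        pattern:
          spec:
            containers:
              - securityContext:
                  allowPrivilegeEscalation: "false"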

kyverno-policies chart version used: 3.0.1

values:
    podSecurityStandard: baseline
    podSecuritySeverity: medium
    includeRestrictedPolicies:
    - disallow-privilege-escalation
    - disallow-capabilities-strict
    - require-run-as-nonroot
    - require-run-as-non-root-user
    - restrict-seccomp-strict
    - restrict-volume-types
    failurePolicy: Ignore
    validationFailureAction: audit
    validationFailureActionByPolicy:
      disallow-host-namespaces: enforce
    background: true
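
For context, these values are delivered through a Flux HelmRelease along the following lines (a sketch; the HelmRepository source name and the flux-system namespace are assumptions, not taken from our setup):

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: kyverno-policies
  namespace: security   # Kyverno runs in the security namespace, per the error above
spec:
  interval: 10m
  chart:
    spec:
      chart: kyverno-policies
      version: 3.0.1
      sourceRef:
        kind: HelmRepository
        name: kyverno        # assumed source name
        namespace: flux-system
  values:
    podSecurityStandard: baseline
    podSecuritySeverity: medium
    failurePolicy: Ignore
    # ...remaining values as listed above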

Slack discussion

No response

Troubleshooting

  • I have read and followed the documentation AND the troubleshooting guide.
  • I have searched other issues in this repository and mine is not recorded.

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 32 (15 by maintainers)

Most upvoted comments

Hi,

The wget produces the following:

/ # wget --no-check-certificate --spider --timeout=1 https://kyverno-svc.kyverno.svc:443/health/liveness
Connecting to kyverno-svc.kyverno.svc:443 (10.245.78.167:443)
wget: TLS error from peer (alert code 40): handshake failure
wget: error getting response: Connection reset by peer

The same check with curl gives an HTTP 200:

~ $ curl https://kyverno-svc.kyverno.svc:443/health/liveness -kv
* processing: https://kyverno-svc.kyverno.svc:443/health/liveness
*   Trying 10.245.78.167:443...
* Connected to kyverno-svc.kyverno.svc (10.244.1.166) port 443
* ALPN: offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=kyverno-svc
*  start date: Aug 14 12:51:49 2023 GMT
*  expire date: Jan 11 13:51:49 2024 GMT
*  issuer: CN=*.kyverno.svc
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* using HTTP/2
* h2 [:method: GET]
* h2 [:scheme: https]
* h2 [:authority: kyverno-svc.kyverno.svc]
* h2 [:path: /health/liveness]
* h2 [user-agent: curl/8.2.1]
* h2 [accept: */*]
* Using Stream ID: 1
> GET /health/liveness HTTP/2
> Host: kyverno-svc.kyverno.svc
> User-Agent: curl/8.2.1
> Accept: */*
>
< HTTP/2 200
< content-length: 0
< date: Tue, 15 Aug 2023 14:19:42 GMT
<
* Connection #0 to host kyverno-svc.kyverno.svc left intact

The issue in my case was not the metrics server. I don't know why the metrics-server-related error appears in the kyverno admission controller pod logs. This is something that started appearing after chart 3.0.0 (Kyverno version 1.10) and needs to be addressed.

After analyzing, I found the root cause: when we created many Kyverno policies at once, the admission controller hit its CPU limit and the Kyverno container was restarted. We were using Flux to create around 30 Kyverno policies, so all of them were applied at once, and the burst of webhook requests throttled the CPU of the Kyverno pod.

Increasing the CPU limit resolved our issue.
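
For anyone hitting the same throttling, the limits can be raised through the kyverno chart values, roughly like this (a sketch assuming the 3.x chart layout under admissionController.container.resources; the actual numbers depend on how many policies arrive at once):

admissionController:
  container:
    resources:
      requests:
        cpu: 200m
        memory: 128Mi
      limits:
        cpu: "1"       # raised from the default; tune for your policy count
        memory: 512Mi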