kyverno: [Bug] Huge amount of "resource webhook configuration was deleted, recreating..." log messages

Kyverno Version

1.6.x

Kubernetes Version

1.21.x

Kubernetes Platform

VMware Tanzu (specify in description)

Kyverno Rule Type

Validate

Description

Hi,

We deployed Kyverno on all of our lab clusters via Helm in an HA configuration; please find the values file included below. After deploying Kyverno in this setup we noticed a huge increase in our log volume.

The following message is logged several hundred times at irregular intervals:

I0220 12:07:28.544914 1 configmanager.go:262] WebhookConfigManager/deleteWebhook "msg"="resource webhook configuration was deleted, recreating..."
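
For scale, this is a minimal sketch of how the noise can be counted (the label selector comes from the podAntiAffinity block in the values below, the namespace from the namespace setting; the --since window is arbitrary):

# Count occurrences of the message across all Kyverno pods over the last hour
kubectl -n caas-kyverno logs -l app.kubernetes.io/name=kyverno --tail=-1 --since=1h \
  | grep -c "resource webhook configuration was deleted, recreating"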

Used values.yaml:

nameOverride:
fullnameOverride:
namespace: caas-kyverno

# -- Additional labels
customLabels: {}

rbac:
  create: true
  serviceAccount:
    create: true
    name:
    annotations: {}
    #   example.com/annotation: value

image:
  repository: some-repo.local/kyverno/kyverno
  # Defaults to appVersion in Chart.yaml if omitted
  tag:  # replaced in e2e tests
  pullPolicy: IfNotPresent
  pullSecrets: []
  # - secretName
initImage:
  repository: some-repo.local/kyverno/kyvernopre
  # If initImage.tag is missing, defaults to image.tag
  tag:  # replaced in e2e tests
  # If initImage.pullPolicy is missing, defaults to image.pullPolicy
  pullPolicy:
  # No pull secrets just for initImage; just add to image.pullSecrets
testImage:
  # testImage.repository defaults to "busybox" if omitted
  repository: some-repo.local/google_containers/busybox:1.24
  # testImage.tag defaults to "latest" if omitted
  tag:
  # testImage.pullPolicy defaults to image.pullPolicy if omitted
  pullPolicy:

replicaCount: 3

podLabels: {}
#   example.com/label: foo

podAnnotations: {}
#   example.com/annotation: foo

podSecurityContext: {}

# Optional priority class to be used for kyverno pods
priorityClassName: ""

antiAffinity:
  # using this option will schedule pods on different nodes if possible
  enable: true

podAntiAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app.kubernetes.io/name
            operator: In
            values:
            - kyverno
        topologyKey: kubernetes.io/hostname

podAffinity: {}

nodeAffinity: {}

podDisruptionBudget:
  minAvailable: 1
  # maxUnavailable: 1

  # minAvailable and maxUnavailable can either be set to an integer (e.g. 1)
  # or a percentage value (e.g. 25%)

nodeSelector: {}
tolerations: []

# change hostNetwork to true when you want the Kyverno pod to share its host's network namespace
# useful for situations like when you end up dealing with a custom CNI over Amazon EKS
# update the 'dnsPolicy' accordingly as well to suit the host network mode
hostNetwork: false

# dnsPolicy determines the manner in which DNS resolution happens in the cluster
# in case of hostNetwork: true, usually, the dnsPolicy is suitable to be "ClusterFirstWithHostNet"
# for further reference: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
dnsPolicy: "ClusterFirst"

# env variables for initContainers
envVarsInit: {}

# env variables for containers
envVars: {}

extraArgs: []
# - --webhookTimeout=4

resources:
  limits:
    memory: 384Mi
  requests:
    cpu: 100m
    memory: 128Mi

initResources:
  limits:
    cpu: 100m
    memory: 256Mi
  requests:
    cpu: 10m
    memory: 64Mi

## Liveness Probe. The block is directly forwarded into the deployment, so you can use whatever livenessProbe configuration you want.
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
##
livenessProbe:
  httpGet:
    path: /health/liveness
    port: 9443
    scheme: HTTPS
  initialDelaySeconds: 15
  periodSeconds: 30
  timeoutSeconds: 5
  failureThreshold: 2
  successThreshold: 1

## Readiness Probe. The block is directly forwarded into the deployment, so you can use whatever readinessProbe configuration you want.
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
##
readinessProbe:
  httpGet:
    path: /health/readiness
    port: 9443
    scheme: HTTPS
  initialDelaySeconds: 5
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 6
  successThreshold: 1

# TODO(mbarrien): Should we just list all resources for the
# generatecontroller in here rather than having defaults hard-coded?
generatecontrollerExtraResources:
# - ResourceA
# - ResourceB

config:
  # resource types to be skipped by kyverno policy engine
  # Make sure to surround each entry in quotes so that it doesn't get parsed
  # as a nested YAML list. These are joined together without spaces in the configmap
  resourceFilters:
  - "[Event,*,*]"
  - "[*,kube-system,*]"
  - "[*,kube-public,*]"
  - "[*,kube-node-lease,*]"
  - "[Node,*,*]"
  - "[APIService,*,*]"
  - "[TokenReview,*,*]"
  - "[SubjectAccessReview,*,*]"
  - "[SelfSubjectAccessReview,*,*]"
  - "[*,kyverno,*]"
  - "[Binding,*,*]"
  - "[ReplicaSet,*,*]"
  - "[ReportChangeRequest,*,*]"
  - "[ClusterReportChangeRequest,*,*]"
  # Or give the name of an existing config map (ignores default/provided resourceFilters)
  existingConfig: ''
  excludeGroupRole:
#  - ""
  excludeUsername:
#  - ""
  # Webhookconfigurations, this block defines the namespaceSelector in the webhookconfigurations.
  # Note that it takes a list of namespaceSelector in the JSON format, and only the first element
  # will be forwarded to the webhookconfigurations.
  webhooks:
  # webhooks: [{"namespaceSelector":{"matchExpressions":[{"key":"environment","operator":"In","values":["prod"]}]}}]
  generateSuccessEvents: 'false'
  # existingConfig: kyverno
  metricsConfig:
    namespaces: {
      "include": [],
      "exclude": []
    }
    # 'namespaces.include': list of namespaces to capture metrics for. Default: metrics being captured for all namespaces except excludeNamespaces.
    # 'namespaces.exclude': list of namespaces to NOT capture metrics for. Default: []

    # metricsRefreshInterval: 24h
    # rate at which metrics should reset so as to clean up the memory footprint of kyverno metrics, if you might be expecting high memory footprint of Kyverno's metrics. Default: 0, no refresh of metrics

  # Or provide an existing metrics config-map by uncommenting the below line
  # existingMetricsConfig: sample-metrics-configmap. Refer to the ./templates/metricsconfigmap.yaml for the structure of metrics configmap.

## Deployment update strategy
## Ref: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy
updateStrategy:
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 40%
  type: RollingUpdate

service:
  port: 443
  type: ClusterIP
  # Only used if service.type is NodePort
  nodePort:
  annotations: {}

topologySpreadConstraints: []

metricsService:
  create: true
  type: ClusterIP
  ## Kyverno's metrics server will be exposed at this port
  port: 8000
  ## The node port which will allow access to Kyverno's metrics at the host level. Only used if service.type is NodePort.
  nodePort:
  ## Provide any additional annotations which may be required. This can be used to
  ## set the LoadBalancer service type to internal only.
  ## ref: https://kubernetes.io/docs/concepts/services-networking/service/#internal-load-balancer
  ##
  annotations: {}

# Service Monitor to collect Prometheus Metrics
serviceMonitor:
  enabled: false
  # Additional labels
  additionalLabels:
    # key: value
  # Override namespace (default is the same as kyverno)
  namespace:

  # Interval to scrape metrics
  interval: 30s
  # Timeout if metrics can't be retrieved in given time interval
  scrapeTimeout: 25s
  # Is TLS required for endpoint
  secure: false
  # TLS Configuration for endpoint
  tlsConfig: {}

# Kyverno requires a certificate key pair and corresponding certificate authority
# to properly register its webhooks. This can be done in one of 3 ways:
# 1) Use kube-controller-manager to generate a CA-signed certificate (preferred)
# 2) Provide your own CA and cert.
#    In this case, you will need to create a certificate with a specific name and data structure.
#    As long as you follow the naming scheme, it will be automatically picked up.
#    kyverno-svc.(namespace).svc.kyverno-tls-ca (with data entry named rootCA.crt)
#    kyverno-svc.kyverno.svc.kyverno-tls-pair (with data entries named tls.key and tls.crt)
# 3) Let Helm generate a self signed cert, by setting createSelfSignedCert true
# If letting Kyverno create its own CA or providing your own, make sure createSelfSignedCert is false
createSelfSignedCert: false

# Whether to have Helm install the Kyverno CRDs
# If the CRDs are not installed by Helm, they must be added
# before policies can be created
installCRDs: true

# When true, use a NetworkPolicy to allow ingress to the webhook
# This is useful on clusters using Calico and/or native k8s network
# policies in a default-deny setup.
networkPolicy:
  enabled: false
  namespaceExpressions: []
  namespaceLabels: {}
  podExpressions: []
  podLabels: {}

Could you please verify whether this is a misconfiguration in our deployment or a bug?

Thank you!

Steps to reproduce

  1. Use the specified values.yaml
  2. helm install (see the sketch below)
  3. One of the 3 Kyverno pods logs a huge amount of "resource webhook configuration was deleted, recreating..." messages
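
A sketch of the install commands, assuming the official Kyverno chart repository and a release named kyverno (the image repositories are overridden in the values file above):

# Add the Kyverno chart repo and install in HA mode with the values file shown above
helm repo add kyverno https://kyverno.github.io/kyverno/
helm repo update
helm install kyverno kyverno/kyverno \
  --namespace caas-kyverno \
  --create-namespace \
  --values values.yaml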

Expected behavior

Fewer logs, or a parameter that lets me disable this kind of info-level log message.
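
As far as I can tell the chart has no dedicated switch for this message; the closest knob seems to be the klog verbosity flag passed via extraArgs. A sketch only, assuming the kyverno binary honors --v and that this message is gated behind a verbosity level (I have not verified at which level it is logged); verbosity-values.yaml is just a hypothetical overlay file name:

# Hypothetical: lower Kyverno's log verbosity via an extra values file layered on top of ours
cat > verbosity-values.yaml <<'EOF'
extraArgs:
  - "--v=1"
EOF
helm upgrade kyverno kyverno/kyverno \
  --namespace caas-kyverno \
  -f values.yaml \
  -f verbosity-values.yaml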

Screenshots

No response

Kyverno logs

No response

Slack discussion

No response

Troubleshooting

  • I have read and followed the documentation AND the troubleshooting guide.
  • I have searched other issues in this repository and mine is not recorded.

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 15 (5 by maintainers)

Most upvoted comments

@realshuting I just tested the fix in my local lab and it works! 👍 I no longer see this amount of webhook recreation logs. Thank you very much!

@bryanasdev000 kyverno-verify-mutating-webhook-cfg and friends are recreated at irregular intervals, but not every 30s.
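
For anyone wanting to reproduce that observation, a sketch: the object's creationTimestamp changes every time the webhook configuration is deleted and recreated, so re-running this shows when the last recreation happened.

# Print when kyverno-verify-mutating-webhook-cfg was (re)created
kubectl get mutatingwebhookconfiguration kyverno-verify-mutating-webhook-cfg \
  -o jsonpath='{.metadata.creationTimestamp}{"\n"}'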

No, we don't get this error message: Error from server (InternalError): Internal error occurred: failed calling webhook "validate.kyverno.svc-fail": Post "https://kyverno-svc.kyverno.svc:443/validate?timeout=10s": context deadline exceeded

@chipzoller Right, everything is working fine at the moment. It's just a huge amount of webhook-related logs generated every few minutes, and this doesn't look like normal behaviour to me.

We didn’t change much in the values file - just the number of replicas: replicaCount: 3

Our clusters are provisioned by VMware Tanzu Kubernetes Grid integrated (TKGi) in Version 1.12.

Do you also get errors like:

Error from server (InternalError): Internal error occurred: failed calling webhook "validate.kyverno.svc-fail": Post "https://kyverno-svc.kyverno.svc:443/validate?timeout=10s": context deadline exceeded

when messing with the cluster?

Also, do your kyverno-verify-mutating-webhook-cfg and friends keep recreating themselves every 30s?
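
In case it helps when checking for that timeout: a server-side dry-run goes through the admission webhooks without persisting anything, so it should surface the "context deadline exceeded" error if the Kyverno webhook is unreachable. A sketch only; webhook-probe is a throwaway name, and you may need to substitute a resource kind that your policies actually match so the request hits the webhook.

# Server-side dry-run: the request passes through Kyverno's validating webhook, nothing is created
kubectl run webhook-probe --image=busybox --dry-run=server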