karpenter-provider-aws: Webhook Certificate Error when upgrading to v0.19.0

Version

Karpenter Version: v0.19.0

Kubernetes Version: v1.23.0

Expected Behavior

I expected a clean upgrade but got the errors below. Karpenter still seems to be working, but this error troubles me. Can you please describe how this error can affect me and how to fix it?

Actual Behavior

I am getting the errors below after upgrading from v0.18.1 to v0.19.0 via Helm on AWS EKS.

Steps to Reproduce the Problem

Upgrade from v0.18.1 to v0.19.0.

Resource Specs and Logs

2022-11-17T13:09:22.957Z	ERROR	controller	Reconciler error	{"commit": "ea5dc14", "controller": "counter", "controllerGroup": "karpenter.sh", "controllerKind": "Provisioner", "Provisioner": {"name":"gp-karpenter"}, "namespace": "", "name": "gp-karpenter", "reconcileID": "61c4a81e-7129-4582-bb04-3ea764164616", "error": "Internal error occurred: failed calling webhook \"defaulting.webhook.provisioners.karpenter.sh\": failed to call webhook: Post \"https://karpenter.karpenter.svc:443/default-resource?timeout=10s\": x509: certificate signed by unknown authority (possibly because of \"x509: ECDSA verification failure\" while trying to verify candidate authority certificate \"karpenter.karpenter.svc\")"}

2022/11/17 13:09:22 http: TLS handshake error from 10.102.60.28:44900: remote error: tls: bad certificate

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave “+1” or “me too” comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 9
  • Comments: 15 (6 by maintainers)

Most upvoted comments

I was migrating from 0.18.0 to 0.19.2 (using ArgoCD) and got this issue, but was able to solve it by deleting these 2 webhooks:

  • kubectl delete mutatingwebhookconfigurations.admissionregistration.k8s.io defaulting.webhook.provisioners.karpenter.sh
  • kubectl delete validatingwebhookconfigurations.admissionregistration.k8s.io validation.webhook.provisioners.karpenter.sh

Now it’s all good. Thanks for the pointers 👍

We’ve seen issues with ArgoCD where it fails to uninstall validation webhooks from older versions. In this upgrade, we made changes to the validating webhooks. Helm handles this upgrade successfully, but for some reason it looks like ArgoCD has leaked validation.webhook.provisioners.karpenter.sh.

Mine are:

k get validatingwebhookconfigurations.admissionregistration.k8s.io
NAME                                     WEBHOOKS   AGE
validation.webhook.config.karpenter.sh   1          7s
validation.webhook.karpenter.k8s.aws     1          7s
validation.webhook.karpenter.sh          1          7s

This leaked webhook will block provisioning updates. Try deleting it.
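
To spot a leftover like this before deleting anything, one option is to compare the webhook configurations present in the cluster against the names you expect for your version. The following is only a rough sketch: the expected list is copied from the v0.19.x listing above and covers validating webhooks only, and `find_leaked` is a hypothetical helper name, not part of Karpenter or kubectl. Review the output by hand before running any `kubectl delete`.

```shell
#!/bin/sh
# Sketch: flag Karpenter webhook configurations that are not in an expected list.
# Assumption: these names come from the v0.19.x listing above; adjust for your version.
EXPECTED_VALIDATING='validation.webhook.config.karpenter.sh
validation.webhook.karpenter.k8s.aws
validation.webhook.karpenter.sh'

# find_leaked (hypothetical helper): reads resource names on stdin, keeps the
# Karpenter-related ones, and prints those that are missing from the
# newline-separated expected list passed as $1.
find_leaked() {
  grep 'karpenter' | grep -vxF -e "$1"
}

# Usage against a live cluster (left commented out so the sketch has no side effects).
# `-o name` prints "<resource type>/<name>", so sed strips everything up to the slash.
#   kubectl get validatingwebhookconfigurations -o name \
#     | sed 's|.*/||' \
#     | find_leaked "$EXPECTED_VALIDATING"
```

Any name it prints is a candidate for the `kubectl delete validatingwebhookconfigurations.admissionregistration.k8s.io <name>` command shown earlier. The same approach should work for mutating webhook configurations, given an expected list for those.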