helm: unable to retrieve the complete list of server APIs
Output of helm version:
version.BuildInfo{Version:"v3.0+unreleased", GitCommit:"180db556aaf45f34516f8ddb9ddac28d71736a3e", GitTreeState:"clean", GoVersion:"go1.13"}
Output of kubectl version:
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.3", GitCommit:"2d3c76f9091b6bec110a5e63777c332469e0cba2", GitTreeState:"clean", BuildDate:"2019-08-19T12:36:28Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.3+IKS", GitCommit:"66a72e7aa8fd2dbf64af493f50f943d7f7067916", GitTreeState:"clean", BuildDate:"2019-08-23T08:07:38Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
Cloud Provider/Platform (AKS, GKE, Minikube etc.): IBM Cloud
Helm chart deployment fails with:
➜ charts git:(h2update2) helm install vdc -f ~/etc/cloud-noes.yaml vdc <<<
coalesce.go:155: warning: skipped value for image: Not a table.
Error: could not get apiVersions from Kubernetes: unable to retrieve the complete list of server APIs: custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
(The first message is a warning from a Confluent chart; here I discuss the second issue.)
Looking at the error, I see a similar problem with:
➜ charts git:(h2update2) kubectl api-resources
NAME SHORTNAMES APIGROUP NAMESPACED KIND
bindings true Binding
componentstatuses cs false ComponentStatus
configmaps cm true ConfigMap
endpoints ep true Endpoints
events ev true Event
limitranges limits true LimitRange
namespaces ns false Namespace
nodes no false Node
persistentvolumeclaims pvc true PersistentVolumeClaim
persistentvolumes pv false PersistentVolume
pods po true Pod
podtemplates true PodTemplate
replicationcontrollers rc true ReplicationController
resourcequotas quota true ResourceQuota
secrets true Secret
serviceaccounts sa true ServiceAccount
services svc true Service
mutatingwebhookconfigurations admissionregistration.k8s.io false MutatingWebhookConfiguration
validatingwebhookconfigurations admissionregistration.k8s.io false ValidatingWebhookConfiguration
customresourcedefinitions crd,crds apiextensions.k8s.io false CustomResourceDefinition
apiservices apiregistration.k8s.io false APIService
controllerrevisions apps true ControllerRevision
daemonsets ds apps true DaemonSet
deployments deploy apps true Deployment
replicasets rs apps true ReplicaSet
statefulsets sts apps true StatefulSet
meshpolicies authentication.istio.io false MeshPolicy
policies authentication.istio.io true Policy
tokenreviews authentication.k8s.io false TokenReview
localsubjectaccessreviews authorization.k8s.io true LocalSubjectAccessReview
selfsubjectaccessreviews authorization.k8s.io false SelfSubjectAccessReview
selfsubjectrulesreviews authorization.k8s.io false SelfSubjectRulesReview
subjectaccessreviews authorization.k8s.io false SubjectAccessReview
horizontalpodautoscalers hpa autoscaling true HorizontalPodAutoscaler
metrics autoscaling.internal.knative.dev true Metric
podautoscalers kpa,pa autoscaling.internal.knative.dev true PodAutoscaler
cronjobs cj batch true CronJob
jobs batch true Job
images img caching.internal.knative.dev true Image
certificatesigningrequests csr certificates.k8s.io false CertificateSigningRequest
certificates cert,certs certmanager.k8s.io true Certificate
challenges certmanager.k8s.io true Challenge
clusterissuers certmanager.k8s.io false ClusterIssuer
issuers certmanager.k8s.io true Issuer
orders certmanager.k8s.io true Order
adapters config.istio.io true adapter
attributemanifests config.istio.io true attributemanifest
handlers config.istio.io true handler
httpapispecbindings config.istio.io true HTTPAPISpecBinding
httpapispecs config.istio.io true HTTPAPISpec
instances config.istio.io true instance
quotaspecbindings config.istio.io true QuotaSpecBinding
quotaspecs config.istio.io true QuotaSpec
rules config.istio.io true rule
templates config.istio.io true template
leases coordination.k8s.io true Lease
brokers eventing.knative.dev true Broker
channels chan eventing.knative.dev true Channel
clusterchannelprovisioners ccp eventing.knative.dev false ClusterChannelProvisioner
eventtypes eventing.knative.dev true EventType
subscriptions sub eventing.knative.dev true Subscription
triggers eventing.knative.dev true Trigger
events ev events.k8s.io true Event
daemonsets ds extensions true DaemonSet
deployments deploy extensions true Deployment
ingresses ing extensions true Ingress
networkpolicies netpol extensions true NetworkPolicy
podsecuritypolicies psp extensions false PodSecurityPolicy
replicasets rs extensions true ReplicaSet
channels ch messaging.knative.dev true Channel
choices messaging.knative.dev true Choice
inmemorychannels imc messaging.knative.dev true InMemoryChannel
sequences messaging.knative.dev true Sequence
nodes metrics.k8s.io false NodeMetrics
pods metrics.k8s.io true PodMetrics
certificates kcert networking.internal.knative.dev true Certificate
clusteringresses networking.internal.knative.dev false ClusterIngress
ingresses ing networking.internal.knative.dev true Ingress
serverlessservices sks networking.internal.knative.dev true ServerlessService
destinationrules dr networking.istio.io true DestinationRule
envoyfilters networking.istio.io true EnvoyFilter
gateways gw networking.istio.io true Gateway
serviceentries se networking.istio.io true ServiceEntry
sidecars networking.istio.io true Sidecar
virtualservices vs networking.istio.io true VirtualService
ingresses ing networking.k8s.io true Ingress
networkpolicies netpol networking.k8s.io true NetworkPolicy
poddisruptionbudgets pdb policy true PodDisruptionBudget
podsecuritypolicies psp policy false PodSecurityPolicy
clusterrolebindings rbac.authorization.k8s.io false ClusterRoleBinding
clusterroles rbac.authorization.k8s.io false ClusterRole
rolebindings rbac.authorization.k8s.io true RoleBinding
roles rbac.authorization.k8s.io true Role
authorizationpolicies rbac.istio.io true AuthorizationPolicy
clusterrbacconfigs rbac.istio.io false ClusterRbacConfig
rbacconfigs rbac.istio.io true RbacConfig
servicerolebindings rbac.istio.io true ServiceRoleBinding
serviceroles rbac.istio.io true ServiceRole
priorityclasses pc scheduling.k8s.io false PriorityClass
configurations config,cfg serving.knative.dev true Configuration
revisions rev serving.knative.dev true Revision
routes rt serving.knative.dev true Route
services kservice,ksvc serving.knative.dev true Service
apiserversources sources.eventing.knative.dev true ApiServerSource
awssqssources sources.eventing.knative.dev true AwsSqsSource
containersources sources.eventing.knative.dev true ContainerSource
cronjobsources sources.eventing.knative.dev true CronJobSource
githubsources sources.eventing.knative.dev true GitHubSource
kafkasources sources.eventing.knative.dev true KafkaSource
csidrivers storage.k8s.io false CSIDriver
csinodes storage.k8s.io false CSINode
storageclasses sc storage.k8s.io false StorageClass
volumeattachments storage.k8s.io false VolumeAttachment
clustertasks tekton.dev false ClusterTask
pipelineresources tekton.dev true PipelineResource
pipelineruns pr,prs tekton.dev true PipelineRun
pipelines tekton.dev true Pipeline
taskruns tr,trs tekton.dev true TaskRun
tasks tekton.dev true Task
error: unable to retrieve the complete list of server APIs: custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
➜ charts git:(h2update2)
Then, looking at action.go in the source, I can see that if this API call fails, we exit getCapabilities(). I understand why, but is this failure too 'hard'? In the case above the error came from a minor service.
This seems to have come up recently due to some changes in the Kubernetes metrics service. I will pursue that separately, but I was after thoughts on how Helm handles this situation. Also, a heads up that Helm 3 may be broken on IKS, though I'm not knowledgeable enough to dig much further.
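For anyone debugging the same symptom, a minimal diagnostic sketch, assuming kubectl access to the same cluster (APIService objects are named version.group, so the group in the error above corresponds to v1beta1.custom.metrics.k8s.io):

kubectl get apiservice                                   # look for entries whose AVAILABLE column is not True
kubectl get apiservice v1beta1.custom.metrics.k8s.io     # the aggregated API named in the Helm error above

If that APIService reports Available=False, discovery for its group fails, and Helm's getCapabilities() aborts the whole install.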
About this issue
- State: closed
- Created 5 years ago
- Reactions: 23
- Comments: 63 (19 by maintainers)
Commits related to this issue
- fix: stop discovery errors from halting chart rendering. This blocks a particular error (caused by upstream discovery client), printing a warning instead of failing. It's not a great solution, but is... — committed to technosophos/k8s-helm by technosophos 5 years ago
- fix: stop discovery errors from halting chart rendering. This blocks a particular error (caused by upstream discovery client), printing a warning instead of failing. It's not a great solution, but is... — committed to technosophos/k8s-helm by technosophos 5 years ago
- fix: stop discovery errors from halting chart rendering. This blocks a particular error (caused by upstream discovery client), printing a warning instead of failing. It's not a great solution, but is... — committed to technosophos/k8s-helm by technosophos 5 years ago
- fix: stop discovery errors from halting chart rendering. (#6908) This blocks a particular error (caused by upstream discovery client), printing a warning instead of failing. It's not a great solutio... — committed to helm/helm by technosophos 5 years ago
- Work around for an issue in Helm Links: - https://github.com/helm/helm/issues/6361 — committed to cloudfoundry-community/eirini-on-microk8s by giner 5 years ago
- fix: stop discovery errors from halting chart rendering. This blocks a particular error (caused by upstream discovery client), printing a warning instead of failing. It's not a great solution, but is... — committed to helm/helm by technosophos 5 years ago
- fix: stop discovery errors from halting chart rendering. (#6908) This blocks a particular error (caused by upstream discovery client), printing a warning instead of failing. It's not a great solutio... — committed to celtra/helm by technosophos 5 years ago
- Change the order of checking cert-manager is ready before moving on We currently install cert-manager, then just install openfaas chart then we check if cert-manager is ready. This leads to openfaas ... — committed to Waterdrips/ofc-bootstrap by Waterdrips 4 years ago
- fix: backport #6361 fix to Helm 2 — committed to Ewocker/helm by Ewocker 4 years ago
- fix: backport #6361 fix to Helm 2 — committed to proofpoint/helm by Ewocker 4 years ago
- workaround for broken api-resources call kubectl api-resources is easily broken, e.g. https://access.redhat.com/solutions/4379461 https://github.com/helm/helm/issues/6361#issuecomment-538220109 But... — committed to retzkek/tanka by retzkek 3 years ago
- Handle GroupDiscoveryFailedError in proxy.go (#5596) Similar to the fix in helm (https://github.com/helm/helm/issues/6361), this fix allows GroupDiscoveryFailedError to not error out the process of m... — committed to gabe-l-hart/operator-sdk by gabe-l-hart 2 years ago
- ansible: Handle GroupDiscoveryFailedError in proxy.go (#5596) Similar to the fix in helm (https://github.com/helm/helm/issues/6361), this fix allows GroupDiscoveryFailedError to not error out the pro... — committed to gabe-l-hart/operator-sdk by gabe-l-hart 2 years ago
For anyone who hits this, it’s caused by api-services that no longer have backends running…
In my case it was KEDA, but there are a number of different services that install aggregated API servers.
To fix it:
Look for the ones where AVAILABLE is False. If you don't need those APIs any more, delete them:
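A hedged sketch of those two steps (the APIService name below is a placeholder; delete only APIs you are sure you no longer need):

kubectl get apiservice                                        # entries with AVAILABLE=False have no working backend
kubectl delete apiservice <name-of-unavailable-apiservice>    # e.g. the stale KEDA or custom-metrics entry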
Then Helm should work properly. I think improving the Helm error message for this case may be worthwhile…
I have the same issue on AKS, though the error message is
My config:
helm version: alpine/helm:3.0.0-beta.2 (docker)
kubectl api-resources
Solution:
The steps I followed are:
1. kubectl get apiservices: if the metrics-server service is down with the error CrashLoopBackOff, follow step 2; otherwise just try to restart the metrics-server service using kubectl delete apiservice/"service_name". For me it was v1beta1.metrics.k8s.io.
2. kubectl get pods -n kube-system: I found that pods like metrics-server and kubernetes-dashboard were down because the main coreDNS pod was down. For me it was:
3. kubectl describe pod/"pod_name" to check the error in the coreDNS pod. If it is down because of "/etc/coredns/Corefile:10 - Error during parsing: Unknown directive proxy", then we need to use forward instead of proxy in the yaml file that holds the coreDNS config, because the CoreDNS 1.5.x version used by the image no longer supports the proxy keyword.
https://stackoverflow.com/questions/62442679/could-not-get-apiversions-from-kubernetes-unable-to-retrieve-the-complete-list
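A condensed, hedged sketch of those steps (pod names vary per cluster, and the ConfigMap name assumes the usual coredns default):

kubectl get apiservices                                  # is v1beta1.metrics.k8s.io shown as unavailable?
kubectl get pods -n kube-system                          # are the metrics-server and coredns pods healthy?
kubectl -n kube-system describe pod <coredns-pod-name>   # look for the "Unknown directive proxy" parse error
kubectl -n kube-system edit configmap coredns            # replace the proxy plugin with forward in the Corefile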
Here we have two different cases with the same behavior on the Helm side. Both the 2.15.1 and 3 beta versions are affected. As @technosophos mentioned, Helm uses the discovery API and fails if any API response fails: https://github.com/helm/helm/blob/f1dc84773f9a34fe59a504fdb32428ce1d56a2e8/pkg/action/action.go#L105-L118
admission.certmanager.k8s.io/v1beta1 is a good example, and in this case you can easily fix it with kubectl delete apiservice v1beta1.admission.certmanager.k8s.io, as @brendandburns described. Currently it's alive and running, but it was down accidentally during Helm's request.
I'm sure that Helm must be more robust against this type of issue.
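A quick, hedged way to tell whether such a failure is persistent or only transient, using the cert-manager webhook APIService from the example above:

kubectl get apiservice v1beta1.admission.certmanager.k8s.io
kubectl describe apiservice v1beta1.admission.certmanager.k8s.io   # the Available condition's message explains why it is failing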
I have just seen this issue myself. In my case it was cert-manager that triggered the problem. Still working on how to get it back to how it was.
Started hitting this recently in freshly created GKE clusters, using 2.15.1 (might have upgraded recently via Snap). Also reported as https://github.com/kubernetes/kubernetes/issues/72051#issuecomment-521157642. Seem to be able to work around by preceding every helm install command with:
We have a similar issue with 2.15.1 on Kubernetes 1.15.5, but NOT with Helm 2.14.3. The issue is floating: some charts install OK, but then they begin to fail. Our message is:
kubectl get apiservice lists metrics.k8s.io/v1beta1 as available. Maybe we have a transient issue with this service, but Helm 2.14.3 on a mostly identical cluster works reliably.
The fix was merged into the master branch, but it wasn't merged into 3.0.0. The patch will be in 3.1.
Would be great to see this incorporated into Helm 2 and the Terraform provider. I'm able to repro this error every time a cluster is created.
If anyone is available to test the Helm 2 fix, it’s here: #7196
Do we have a fix for 2.15/2.16?
If this does also affect 2.x then everyone using “cert-manager” (possibly only pre-configuration) is going to have a bad time.
Hi @technosophos. Maybe I'm missing something, but I don't see this PR https://github.com/helm/helm/pull/6908/files being ported to 2.16.3, although the issue happens with Helm 2 as well. Are you planning to port this workaround to Helm 2 too?
Sure.
Helm: 3.0.0 K8s: 1.14.8
helm delete prom -n monitoring ended with the following error.
After that, the Helm release disappeared from the list of Helm releases, and all objects related to that Prometheus operator became orphaned.
Ok, I see, it might be a version issue. Will upgrade Helm to the most recent version, 3.0.2, ASAP.
I've just faced this issue doing helm delete. It caused a very bad effect: the Helm release got removed, but all the K8s objects were kept running in the cluster, so we had to remove everything by hand. And as it was an operator, this required significant effort.
@technosophos what's the fix for this? Should we grep kubectl get apiservice and then block until all services are in a Ready state? Is there something else we could do instead? We're working on an OSS tool which installs a number of Helm charts to bootstrap a system, and this problem appears to be causing the whole process to fail intermittently.
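For what it's worth, a minimal sketch of that kind of pre-check, assuming kubectl is available and that the Available condition on APIService objects is an acceptable readiness signal:

# Block until no APIService reports False in its AVAILABLE column (crude, but matches the grep idea above).
until ! kubectl get apiservice | grep -q False; do sleep 5; done

# Alternatively, newer kubectl versions can wait on the condition directly:
kubectl wait --for=condition=Available apiservice --all --timeout=300s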
I'm guessing there was a typo in the instructions. It probably should be kubectl delete apiservice (missing i in service).
After resolving the issue with the metrics pod (I can't remember exactly how I solved it; I think it had to do with hostNetwork, or simply restarting the associated pod), Helm 3 functions as expected. So it might be a 'feature', since it forces you to keep the cluster in good health, but it requires someone to manually go into the cluster each time an API breaks (and thus might prevent using Helm 3 to deploy the very pods that would back those APIs).
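If it is the metrics API that is down, a hedged sketch of the "restart the associated pod" approach mentioned above (assuming metrics-server runs in kube-system with the usual k8s-app=metrics-server label):

kubectl -n kube-system get pods -l k8s-app=metrics-server
kubectl -n kube-system delete pod -l k8s-app=metrics-server   # the Deployment recreates it
kubectl get apiservice v1beta1.metrics.k8s.io                 # should report Available once the new pod is ready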
Yes, this is most definitely a version mismatch issue. This patch was made available in 3.0.2. In the future, please make sure to test with the latest patch release (or, better yet, on master). Thanks!
@jglick the implementation I did yesterday will very likely avoid the problem for you unless you are writing charts that directly reference the offending API group.
It's really, really annoying as someone starting out with Kubernetes. I'm hand-rolling a solution for certificates using ACME, since I can't guarantee that cert-manager won't still be broken even after configuring it.
The really annoying part is that I can't just use Helm to uninstall cert-manager and get back to where I was! Anything that allows a strongly recommended service to break it, and won't undo the change, is broken.