calico: Couldn't get resource list for projectcalico.org/v3: the server is currently unable to handle the request

@caseydavenport As you requested, I'm raising this as a separate issue.

I installed the latest version of Calico using this helm chart. The kube-apiserver-kmaster1 logs show the following error: v3.projectcalico.org failed with: failing or missing response from https://**:443/apis/projectcalico.org/v3.

Also, after virtually any kubectl command, it prints errors about these CRDs:

E0406 15:33:42.335160   55793 memcache.go:106] couldn't get resource list for projectcalico.org/v3: the server is currently unable to handle the request
NAME       STATUS   ROLES           AGE   VERSION
kmaster1   Ready    control-plane   19d   v1.26.3
kworker1   Ready    <none>          18d   v1.26.3
kworker2   Ready    <none>          18d   v1.26.3

These CRDs are automatically installed by the helm chart mentioned above.

--> k api-resources | grep calico
E0406 15:34:26.465805   55853 memcache.go:255] couldn't get resource list for projectcalico.org/v3: the server is currently unable to handle the request
E0406 15:34:26.481896   55853 memcache.go:106] couldn't get resource list for projectcalico.org/v3: the server is currently unable to handle the request
bgpconfigurations                              crd.projectcalico.org/v1               false        BGPConfiguration
bgppeers                                       crd.projectcalico.org/v1               false        BGPPeer
blockaffinities                                crd.projectcalico.org/v1               false        BlockAffinity
caliconodestatuses                             crd.projectcalico.org/v1               false        CalicoNodeStatus
clusterinformations                            crd.projectcalico.org/v1               false        ClusterInformation
felixconfigurations                            crd.projectcalico.org/v1               false        FelixConfiguration
globalnetworkpolicies                          crd.projectcalico.org/v1               false        GlobalNetworkPolicy
globalnetworksets                              crd.projectcalico.org/v1               false        GlobalNetworkSet
hostendpoints                                  crd.projectcalico.org/v1               false        HostEndpoint
ipamblocks                                     crd.projectcalico.org/v1               false        IPAMBlock
ipamconfigs                                    crd.projectcalico.org/v1               false        IPAMConfig
ipamhandles                                    crd.projectcalico.org/v1               false        IPAMHandle
ippools                                        crd.projectcalico.org/v1               false        IPPool
ipreservations                                 crd.projectcalico.org/v1               false        IPReservation
kubecontrollersconfigurations                  crd.projectcalico.org/v1               false        KubeControllersConfiguration
networkpolicies                                crd.projectcalico.org/v1               true         NetworkPolicy
networksets                                    crd.projectcalico.org/v1               true         NetworkSet
error: unable to retrieve the complete list of server APIs: projectcalico.org/v3: the server is currently unable to handle the request

Do I understand correctly that these crd.projectcalico.org/v1 CRDs are still needed (so I should not delete them) and that I need to manually install the v3 CRDs? If so, where can I download the v3 CRDs? I can’t find them anywhere.
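For anyone trying to narrow this down: the v3 resources do not come from CRDs at all, so there are no v3 CRDs to install; the projectcalico.org/v3 group is served by the Calico API server via API aggregation. A diagnostic sketch (assuming kubectl access and the default namespaces):

```shell
# The crd.projectcalico.org/v1 resources are CRDs, while projectcalico.org/v3
# is served by the aggregated Calico API server through an APIService
# registration, not by CRDs.
kubectl get crds | grep projectcalico.org        # the v1 CRDs
kubectl get apiservice v3.projectcalico.org      # the v3 aggregated API
kubectl get pods -n calico-apiserver -o wide     # pods backing the APIService
```

If the APIService shows False (FailedDiscoveryCheck), the problem is reaching the calico-apiserver pods, not missing CRDs.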

I believe chet-tuttle-3 is facing similar issues.

About this issue

  • Original URL
  • State: open
  • Created a year ago
  • Reactions: 7
  • Comments: 48 (11 by maintainers)

Most upvoted comments

For some reason, the calico-apiserver pod is failing its liveness probes because the apiserver is not starting correctly (or something is not working at all), and because of that the apiservice is reported as FailedDiscoveryCheck. I tried playing around with the deployment and other things but wasn’t able to get anywhere. Is there any way to enable debug logs for the apiserver?

I also saw that the CSI node driver for Calico was failing for the following reason:

kubectl logs -f -n calico-system csi-node-driver-pzwsl -c csi-node-driver-registrar
/usr/local/bin/node-driver-registrar: error while loading shared libraries: libresolv.so.2: cannot open shared object file: No such file or directory

@headyj added this security group rule to the EKS cluster:

 node_security_group_additional_rules = {
    # calico-apiserver
    ingress_cluster_5443_webhook = {
      description                   = "Cluster API to node 5443/tcp webhook"
      protocol                      = "tcp"
      from_port                     = 5443
      to_port                       = 5443
      type                          = "ingress"
      source_cluster_security_group = true
    }
  }

The only apiservice that has a status of false is v3.projectcalico.org that has the following error message: failing or missing response from https://***:443/apis/projectcalico.org/v3: Get "https://***:443/apis/projectcalico.org/v3": context deadline exceeded

➜ ~ kubectl describe apiservice v3.projectcalico.org

Name:         v3.projectcalico.org
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata: 
  Creation Timestamp:  2023-04-06T12:54:20Z
  Managed Fields:
    API Version:  apiregistration.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:ownerReferences:
          .:
          k:{"uid":"***"}:
      f:spec:
        f:caBundle:
        f:group:
        f:groupPriorityMinimum:
        f:service:
          .:
          f:name:
          f:namespace:
          f:port:
        f:version:
        f:versionPriority:
    Manager:      operator
    Operation:    Update
    Time:         2023-04-06T12:54:20Z
    API Version:  apiregistration.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
          .:
          k:{"type":"Available"}:
            .:
            f:lastTransitionTime:
            f:message:
            f:reason:
            f:status:
            f:type:
    Manager:      kube-apiserver
    Operation:    Update
    Subresource:  status
    Time:         2023-04-18T12:35:03Z
  Owner References:
    API Version:           operator.tigera.io/v1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  APIServer
    Name:                  default
    UID:                   ***
  Resource Version:       ***
  UID:                     ***
Spec:
  Ca Bundle:              ***
  Group:                   projectcalico.org
  Group Priority Minimum:  1500
  Service:
    Name:            calico-api
    Namespace:       calico-apiserver
    Port:            443
  Version:           v3
  Version Priority:  200
Status:
  Conditions:
    Last Transition Time:  2023-04-06T12:54:20Z
    Message:               failing or missing response from https://10.107.208.239:443/apis/projectcalico.org/v3: Get "https://10.107.208.239:443/apis/projectcalico.org/v3": dial tcp 10.107.208.239:443: i/o timeout
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>

The logs of the calico-apiserver look like this: ➜ ~ kubectl logs --tail=-1 -n calico-apiserver -l k8s-app=calico-apiserver

Version:      v3.25.1
Build date:   2023-03-30T23:52:23+0000
Git tag ref:  v3.25.1
Git commit:   82dadbce1
I0413 15:15:19.483989       1 plugins.go:158] Loaded 2 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,MutatingAdmissionWebhook.
I0413 15:15:19.484036       1 plugins.go:161] Loaded 1 validating admission controller(s) successfully in the following order: ValidatingAdmissionWebhook.
I0413 15:15:19.604542       1 run_server.go:69] Running the API server
I0413 15:15:19.604578       1 run_server.go:58] Starting watch extension
W0413 15:15:19.606431       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0413 15:15:19.630055       1 secure_serving.go:210] Serving securely on [::]:5443
I0413 15:15:19.630147       1 dynamic_serving_content.go:132] "Starting controller" name="serving-cert::/calico-apiserver-certs/tls.crt::/calico-apiserver-certs/tls.key"
I0413 15:15:19.630257       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I0413 15:15:19.630679       1 run_server.go:80] apiserver is ready.
I0413 15:15:19.631104       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0413 15:15:19.631114       1 shared_informer.go:255] Waiting for caches to sync for RequestHeaderAuthRequestController
I0413 15:15:19.631204       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0413 15:15:19.631212       1 shared_informer.go:255] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0413 15:15:19.631282       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0413 15:15:19.631290       1 shared_informer.go:255] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0413 15:15:19.732007       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0413 15:15:19.732076       1 shared_informer.go:262] Caches are synced for RequestHeaderAuthRequestController
I0413 15:15:19.732510       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
Version:      v3.25.1
Build date:   2023-03-30T23:52:23+0000
Git tag ref:  v3.25.1
Git commit:   82dadbce1
I0413 15:15:45.802642       1 plugins.go:158] Loaded 2 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,MutatingAdmissionWebhook.
I0413 15:15:45.802806       1 plugins.go:161] Loaded 1 validating admission controller(s) successfully in the following order: ValidatingAdmissionWebhook.
I0413 15:15:45.871553       1 run_server.go:58] Starting watch extension
I0413 15:15:45.871726       1 run_server.go:69] Running the API server
W0413 15:15:45.872885       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0413 15:15:45.885723       1 secure_serving.go:210] Serving securely on [::]:5443
I0413 15:15:45.886356       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0413 15:15:45.886370       1 shared_informer.go:255] Waiting for caches to sync for RequestHeaderAuthRequestController
I0413 15:15:45.886523       1 run_server.go:80] apiserver is ready.
I0413 15:15:45.886549       1 dynamic_serving_content.go:132] "Starting controller" name="serving-cert::/calico-apiserver-certs/tls.crt::/calico-apiserver-certs/tls.key"
I0413 15:15:45.886667       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I0413 15:15:45.888123       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0413 15:15:45.888133       1 shared_informer.go:255] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0413 15:15:45.888363       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0413 15:15:45.888375       1 shared_informer.go:255] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0413 15:15:45.986627       1 shared_informer.go:262] Caches are synced for RequestHeaderAuthRequestController
I0413 15:15:45.988477       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0413 15:15:45.988829       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file

And the tigerastatus apiserver looks like this: ➜ ~ kubectl describe tigerastatus apiserver

Name:         apiserver
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  operator.tigera.io/v1
Kind:         TigeraStatus
Metadata:
  Creation Timestamp:  2023-03-24T16:01:19Z
  Generation:          1
  Managed Fields:
    API Version:  operator.tigera.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
    Manager:      operator
    Operation:    Update
    Time:         2023-03-24T16:01:19Z
    API Version:  operator.tigera.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:conditions:
    Manager:         operator
    Operation:       Update
    Subresource:     status
    Time:            2023-04-13T15:15:56Z
  Resource Version:  ***
  UID:               ***
Spec:
Status:
  Conditions:
    Last Transition Time:  2023-04-06T12:54:24Z
    Message:               All Objects Available
    Observed Generation:   1
    Reason:                AllObjectsAvailable
    Status:                False
    Type:                  Degraded
    Last Transition Time:  2023-04-13T15:15:56Z
    Message:               All objects available
    Observed Generation:   1
    Reason:                AllObjectsAvailable
    Status:                True
    Type:                  Available
    Last Transition Time:  2023-04-13T15:15:56Z
    Message:               All Objects Available
    Observed Generation:   1
    Reason:                AllObjectsAvailable
    Status:                False
    Type:                  Progressing
Events:                    <none>

@caseydavenport Can you maybe point me in the correct direction with this information?

If you uninstall the apiserver, you won’t be able to install network policies with the helm chart. Working around the problem doesn’t solve the issue.

@caseydavenport I followed this guide https://docs.aws.amazon.com/eks/latest/userguide/calico.html, which referenced this doc for the tigera-operator deployment: https://docs.tigera.io/calico/latest/getting-started/kubernetes/helm#install-calico. Basically I deployed the helm chart with kubernetesProvider set to EKS, and I also assume it auto-deploys the apiserver out of the box when you install the operator using the helm chart.

We are facing the same problem when installing the tigera-operator helm chart with the APIServer enabled on an EKS cluster.

I was also facing the same issue, and fixed it for the time being by running

kubectl delete apiserver default

Based on https://docs.tigera.io/calico/latest/operations/install-apiserver#uninstall-the-calico-api-server

Since we are using the default Calico helm-chart-based install, I think the apiserver was getting created but perhaps not configured properly. And since I doubt we need to update Calico settings via kubectl as part of our use case, I think it is best to delete it for now. I will also try to find a helm value in the tigera-operator to disable this from the start if possible.

PS: I am new to Calico, so please let me know if this is “unsafe” to remove, although the documentation above does not seem to suggest so.

EDIT: It is easy to disable the apiServer with the helm values:

apiServer:
  enabled: true # Change to false

Also, it seems it is not so important after all: https://docs.tigera.io/calico/latest/reference/architecture/overview#calico-api-server. The component architecture says it is only needed to manage Calico with kubectl, which I think logically means it is not used from “within” the cluster.
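If the goal is simply to keep the operator from deploying the API server at all, the same toggle can be applied at install/upgrade time. A sketch (the release name "calico" and the chart reference are assumptions; adjust to your setup):

```shell
# Disable the Calico API server through the tigera-operator chart values.
# "calico" and "projectcalico/tigera-operator" are placeholders for your
# actual release name and chart reference.
helm upgrade calico projectcalico/tigera-operator \
  --namespace tigera-operator \
  --set apiServer.enabled=false
```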

We are facing the same problem when installing the tigera-operator helm chart with the APIServer enabled on an EKS cluster.

@lucasscheepers I was running into the same issue and was able to get around it by following the Manifest Install directions here: https://docs.tigera.io/calico/latest/operations/install-apiserver

Specifically, the patch command fixed the issue:

kubectl patch apiservice v3.projectcalico.org -p \
  "{\"spec\": {\"caBundle\": \"$(kubectl get secret -n calico-apiserver calico-apiserver-certs -o go-template='{{ index .data "apiserver.crt" }}')\"}}"
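The quoting in that one-liner is easy to get wrong, so here is a local illustration of how the patch body is assembled (no cluster needed; "dummy-cert" stands in for the secret's certificate, which is already base64-encoded in .data and is substituted into caBundle verbatim):

```shell
# Build the same JSON patch body the one-liner produces, with a dummy value.
CERT_B64=$(printf 'dummy-cert' | base64)
PATCH="{\"spec\": {\"caBundle\": \"${CERT_B64}\"}}"
echo "$PATCH"   # {"spec": {"caBundle": "ZHVtbXktY2VydA=="}}
```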

@caseydavenport

Thank you for the update. That’s helpful. Since AWS is directly linking to Calico, you might want to update your documentation with a big red warning that says “Prior to installing Calico, make sure you have correctly configured your YAML”. However, my 2 cents is that the EKS configuration doesn’t actually seem to be required, since Calico worked the first time.

Unfortunately, I am hitting the issue again. I don’t know if something mysteriously got redeployed, or if it just suddenly started happening, or if I was too tired on a Friday to wait for the problem to start happening again. Regardless, here is what I see:

More Diagnostics

The problem E0627 11:56:49.015185 15798 memcache.go:255] couldn't get resource list for projectcalico.org/v3: the server is currently unable to handle the request mysteriously started happening again, and unfortunately I need to spend more time debugging this. This is critical for me, since I need to be able to install network policies using Infrastructure as Code, similar to this blog.

I came across this link, which provided some hints: https://github.com/helm/helm/issues/6361

This provided some interesting details:

$ kubectl get apiservice | grep calico
v1.crd.projectcalico.org               Local                         True                           3d23h
v3.projectcalico.org                   calico-apiserver/calico-api   False (FailedDiscoveryCheck)   3d22h

We can see that the API Service has failed discovery check. Digging in more:

$ kubectl get apiservice v3.projectcalico.org -o yaml
...
status:
  conditions:
  - lastTransitionTime: "2023-06-23T17:37:10Z"
    message: 'failing or missing response from https://10.2.192.137:5443/apis/projectcalico.org/v3:
      Get "https://10.2.192.137:5443/apis/projectcalico.org/v3": dial tcp 10.2.192.137:5443:
      i/o timeout'
    reason: FailedDiscoveryCheck
    status: "False"
    type: Available

Now that I know where the issue is occurring, I can begin to actually diagnose it. We should check what is going on at the pod end that should be serving the requests:

$ kubectl get pods -n calico-apiserver -o wide
NAME                                READY   STATUS    RESTARTS   AGE     IP             NODE                           NOMINATED NODE   READINESS GATES
calico-apiserver-86cbf6c7fc-5cj2x   1/1     Running   0          9m24s   10.2.192.250   ip-10-2-192-242.ec2.internal   <none>           <none>
calico-apiserver-86cbf6c7fc-zvl85   1/1     Running   0          9m25s   10.2.192.137   ip-10-2-192-156.ec2.internal   <none>           <none>

I ran an Ubuntu bastion pod, and from it I curled the API server:

$ curl -k https://10.2.192.137:5443/apis/projectcalico.org/v3
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/apis/projectcalico.org/v3\"",
  "reason": "Forbidden",
  "details": {},
  "code": 403

This is a different error message from the timeout. It indicates that the server is online and responding, but that we are unauthenticated. As a result, this looks like a control plane issue, and as @caseydavenport mentioned before, I could try with hostNetwork: true

I attempted to do this with kubectl edit, but the deployment will not update the pods, and I cannot edit the pods directly either.

$ cat << EOF > /tmp/patch
spec:
  template:
    spec:
      hostNetwork: true
EOF

$ kubectl patch deployment calico-apiserver -n calico-apiserver --patch-file /tmp/patch 
deployment.apps/calico-apiserver patched

$ kubectl get deployment -n calico-apiserver -o yaml | grep host
                topologyKey: kubernetes.io/hostname

I then considered that perhaps the Tigera Operator was controlling this value. I investigated whether it was possible to modify it from the helm chart, but that does not seem to be possible, since it is not mentioned in the documentation.

We are planning to use the VPC CNI plugin soon, but it isn’t installed yet. Setting hostNetwork: true therefore does seem related to the problem, as indicated here. I am not sure how it might be possible to set this, although it is suggested here that it is possible.

At this point I’m a little lost as to how this can be fixed. I am still digging, though, so I may post another update. I am posting this much detail in the hope that it is helpful to someone else who stumbles upon this.

EDIT: I’m pretty sure the Operator controls hostNetwork, and it is impossible to configure this. This code suggests that hostNetwork is only set to true if you are running EKS with the Calico CNI, and this code suggests that hostNetwork is false by default and not configurable.
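One way to confirm what the operator actually rendered, rather than patching and watching the change be reverted (a quick check sketch, assuming the default deployment name and namespace):

```shell
# Prints "true" if the operator set hostNetwork on the apiserver pod spec;
# prints nothing if the field is unset (it is omitted when false).
kubectl get deployment calico-apiserver -n calico-apiserver \
  -o jsonpath='{.spec.template.spec.hostNetwork}'
```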

Everyone else (if you’re still able to reproduce this issue), could you post kubectl logs for the Calico apiserver pod(s)?

$ kubectl logs calico-apiserver-7fb88d684f-fh5x7 -n calico-apiserver
E0517 13:12:47.384967   36471 memcache.go:287] couldn't get resource list for projectcalico.org/v3: the server is currently 

For me this was some sort of firewall issue. I configured the firewall to be more permissive, and I no longer see this problem.