calico: CNI plugin: error getting ClusterInformation: connection is unauthorized: Unauthorized

K8S & Calico information

HostOS: RHEL 8.2
K8S: on-premise cluster; version v1.21.1; “IPVS” mode; IPv4/IPv6 dual stack; installed using kubespray
Calico: version v3.18.4; non-BGP mode; “IPv6” DNAT enabled
Our Docker image is built on top of RHEL ubi:8. We did not set up an external ETCD cluster.
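
A quick, hedged way to confirm these versions and modes on a live cluster (the label selector and ConfigMap name assume a standard kubespray/manifest install):

# Kubernetes server version
kubectl version --short

# kube-proxy mode (expect "ipvs")
kubectl -n kube-system get configmap kube-proxy -o yaml | grep mode

# Calico version, taken from the calico-node image tag
kubectl -n kube-system get pods -l k8s-app=calico-node \
  -o jsonpath='{.items[0].spec.containers[0].image}'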

“kubectl describe” output

[support@node-cont-1-qa conf]$ kubectl describe pod export-job-job-dp8hb
Name:           export-job-job-dp8hb
Namespace:      pio
Priority:       0
Node:           node-df1-1/10.0.156.180
Start Time:     Wed, 23 Feb 2022 05:57:18 -0800
Labels:         app.kubernetes.io/instance=export-job-job
                controller-uid=5d9f3e4b-e74c-4280-a3be-e31d37e92b84
                job-name=export-job-job
Annotations:    cni.projectcalico.org/podIP:
                cni.projectcalico.org/podIPs:
Status:         Pending
IP:
IPs:            <none>
Controlled By:  Job/export-job-job
Containers:
  export-job-job:
    Container ID:
    Image:         10.0.156.250:5000/img-admf:9.3.0.0B038
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      csh
    Args:
      -c
      source /TT9/configXcp.sh; lis_conf; python2 /etc/pio/APPL/XcdbBackup.py --exportdb --dir /var/tmp; sleep 300
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     500m
      memory:  512Mi
    Requests:
      cpu:        200m
      memory:     256Mi
    Environment:  <none>
    Mounts:
      /TT9/PIO/9.0.0/RUN/config/APPL/DBConMgr.cnfg from db-conf (rw,path="DBConMgr.cnfg")
      /TT9/PIO/9.0.0/RUN/config/feature_conf.json from feature-conf (rw,path="feature_conf.json")
      /TT9/PIO/9.0.0/RUN/license/license.json from license-conf (rw,path="license.json")
      /etc/pio/APPL/XcdbBackup.py from job-script (rw,path="XcdbBackup.py")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-jh7lg (ro)
      /var/tmp from external-pv (rw)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  job-script:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      export-job-script
    Optional:  false
  db-conf:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  db-secret
    Optional:    false
  feature-conf:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      feature
    Optional:  false
  license-conf:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      license
    Optional:  false
  external-pv:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  backup-pvc
    ReadOnly:   false
  kube-api-access-jh7lg:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                 From               Message
  ----     ------                  ----                ----               -------
  Normal   Scheduled               52m                 default-scheduler  Successfully assigned pio/export-job-job-dp8hb to node-df1-1
  Warning  FailedCreatePodSandBox  52m                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "e46d8d9df11ef97e7e1d8b38ced7efef32e1cb4bfb0aa85809cb3198464b6167" network for pod "export-job-job-dp8hb": networkPlugin cni failed to set up pod "export-job-job-dp8hb_pio" network: connection is unauthorized: Unauthorized, failed to clean up sandbox container "e46d8d9df11ef97e7e1d8b38ced7efef32e1cb4bfb0aa85809cb3198464b6167" network for pod "export-job-job-dp8hb": networkPlugin cni failed to teardown pod "export-job-job-dp8hb_pio" network: error getting ClusterInformation: connection is unauthorized: Unauthorized]
  Normal   SandboxChanged          50m (x10 over 52m)  kubelet            Pod sandbox changed, it will be killed and re-created.

Expected Behavior

The pod should start successfully.

Steps to Reproduce

Sorry, the issue has happened twice, on different K8S clusters in our lab, and I did not keep any logs… I would like to know how to reproduce it too.

My initial thoughts (maybe wrong)

Since “kubectl describe” shows “connection is unauthorized”, I searched the source code of K8S v1.21.1; the K8S code does NOT contain that string. I then searched Calico v3.22 (I am using v3.18.4, but there should not be a big difference) and found that “connection is unauthorized” exists in “libcalico-go/lib/errors/errors.go”. So it looks like the error is raised by Calico. I then used “error getting ClusterInformation” as a search keyword: it cannot be found in the K8S code, but it can be found in the Calico code. So I am confident the issue is 100% related to Calico.
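
For reference, those searches can be reproduced roughly like this (assuming the two source trees are checked out locally; paths are illustrative):

grep -rn "connection is unauthorized" kubernetes/       # no matches in K8S v1.21.1
grep -rn "connection is unauthorized" calico/libcalico-go/lib/errors/errors.go
grep -rn "error getting ClusterInformation" calico/     # matches in the CNI plugin code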

Because the “connection is unauthorized” error message is produced by “type ErrorConnectionUnauthorized struct”, and “ErrorConnectionUnauthorized” relates to communication with ETCD, it looks like the issue is a communication problem between Calico and ETCD.
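
Since we do not run an external etcd cluster (see above), Calico here uses the Kubernetes datastore, where ClusterInformation is stored as a CRD, and the CNI plugin reads it using credentials from a kubeconfig written onto each node. A hedged way to test whether those credentials still work (the kubeconfig path assumes a default install; replace <apiserver> with your API server address):

# Token the CNI plugin authenticates with
TOKEN=$(sudo awk '/token:/ {print $2}' /etc/cni/net.d/calico-kubeconfig | tr -d '"')

# Read the "default" ClusterInformation the same way the plugin does;
# an expired/revoked token reproduces "connection is unauthorized: Unauthorized"
curl -sk -H "Authorization: Bearer $TOKEN" \
  https://<apiserver>:6443/apis/crd.projectcalico.org/v1/clusterinformations/default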

By the way, /var/log/calico/cni/ does NOT contain anything related to “etcd” during pod start/destroy under normal operation.
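
For reference, a check along these lines is what the statement above is based on (the default Calico CNI log file is typically cni.log; the exact name may differ per install):

sudo ls -l /var/log/calico/cni/
sudo grep -iE 'etcd|unauthorized' /var/log/calico/cni/cni.log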

What I expect:

If possible, can you please tell me:
1. Which webpage describes the control/data flow between Calico and ETCD?
2. Which log files (and locations) does Calico as a whole use?
3. Did I miss any debug information?

Thanks

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 29 (6 by maintainers)

Most upvoted comments

I just had to kubectl delete pod calico-node-xxxx on the node where the issue was happening. A new Pod was created and the problem was solved.
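
Spelled out, that workaround looks roughly like this (the namespace may be kube-system or calico-system depending on how Calico was installed, and the pod name is an example):

# Find the calico-node pod scheduled on the affected node
kubectl -n kube-system get pods -o wide | grep calico-node
# Delete it; the DaemonSet recreates it and refreshes its credentials
kubectl -n kube-system delete pod calico-node-xxxx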

This just happened to me in an older v1.22.3 cluster, and I’ve noticed that the calico-node pods had an age of 365d. The problem self-resolved after I deleted all calico-node pods and they were recreated. Is there a certificate / token that has a TTL of 1 year and doesn’t get automatically renewed?
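
If you want to check that token theory, the CNI plugin authenticates with a token from a kubeconfig that calico-node writes onto each node; a hedged way to inspect its expiry, and to recreate all calico-node pods at once (the path and label assume a default install):

# Extract the CNI token and decode its JWT payload; look at the "exp" claim
TOKEN=$(sudo awk '/token:/ {print $2}' /etc/cni/net.d/calico-kubeconfig | tr -d '"')
python3 -c 'import base64, json, sys; p = sys.argv[1].split(".")[1]; p += "=" * (-len(p) % 4); print(json.dumps(json.loads(base64.urlsafe_b64decode(p)), indent=2))' "$TOKEN"

# "Deleted all calico-node pods", via the DaemonSet label
kubectl -n kube-system delete pods -l k8s-app=calico-node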

kubectl delete pod calico-node-xxxx -n kube-system. A new Pod was created and the problem was solved.

kubectl delete pods --all --all-namespaces fixed my issue

I had a similar issue today: all the pods on my cluster were stuck in Unknown or Terminating status, including calico-node-xxxx. I ran kubectl delete pod calico-node-xxxx, which fixed the calico-node pod, but the other pods were still not OK, so I ran kubectl delete pods --all --all-namespaces to delete ALL the pods, and a couple of minutes after the command everything was back up and running well!

clusterVersion: v1.23

Not sure how useful my comment would be, but I encountered this error when I accidentally rebooted one of the nodes in the cluster. The full error is as follows: error killing pod: failed to "KillPodSandbox" for "%some-guid%" with KillPodSandboxError: "rpc error: code = Unknown desc = networkPlugin cni failed to teardown pod \"%some-pod-id%\" network: error getting ClusterInformation: connection is unauthorized: Unauthorized"

The killing was triggered by a disk-pressure event on the node, the reasons for which I'm not entirely sure of. I had lowered the imageGC thresholds a bit before, but from my understanding they shouldn't trigger disk pressure. Maybe I'm wrong.

PS: I also recall a similar situation with an API that constantly got evicted every couple of days (disk pressure), and its evicted pods were never cleaned up. I didn't really look into why the pods remained, but maybe they were also supposed to be cleaned up and never were because of this error.

Ran into a similar issue and worked around it with NTP synchronization 😃
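
Token and certificate validation are time-sensitive, so a skewed node clock is a plausible cause of Unauthorized errors. On RHEL 8 hosts like the ones in this report, time sync is usually handled by chrony:

# Check whether the node clock is synchronized
timedatectl status
chronyc tracking
# Step the clock immediately if it has drifted (assumes chronyd is running)
sudo chronyc makestep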

Encountering the same issue; how can I solve it?

k get pod --all-namespaces
NAMESPACE          NAME                                       READY   STATUS             RESTARTS        AGE
calico-apiserver   calico-apiserver-645c75cf84-ffrk9          1/1     Running            0               8m27s
calico-apiserver   calico-apiserver-645c75cf84-qs4vq          1/1     Running            0               8m27s
calico-system      calico-kube-controllers-59b7bbd897-d59ff   1/1     Running            0               14m
calico-system      calico-node-ngsh8                          1/1     Running            0               21m
calico-system      calico-typha-54b78d9586-4xf2v              1/1     Running            0               21m
kube-system        coredns-6d4b75cb6d-cxgj8                   0/1     CrashLoopBackOff   8 (4m37s ago)   45m
kube-system        coredns-6d4b75cb6d-nnmtb                   0/1     CrashLoopBackOff   8 (4m45s ago)   45m
kube-system        etcd-k8s-master                            1/1     Running            1               45m
kube-system        kube-apiserver-k8s-master                  1/1     Running            1               45m
kube-system        kube-controller-manager-k8s-master         1/1     Running            1               45m
kube-system        kube-proxy-8k6fp                           1/1     Running            0               45m
kube-system        kube-scheduler-k8s-master                  1/1     Running            1               45m
tigera-operator    tigera-operator-5dc8b759d9-dsxcf           1/1     Running            0               21m

Examining coredns-6d4b75cb6d-cxgj8 gives this error:

  Warning  FailedCreatePodSandBox  19m (x17 over 23m)  kubelet            (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "a2dac06f53fa4d3b7b425592ce34cb21af5cd082edada3c1b77e56aefa2f7fa1": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
  Warning  BackOff                 7s (x93 over 17m)   kubelet            Back-off restarting failed container

Please help

I had a slightly different issue, but restarting the calico pod on the node with the failed pod, and then the failed pod itself, helped. The pod moved to another node after the restart. MicroK8s v1.26.0 revision 4390, Calico v3.23.5

Same issue in 1.22 with Calico

Events:
  Type     Reason                  Age    From               Message
  ----     ------                  ----   ----               -------
  Normal   Scheduled               3m47s  default-scheduler  Successfully assigned default/pod-with-cm to worker-node01
  Warning  FailedCreatePodSandBox  3m46s  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "3dcfdb21462e255a8f4059ca8540c8df05863bd6444cb22290133f894840845e" network for pod "pod-with-cm": networkPlugin cni failed to set up pod "pod-with-cm_default" network: error getting ClusterInformation: connection is unauthorized: Unauthorized, failed to clean up sandbox container "3dcfdb21462e255a8f4059ca8540c8df05863bd6444cb22290133f894840845e" network for pod "pod-with-cm": networkPlugin cni failed to teardown pod "pod-with-cm_default" network: error getting ClusterInformation: connection is unauthorized: Unauthorized]

Did you fix this?

hit the same issue, and lbogdan’s workaround fixed it for me.