calico: calico-kube-controllers pod stuck in not Ready for 13 min

In a Kubernetes cluster created with Kops, replacing the master node(s) puts the calico-kube-controllers pod in a not Ready state. It recovers on its own after about 13 min, which is quite slow. Deleting the pod creates a new one that becomes Ready instantly.

Expected Behavior

calico-kube-controllers should recover much faster than 13 min.

Current Behavior

calico-kube-controllers takes about 13 min to recover.

Possible Solution

The simplest generic fix would be to add a liveness probe that automatically restarts the pod.
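As a rough illustration of that idea, the sketch below patches a liveness probe onto the Deployment. It rests on several assumptions that are not confirmed in this issue: that the Deployment is named calico-kube-controllers in kube-system with a single container, that `/usr/bin/check-status -r` (the command commonly used for the readiness probe) is acceptable as a liveness check, and that the timing values are reasonable. The thresholds are illustrative, not recommendations.

```shell
#!/usr/bin/env bash
# Sketch only: JSON patch that adds a liveness probe to the first container.
# The probe command and all timing values below are assumptions.
LIVENESS_PATCH='[
  {
    "op": "add",
    "path": "/spec/template/spec/containers/0/livenessProbe",
    "value": {
      "exec": { "command": ["/usr/bin/check-status", "-r"] },
      "initialDelaySeconds": 10,
      "periodSeconds": 10,
      "timeoutSeconds": 10,
      "failureThreshold": 6
    }
  }
]'

# Applies the patch; call this explicitly once the values have been reviewed.
# Deployment name and namespace are assumed to match the standard manifest.
apply_liveness_patch() {
  kubectl patch deployment calico-kube-controllers \
    --namespace kube-system \
    --type json \
    --patch "$LIVENESS_PATCH"
}
```

With these values the kubelet would restart the container after about a minute of consecutive failed checks. A manual patch like this may be reverted if Kops reapplies its managed Calico manifest, so a lasting fix would belong in the manifest itself.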

Steps to Reproduce (for bugs)

Create a simple Kubernetes cluster using Kops v1.17.1 with --networking=calico, following the steps at https://kops.sigs.k8s.io/getting_started/aws/.
Build the cluster:

kops update cluster --yes

Validate the cluster:

kops validate cluster --wait 15m

Replace the master node:

kops rolling-update cluster --yes --cloudonly --instance-group master-a --force

Wait for a new master to be created and check the status of the calico-kube-controllers pod.

kubectl logs -f -n kube-system calico-kube-controllers-76bd59c54c-57j6r
2020-07-06 04:38:41.397 [INFO][1] main.go 88: Loaded configuration from environment config=&config.Config{LogLevel:"info", ReconcilerPeriod:"5m", CompactionPeriod:"10m", EnabledControllers:"node", WorkloadEndpointWorkers:1, ProfileWorkers:1, PolicyWorkers:1, NodeWorkers:1, Kubeconfig:"", HealthEnabled:true, SyncNodeLabels:true, DatastoreType:"kubernetes"}
W0706 04:38:41.398065       1 client_config.go:541] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2020-07-06 04:38:41.398 [INFO][1] main.go 109: Ensuring Calico datastore is initialized
2020-07-06 04:38:41.409 [INFO][1] watchersyncer.go 89: Start called
2020-07-06 04:38:41.409 [INFO][1] main.go 183: Starting status report routine
2020-07-06 04:38:41.409 [INFO][1] main.go 368: Starting controller ControllerType="Node"
2020-07-06 04:38:41.409 [INFO][1] node_controller.go 130: Starting Node controller
2020-07-06 04:38:41.409 [INFO][1] watchersyncer.go 127: Sending status update Status=wait-for-ready
2020-07-06 04:38:41.409 [INFO][1] node_syncer.go 39: Node controller syncer status updated: wait-for-ready
2020-07-06 04:38:41.409 [INFO][1] watchersyncer.go 147: Starting main event processing loop
2020-07-06 04:38:41.416 [INFO][1] watchercache.go 291: Sending synced update ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2020-07-06 04:38:41.416 [INFO][1] watchersyncer.go 127: Sending status update Status=resync
2020-07-06 04:38:41.417 [INFO][1] node_syncer.go 39: Node controller syncer status updated: resync
2020-07-06 04:38:41.417 [INFO][1] watchersyncer.go 209: Received InSync event from one of the watcher caches
2020-07-06 04:38:41.417 [INFO][1] watchersyncer.go 221: All watchers have sync'd data - sending data and final sync
2020-07-06 04:38:41.417 [INFO][1] watchersyncer.go 127: Sending status update Status=in-sync
2020-07-06 04:38:41.417 [INFO][1] node_syncer.go 39: Node controller syncer status updated: in-sync
2020-07-06 04:38:41.509 [INFO][1] node_controller.go 143: Node controller is now running
2020-07-06 04:38:41.509 [INFO][1] ipam.go 45: Synchronizing IPAM data
2020-07-06 04:38:41.541 [INFO][1] ipam.go 190: Node and IPAM data is in sync
2020-07-06 04:39:31.537 [INFO][1] ipam.go 45: Synchronizing IPAM data
2020-07-06 04:39:31.580 [INFO][1] ipam.go 281: Calico Node referenced in IPAM data does not exist error=resource does not exist: Node(ip-10-4-56-126.eu-west-1.compute.internal) with error: nodes "ip-10-4-56-126.eu-west-1.compute.internal" not found
2020-07-06 04:39:31.581 [INFO][1] ipam.go 137: Checking node calicoNode="ip-10-4-56-126.eu-west-1.compute.internal" k8sNode=""
2020-07-06 04:39:31.586 [INFO][1] ipam.go 177: Cleaning up IPAM resources for deleted node calicoNode="ip-10-4-56-126.eu-west-1.compute.internal" k8sNode=""
2020-07-06 04:39:31.586 [INFO][1] ipam.go 1166: Releasing all IPs with handle 'k8s-pod-network.cbee49f2bf9d3ae7c4633561ccab65a8a2390d1e47b39b8b1dc572e47e6261ea'
2020-07-06 04:39:31.603 [INFO][1] ipam.go 1480: Node doesn't exist, no need to release affinity cidr=100.109.77.128/26 host="ip-10-4-56-126.eu-west-1.compute.internal"
2020-07-06 04:39:31.603 [INFO][1] ipam.go 1166: Releasing all IPs with handle 'k8s-pod-network.29600955ba32b51269bf6e9db403a3abb4d854e0f57e8958e038a82f7021a596'
2020-07-06 04:39:31.618 [INFO][1] ipam.go 1480: Node doesn't exist, no need to release affinity cidr=100.109.77.128/26 host="ip-10-4-56-126.eu-west-1.compute.internal"
2020-07-06 04:39:31.618 [INFO][1] ipam.go 1166: Releasing all IPs with handle 'k8s-pod-network.afe9203ddad2fecf4a883fb3a778f8ebd2a9174ba6c0bc291914435ec6c0054d'
2020-07-06 04:39:31.634 [INFO][1] ipam.go 1480: Node doesn't exist, no need to release affinity cidr=100.109.77.128/26 host="ip-10-4-56-126.eu-west-1.compute.internal"
2020-07-06 04:39:31.634 [INFO][1] ipam.go 1166: Releasing all IPs with handle 'ipip-tunnel-addr-ip-10-4-56-126.eu-west-1.compute.internal'
2020-07-06 04:39:31.649 [INFO][1] ipam.go 1480: Node doesn't exist, no need to release affinity cidr=100.109.77.64/26 host="ip-10-4-56-126.eu-west-1.compute.internal"
2020-07-06 04:39:31.649 [INFO][1] ipam.go 1166: Releasing all IPs with handle 'k8s-pod-network.6330af27335563bada4a42b904340abde31da0fdb8b619339e295e3102ef1ddc'
2020-07-06 04:39:31.665 [INFO][1] ipam.go 1480: Node doesn't exist, no need to release affinity cidr=100.109.77.64/26 host="ip-10-4-56-126.eu-west-1.compute.internal"
2020-07-06 04:39:31.665 [INFO][1] ipam.go 1166: Releasing all IPs with handle 'k8s-pod-network.b7d1145f4e32b7fac254f5a38b332ff1304addb7928240109f9473d4bac7e9e1'
2020-07-06 04:39:31.680 [INFO][1] ipam.go 1480: Node doesn't exist, no need to release affinity cidr=100.109.77.64/26 host="ip-10-4-56-126.eu-west-1.compute.internal"
2020-07-06 04:39:31.736 [INFO][1] ipam.go 190: Node and IPAM data is in sync
2020-07-06 09:28:18.489 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:28:18.489 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:28:50.489 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:29:10.489 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:29:10.489 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:29:42.490 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:30:02.490 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:30:02.490 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:30:34.490 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:30:54.491 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:30:54.491 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:31:26.491 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:31:46.491 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:31:46.492 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:32:18.492 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:32:38.492 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:32:38.492 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:33:10.493 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:33:30.493 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:33:30.493 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:34:02.493 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:34:22.494 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:34:22.494 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:34:54.494 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:35:14.494 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:35:14.494 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:35:46.495 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:36:06.495 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:36:06.495 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:36:38.495 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:36:58.496 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:36:58.496 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:37:30.496 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:37:50.497 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:37:50.497 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:38:22.497 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:38:42.497 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:38:42.497 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:39:14.497 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:39:34.498 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:39:34.498 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:40:06.498 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:40:26.499 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:40:26.499 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:40:58.499 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:41:18.499 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:41:18.499 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:41:50.500 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:42:10.500 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:42:10.500 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:42:42.501 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:43:02.501 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:43:02.501 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:43:34.501 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:43:54.502 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:43:54.502 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:44:26.502 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:44:46.502 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:44:46.502 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-07-06 09:45:18.503 [ERROR][1] main.go 234: Failed to reach apiserver error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
W0706 09:45:31.822278       1 reflector.go:299] pkg/mod/k8s.io/client-go@v0.0.0-20191114101535-6c5935290e33/tools/cache/reflector.go:96: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: read tcp 100.106.28.199:53668->100.64.0.1:443: read: no route to host") has prevented the request from succeeding
2020-07-06 09:45:31.822 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: read tcp 100.106.28.199:53668->100.64.0.1:443: read: no route to host
2020-07-06 09:45:31.822 [ERROR][1] main.go 203: Failed to verify datastore error=Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: read tcp 100.106.28.199:53668->100.64.0.1:443: read: no route to host
2020-07-06 09:45:32.826 [INFO][1] ipam.go 45: Synchronizing IPAM data
2020-07-06 09:45:32.844 [INFO][1] ipam.go 281: Calico Node referenced in IPAM data does not exist error=resource does not exist: Node(ip-10-4-54-58.eu-west-1.compute.internal) with error: nodes "ip-10-4-54-58.eu-west-1.compute.internal" not found
2020-07-06 09:45:32.844 [INFO][1] ipam.go 137: Checking node calicoNode="ip-10-4-54-58.eu-west-1.compute.internal" k8sNode=""
2020-07-06 09:45:32.849 [INFO][1] ipam.go 177: Cleaning up IPAM resources for deleted node calicoNode="ip-10-4-54-58.eu-west-1.compute.internal" k8sNode=""
2020-07-06 09:45:32.849 [INFO][1] ipam.go 1166: Releasing all IPs with handle 'ipip-tunnel-addr-ip-10-4-54-58.eu-west-1.compute.internal'
2020-07-06 09:45:32.864 [INFO][1] ipam.go 1480: Node doesn't exist, no need to release affinity cidr=100.106.118.0/26 host="ip-10-4-54-58.eu-west-1.compute.internal"
2020-07-06 09:45:32.864 [INFO][1] ipam.go 1166: Releasing all IPs with handle 'k8s-pod-network.bf8da93299c06bedff227fc91f4ef3e6193776c90f49fdd67ff23c0cbc8b582b'
2020-07-06 09:45:32.879 [INFO][1] ipam.go 1480: Node doesn't exist, no need to release affinity cidr=100.106.118.0/26 host="ip-10-4-54-58.eu-west-1.compute.internal"
2020-07-06 09:45:32.879 [INFO][1] ipam.go 1166: Releasing all IPs with handle 'k8s-pod-network.c6e7449430362a2b07b5a86fcb65302610b647e5f74e8770d50108d60bc2aa33'
2020-07-06 09:45:32.897 [INFO][1] ipam.go 1480: Node doesn't exist, no need to release affinity cidr=100.106.118.0/26 host="ip-10-4-54-58.eu-west-1.compute.internal"
2020-07-06 09:45:32.897 [INFO][1] ipam.go 1166: Releasing all IPs with handle 'k8s-pod-network.dc2c3a527c56bdd6cdd40436d679faa72ed066c3ad2e2f0c1dc4fa712b88d4c9'
2020-07-06 09:45:32.912 [INFO][1] ipam.go 1480: Node doesn't exist, no need to release affinity cidr=100.106.118.0/26 host="ip-10-4-54-58.eu-west-1.compute.internal"
2020-07-06 09:45:32.944 [INFO][1] ipam.go 190: Node and IPAM data is in sync
^C

kubectl describe pod calico-kube-controllers-76bd59c54c-57j6r -n kube-system | grep Events: -A 10
Events:
  Type     Reason     Age                 From                                               Message
  ----     ------     ----                ----                                               -------
  Warning  Unhealthy  37m                 kubelet, ip-10-4-57-46.eu-west-1.compute.internal  Readiness probe failed: Error reaching apiserver: taking a long time to check apiserver; Error verifying datastore: Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
  Warning  Unhealthy  30m (x24 over 41m)  kubelet, ip-10-4-57-46.eu-west-1.compute.internal  Readiness probe failed: Error verifying datastore: Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded; Error reaching apiserver: Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded with http status code: 0
  Warning  Unhealthy  25m (x48 over 41m)  kubelet, ip-10-4-57-46.eu-west-1.compute.internal  Readiness probe failed: Error verifying datastore: Get https://100.64.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded; Error reaching apiserver: taking a long time to check apiserver

Context

Kops validates the cluster based on the status of the kube-system pods. This issue prevents the cluster from being upgraded without manual intervention, and even when the pod recovers on its own it slows the upgrade down.
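The manual intervention mentioned here amounts to deleting the stuck pod. A minimal sketch of automating that workaround follows, assuming the pods carry the label k8s-app=calico-kube-controllers (the label used in the standard Calico manifests, not confirmed in this issue). The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of the workaround: delete calico-kube-controllers pods that are not Ready.
NAMESPACE="kube-system"
SELECTOR="k8s-app=calico-kube-controllers"   # assumed label

# Reads "<pod-name> <ready-status>" lines on stdin and prints the names of
# pods whose Ready condition is anything other than "True".
not_ready_pods() {
  awk 'NF && $2 != "True" { print $1 }'
}

# Lists matching pods with their Ready condition, then deletes the ones that
# are not Ready so the Deployment can create replacements.
restart_not_ready() {
  kubectl get pods -n "$NAMESPACE" -l "$SELECTOR" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}' \
    | not_ready_pods \
    | while read -r pod; do
        kubectl delete pod -n "$NAMESPACE" "$pod"
      done
}
```

Calling restart_not_ready after the new master is up, and before running kops validate cluster, would approximate the manual step described above. It does not address why the pod is slow to recover.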

Your Environment

Calico version: 3.13.4
Orchestrator version (e.g. kubernetes, mesos, rkt): Kubernetes 1.17.8
Operating System and version: Ubuntu 20.04
Link to your project (optional): https://github.com/kubernetes/kops

About this issue

State: closed
Created 4 years ago
Reactions: 5
Comments: 22 (15 by maintainers)

Most upvoted comments

This is a cluster with a single master. The same happens in a cluster with 3 masters. In that case, all 3 would be in the list.

hakman on Jul 6, 2020

This link does not work: https://github.com/projectcalico/libcalico-go/issues/1267

@hakman, how should we go about fixing this? It happens to us whenever our master gets replaced.

alok87 on Jan 10, 2022

CC: @lwr20

hakman on Jul 6, 2020