rancher: Clusters imported stuck at "Waiting for API server", with k8s-mode=external or embedded
What kind of request is this (question/bug/enhancement/feature request): Bug
Summary: Clusters imported via Rancher server are forever stuck in the “Waiting for API server” state, when Rancher Server is run in either “external” mode (pointing to a separate k8s cluster) or “embedded” mode.
This regressed between Docker tags 2.0.16 and 2.1.0: the functionality works with the former, fails with the latter, and has remained broken in newer versions.
Steps to reproduce (least amount of steps as possible):
- Easiest way to reproduce
- Deploy Rancher from rancher-embedded-mode.yaml
- Front end it with an ingress, certs, DNS, and log in.
- Import a cluster, then run the generated YAML on the target cluster.
- You’ll see the cluster go from Pending -> Waiting, and logs in both the cattle-agent (imported cluster) side and Rancher side indicating the Websocket connection has been established.
- Nothing else happens after this point, and no more log messages. The cluster is stuck in “Waiting for API server”.
- Reproducer in external mode - just to show that the problem is not limited to embedded mode:
- Drop the base64-encoded KUBECONFIG for some external cluster in k8s-secret.yaml.
- Deploy Rancher from rancher-external-mode.yaml. Rancher will be persisting to the above external cluster.
- Follow same steps as above (the problem is the same).
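The base64 step in the external-mode reproducer can be sketched as below. This is only an illustration: `encode_kubeconfig` is a hypothetical helper name, the kubeconfig path in the comment is an assumption, and the key under which k8s-secret.yaml expects the value is not shown in this issue.

```shell
#!/bin/sh
# Print a file as single-line base64, suitable for pasting into a Secret's data field.
# encode_kubeconfig is a hypothetical helper, not part of Rancher or kubectl.
encode_kubeconfig() {
    # Some base64 implementations wrap long output; strip newlines so the
    # result is a single line on every platform.
    base64 < "$1" | tr -d '\n'
}

# Example (the path is an assumption):
#   encode_kubeconfig "$HOME/.kube/config"
```

The output is what would be pasted into k8s-secret.yaml before deploying rancher-external-mode.yaml.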
Result: The imported cluster is stuck in Waiting for API server.
Other details that may be helpful:
- To make the problem go away:
- Rerun either of the above examples using the v2.0.16 image instead of v2.2.7.
- Rerun either of the above examples using k8s-mode=auto instead of embedded or external.
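One way to try the k8s-mode=auto workaround without editing the manifest by hand is sketched below. It assumes the manifest passes the mode as a literal `k8s-mode=embedded` or `k8s-mode=external` string, which is not confirmed by the files referenced in this issue; `set_k8s_mode` is a hypothetical helper name.

```shell
#!/bin/sh
# Print a manifest to stdout with every k8s-mode=<value> occurrence rewritten
# to the requested mode. The input file is left unchanged.
# set_k8s_mode is a hypothetical helper; it assumes the mode appears as a
# literal "k8s-mode=embedded|external|auto" string in the manifest.
set_k8s_mode() {
    file=$1
    mode=$2
    sed -E "s/(k8s-mode=)(embedded|external|auto)/\\1${mode}/g" "$file"
}

# Example:
#   set_k8s_mode rancher-embedded-mode.yaml auto | kubectl apply -f -
```

Piping into `kubectl apply -f -` leaves the original file untouched, so it is easy to switch back when comparing behavior between modes.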
Environment information
- Rancher version (rancher/rancher / rancher/server image tag or shown bottom left in the UI): rancher/rancher v2.1.0 and above.
- Installation option (single install/HA): single install
gzrancher/rancher#11362
About this issue
- State: open
- Created 5 years ago
- Comments: 19
@anandr781 I just wanted to say thank you for this hint. If anybody arrives at this thread via a Google search, here’s what I did as a workaround to get a single-node K8S cluster up and running and then connect it to Rancher:
curl https://releases.rancher.com/install-docker/19.03.sh | sudo bash
prior to RKE install.
Result: Rancher 2.6.3 is working with this K8S cluster.
I just got bitten by this while following the manual/quick start instructions from Rancher documentation.
As per the documentation, I am running Rancher (2.6.3) using Docker on a quite nice machine (10 cores, 64 GB RAM, SSD, etc.). I provisioned an EC2 node on AWS and ran the command generated by Rancher’s custom cluster wizard.
The logs for the cluster report success as
[INFO ] Finished building Kubernetes cluster successfully
The cluster’s state is stuck in
Waiting for API to be available
Is there any way I can work around this while staying with the approach I followed? This is the scenario that helps me most: getting a cluster up and running quickly so that we can experiment with it in the company. I suspect many others like me will go for the lowest-hanging fruit. A workaround would help a lot.