rancher: [2.6] Unable to remove cluster

Rancher Server Setup

  • Rancher version: 2.6.0
  • Installation option (Docker install/Helm Chart): Helm chart
    • If Helm Chart, Kubernetes Cluster and version (RKE1, RKE2, k3s, EKS, etc): RKE1
  • Proxy/Cert Details:

Information about the Cluster

  • Kubernetes version: v1.20.0
  • Cluster Type (Local/Downstream): Downstream
    • If downstream, what type of cluster? (Custom/Imported or specify provider for Hosted/Infrastructure Provider): vSphere

Describe the bug

I deleted a downstream cluster. The node was removed from vSphere, but the cluster is now stuck in the removing state in the UI.

To Reproduce

~I’m not sure how to reproduce this as I have not tried it again.~ It is always reproducible. Simply create a vSphere cluster and, once it’s active, delete it. It’ll be stuck forever!

Result

2021/09/05 05:26:50 [INFO] Stopping cluster agent for c-8lh48
2021/09/05 05:26:50 [ERROR] error syncing 'c-5dbcl': handler restrictedAdminsRBACCluster: Index with name by-cluster does not exist, requeuing

Expected Result

The cluster should be removed from the UI

Screenshots

Additional context

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 11
  • Comments: 37 (13 by maintainers)

Most upvoted comments

Here’s how to manually delete a cluster. Thanks Brad.

kubectl get clusters.management.cattle.io  # find the cluster you want to delete
export CLUSTERID="c-xxxxxxxxx"  # replace with the ID of that cluster
kubectl patch clusters.management.cattle.io "$CLUSTERID" -p '{"metadata":{"finalizers":[]}}' --type=merge
kubectl delete clusters.management.cattle.io "$CLUSTERID"

The issue still persists in Rancher 2.6.5. As a former scientist, I always wonder why the QC is so poor for so many IT products, and why they are so anti-user. The new 2.6.* GUI is horrendous compared to 2.5.*.

After running kubectl patch clusters.management.cattle.io $CLUSTERID -p '{"metadata":{"finalizers":[]}}' --type=merge, the cluster no longer shows under kubectl get clusters.management.cattle.io, and kubectl delete clusters.management.cattle.io $CLUSTERID can’t find it.

The cluster still shows under Cluster Management in the UI, with its name replaced by the cluster ID.

Had the same problem. I was able to fix it by running these commands:

kubectl -n fleet-default get clusters.provisioning.cattle.io # find the name of the cluster you want to delete
kubectl -n fleet-default patch clusters.provisioning.cattle.io <cluster_name> -p '{"metadata":{"finalizers":[]}}' --type=merge
kubectl -n fleet-default delete clusters.provisioning.cattle.io <cluster_name>

And then I had to manually uninstall Rancher from the machines.
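
For anyone scripting this, the patch-then-delete sequence can be wrapped in a small shell helper. This is only a sketch: force_remove_cluster and its variable names are made up for illustration and are not part of Rancher or kubectl. It assumes you have already confirmed the downstream cluster is really gone. The --ignore-not-found flag is there because an object that is already being deleted disappears as soon as its finalizers are cleared, so the follow-up delete may have nothing left to find.

```shell
# Hypothetical helper: clear all finalizers on a Rancher cluster object,
# then delete it. Pass a namespace as the third argument for namespaced
# resources (e.g. fleet-default); omit it for cluster-scoped ones.
force_remove_cluster() {
  fr_resource="$1"   # e.g. clusters.management.cattle.io
  fr_name="$2"       # e.g. c-xxxxxxxxx or the provisioning cluster name
  fr_ns="${3:-}"     # optional namespace

  # Reuse the positional parameters to hold the optional "-n <namespace>".
  if [ -n "$fr_ns" ]; then
    set -- -n "$fr_ns"
  else
    set --
  fi

  # Same merge patch as in the commands above: empty the finalizer list.
  kubectl "$@" patch "$fr_resource" "$fr_name" --type=merge \
    -p '{"metadata":{"finalizers":[]}}' || return 1

  # The object may already be gone if a delete was pending; that is fine.
  kubectl "$@" delete "$fr_resource" "$fr_name" --ignore-not-found
}
```

Example calls would be force_remove_cluster clusters.management.cattle.io "$CLUSTERID" for the management object, or force_remove_cluster clusters.provisioning.cattle.io <cluster_name> fleet-default for the provisioning object.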

In reply to the report that the cluster still shows in the UI after patching the finalizers: the clusters.management.cattle.io steps above were assuming that you’d already tried deleting the cluster from the UI, and it was stuck waiting on the finalizers. Actually deleting it at some point (either before or after removing the finalizers) is definitely a requirement.
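
To check which clusters are actually in that state (delete requested, but still held by finalizers), one option is to list the management clusters that carry a deletionTimestamp. The sketch below is an illustration only; the function names only_terminating and list_terminating_clusters are invented here, and it relies on kubectl printing an empty string for a missing deletionTimestamp, which is its default behaviour for jsonpath output.

```shell
# Read "name<TAB>deletionTimestamp" lines on stdin and print only the
# names whose timestamp column is non-empty.
only_terminating() {
  awk -F '\t' '$2 != "" { print $1 }'
}

# Hypothetical wrapper: print every clusters.management.cattle.io object
# that has been marked for deletion but still exists.
list_terminating_clusters() {
  kubectl get clusters.management.cattle.io \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.deletionTimestamp}{"\n"}{end}' \
    | only_terminating
}
```

A cluster that does not appear in this list has not been deleted yet, so clearing its finalizers alone will not remove it.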

@thedadams For 2 above, I removed the controller.cattle.io/cluster-agent-controller-cleanup finalizer on clusters.management.cattle.io, and then the cluster was completely removed.
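
Removing one named finalizer, rather than emptying the whole list, can be sketched like this. The function names build_finalizer_patch and remove_one_finalizer are hypothetical. The approach is an assumption on my part, not something confirmed in this thread: read the current finalizers, drop the unwanted one, and send the remainder back as a merge patch. There is a small window between the read and the patch in which a controller could change the list.

```shell
# Read finalizer names on stdin (one per line) and print a merge patch
# that keeps all of them except the one passed as the first argument.
build_finalizer_patch() {
  bp_drop="$1"
  bp_kept=""
  while IFS= read -r bp_f; do
    [ -z "$bp_f" ] && continue
    [ "$bp_f" = "$bp_drop" ] && continue
    # Append as a quoted JSON string, comma-separated.
    bp_kept="${bp_kept:+$bp_kept,}\"$bp_f\""
  done
  printf '{"metadata":{"finalizers":[%s]}}\n' "$bp_kept"
}

# Hypothetical wrapper: drop a single finalizer from a management cluster.
remove_one_finalizer() {
  ro_id="$1"      # cluster ID, e.g. c-xxxxxxxxx
  ro_drop="$2"    # finalizer name to remove
  ro_patch=$(kubectl get clusters.management.cattle.io "$ro_id" \
    -o jsonpath='{range .metadata.finalizers[*]}{@}{"\n"}{end}' \
    | build_finalizer_patch "$ro_drop")
  kubectl patch clusters.management.cattle.io "$ro_id" --type=merge -p "$ro_patch"
}
```

For the case described above, the call would be remove_one_finalizer "$CLUSTERID" controller.cattle.io/cluster-agent-controller-cleanup.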

I am having a similar problem. My old cluster was called vetpro-prod2 and had an ID of c-c16h4.

I first tried these commands:

kubectl get clusters.management.cattle.io  # find the cluster you want to delete 
export CLUSTERID="c-xxxxxxxxx"
kubectl patch clusters.management.cattle.io $CLUSTERID -p '{"metadata":{"finalizers":[]}}' --type=merge
kubectl delete clusters.management.cattle.io $CLUSTERID

However, the export CLUSTERID command returned no value.

So I tried these commands:

kubectl -n fleet-default get clusters.provisioning.cattle.io # find the name of the cluster you want to delete
kubectl -n fleet-default patch clusters.provisioning.cattle.io <cluster_name> -p '{"metadata":{"finalizers":[]}}' --type=merge
kubectl -n fleet-default delete clusters.provisioning.cattle.io <cluster_name>

That worked, and the cluster no longer shows up in the Cluster Manager (the old cluster was called vetpro-prod2).

HOWEVER, it does still show up in the explorer.

In the docker logs I am seeing many of these errors:

2022/02/14 14:31:28 [ERROR] error syncing 'c-cl6h4/machine-99r5x': handler node-controller: Delete "https://D77xxxxxxxxxxxxx.yl4.us-west-2.eks.amazonaws.com/api/v1/nodes/ip-10-50-98-31.us-west-2.compute.internal?timeout=45s": dial tcp: lookup D77xxxxxxxxxxxx.us-west-2.eks.amazonaws.com on 10.50.0.2:53: no such host, requeuing
2022/02/14 14:43:38 [ERROR] Failed to handling tunnel request from 10.50.47.153:42338: response 400: cluster not found

I am running 2.6.1 in a single docker container on ubuntu. My clusters are all EKS.

We’re also hitting this issue with 2.6.2, which prevents us from migrating to Rancher 2.6.x.

I am also having the same issue with 2.6.2. A previously deleted cluster still shows under Advanced -> Mgmt Clusters.

Now the UI will not let me delete the cluster because delete is not supported.
