rancher: Deleting Amazon EKS cluster in Rancher 2.6.3 doesn't work
Rancher Server Setup
- Rancher version: 2.6.3
- Installation option (Docker install/Helm Chart): Helm Chart
- Kubernetes Cluster and version: Amazon EKS / 1.21
- Proxy/Cert Details: cert-manager with Let’s Encrypt staging certificate
Information about the Cluster
- Kubernetes version: 1.21
- Cluster Type: New Amazon EKS cluster provisioned by new Rancher
User Information
- What is the role of the user logged in?: Admin/Cluster
Describe the bug
If I delete the newly provisioned Amazon EKS cluster, it is not deleted, and I can see the following errors in the Rancher pod logs:
$ stern -n cattle-fleet-local-system -n cattle-fleet-system -n cattle-system --tail 1 .
...
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] Stopping cluster agent for c-7st6l
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] Shutting down rbac.authorization.k8s.io/v1, Kind=RoleBinding workers
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] Shutting down rbac.authorization.k8s.io/v1, Kind=Role workers
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] Shutting down /v1, Kind=ResourceQuota workers
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] Shutting down /v1, Kind=Namespace workers
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] Shutting down /v1, Kind=ServiceAccount workers
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] Shutting down rbac.authorization.k8s.io/v1, Kind=ClusterRoleBinding workers
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] Shutting down /v1, Kind=LimitRange workers
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] Shutting down rbac.authorization.k8s.io/v1, Kind=ClusterRole workers
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] Shutting down /v1, Kind=Secret workers
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] Shutting down apiregistration.k8s.io/v1, Kind=APIService workers
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] Shutting down /v1, Kind=Node workers
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] Shutting down /v1, Kind=ConfigMap workers
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] Shutting down /v1, Kind=ConfigMap workers
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] [mgmt-auth-users-controller] Deleting clusterRoleTemplateBinding u-lb2u2o7x6j-admin for user u-lb2u2o7x6j
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] [mgmt-auth-crtb-controller] Deleting roleBinding crb-avoo7cnip5
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] [mgmt-auth-users-controller] Deleting globalRoleBinding grb-49gdg for user u-lb2u2o7x6j
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] [mgmt-auth-crtb-controller] Deleting rolebinding u-lb2u2o7x6j-admin-cluster-owner in namespace p-ckdhf for crtb u-lb2u2o7x6j-admin
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] [mgmt-auth-crtb-controller] Deleting rolebinding u-lb2u2o7x6j-admin-cluster-owner in namespace p-2rs4z for crtb u-lb2u2o7x6j-admin
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:31 [INFO] [mgmt-cluster-rbac-remove] Deleting namespace c-7st6l
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:29:32Z" level=info msg="deleting cluster [c-7st6l]"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:29:32Z" level=info msg="starting node group deletion for config [ruzickap-eks]"
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:32 [ERROR] error syncing 'c-7st6l': handler mgmt-cluster-rbac-remove: clusters.management.cattle.io "c-7st6l" not found, requeuing
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:32 [INFO] [mgmt-auth-users-controller] Deleting clusterRoleTemplateBinding c-7st6l-fleet-default-owner for user u-ydnafvbzey
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:32 [INFO] [mgmt-auth-users-controller] Deleting token u-ydnafvbzey for user u-ydnafvbzey
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:32 [INFO] [mgmt-auth-crtb-controller] Deleting roleBinding crb-laonr7f4r3
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:32 [INFO] [mgmt-auth-crtb-controller] Deleting rolebinding c-7st6l-fleet-default-owner-cluster-owner in namespace p-ckdhf for crtb c-7st6l-fleet-default-owner
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:32 [INFO] [mgmt-auth-crtb-controller] Deleting rolebinding c-7st6l-fleet-default-owner-cluster-owner in namespace p-2rs4z for crtb c-7st6l-fleet-default-owner
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:32 [ERROR] error syncing 'fleet-default/c-7st6l': handler cluster-create: failed to delete c-7st6l/c-7st6l-fleet-default-owner management.cattle.io/v3, Kind=ClusterRoleTemplateBinding for cluster-create fleet-default/c-7st6l: clusterroletemplatebindings.management.cattle.io "c-7st6l-fleet-default-owner" not found, requeuing
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:37 [INFO] [mgmt-auth-crtb-controller] Deleting roleBinding crb-f5nqk6o5bj
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:37 [INFO] [mgmt-auth-crtb-controller] Deleting rolebinding creator-cluster-owner-cluster-owner in namespace p-2rs4z for crtb creator-cluster-owner
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:37 [INFO] [mgmt-auth-crtb-controller] Deleting rolebinding creator-cluster-owner-cluster-owner in namespace p-ckdhf for crtb creator-cluster-owner
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:37 [INFO] [mgmt-project-rbac-remove] Deleting namespace p-ckdhf
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:37 [INFO] [mgmt-project-rbac-remove] Deleting namespace p-2rs4z
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:29:42Z" level=info msg="waiting for config [c-7st6l] node groups to delete"
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:43 [INFO] [mgmt-auth-prtb-controller] Updating owner label for roleBinding crb-7dctyvj4qz
cattle-system rancher-6799f8f646-x4fs7 rancher 2022/02/02 07:29:43 [INFO] [mgmt-auth-prtb-controller] Deleting roleBinding crb-7dctyvj4qz
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:29:53Z" level=info msg="waiting for config [c-7st6l] node groups to delete"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:30:03Z" level=info msg="waiting for config [c-7st6l] node groups to delete"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:30:13Z" level=info msg="waiting for config [c-7st6l] node groups to delete"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:30:23Z" level=info msg="waiting for config [c-7st6l] node groups to delete"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:30:33Z" level=info msg="waiting for config [c-7st6l] node groups to delete"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:30:44Z" level=info msg="waiting for config [c-7st6l] node groups to delete"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:30:54Z" level=info msg="waiting for config [c-7st6l] node groups to delete"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:31:04Z" level=info msg="waiting for config [c-7st6l] node groups to delete"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:31:15Z" level=info msg="waiting for config [c-7st6l] node groups to delete"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:31:25Z" level=info msg="waiting for config [c-7st6l] node groups to delete"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:31:35Z" level=info msg="waiting for config [c-7st6l] node groups to delete"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:31:45Z" level=info msg="waiting for config [c-7st6l] node groups to delete"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:31:45Z" level=error msg="error syncing 'cattle-global-data/c-7st6l': handler eks-controller-remove: error deleting nodegroups for config [ruzickap-eks], requeuing"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:31:45Z" level=info msg="deleting cluster [c-7st6l]"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:31:45Z" level=info msg="starting node group deletion for config [ruzickap-eks]"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:31:45Z" level=error msg="error syncing 'cattle-global-data/c-7st6l': handler eks-controller-remove: error deleting nodegroups for config [ruzickap-eks], requeuing"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:31:45Z" level=info msg="deleting cluster [c-7st6l]"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:31:45Z" level=info msg="starting node group deletion for config [ruzickap-eks]"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:31:45Z" level=error msg="error syncing 'cattle-global-data/c-7st6l': handler eks-controller-remove: error deleting nodegroups for config [ruzickap-eks], requeuing"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:31:45Z" level=info msg="deleting cluster [c-7st6l]"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:31:45Z" level=info msg="starting node group deletion for config [ruzickap-eks]"
cattle-system eks-config-operator-6bfb5548b7-8gtf4 eks-operator time="2022-02-02T07:31:45Z" level=error msg="error syncing 'cattle-global-data/c-7st6l': handler eks-controller-remove: error deleting nodegroups for config [ruzickap-eks], requeuing"
...
<last 3 lines `error deleting nodegroups for config` repeats all the time>
...
In the AWS console I can see that only the node groups were deleted successfully; nothing else was removed 😦
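To make the retry loop in the log above easier to spot, the following minimal Python sketch counts how often each handler requeues the same object. The function name `count_requeues` and the regular expression are assumptions derived only from the log lines quoted in this report, not from any documented Rancher or eks-operator log format.

```python
import re
from collections import Counter

# Pattern inferred from the lines in this report (an assumption, not a spec):
#   error syncing '<key>': handler <handler>: <detail>, requeuing
REQUEUE_RE = re.compile(
    r"error syncing '(?P<key>[^']+)': handler (?P<handler>[\w-]+):.*requeuing"
)


def count_requeues(lines):
    """Count requeue errors per (handler, object key) pair."""
    counts = Counter()
    for line in lines:
        match = REQUEUE_RE.search(line)
        if match:
            counts[(match.group("handler"), match.group("key"))] += 1
    return counts


# A few lines copied from the log in this report.
SAMPLE = """\
2022/02/02 07:29:32 [ERROR] error syncing 'c-7st6l': handler mgmt-cluster-rbac-remove: clusters.management.cattle.io "c-7st6l" not found, requeuing
time="2022-02-02T07:31:45Z" level=info msg="deleting cluster [c-7st6l]"
time="2022-02-02T07:31:45Z" level=error msg="error syncing 'cattle-global-data/c-7st6l': handler eks-controller-remove: error deleting nodegroups for config [ruzickap-eks], requeuing"
time="2022-02-02T07:31:45Z" level=error msg="error syncing 'cattle-global-data/c-7st6l': handler eks-controller-remove: error deleting nodegroups for config [ruzickap-eks], requeuing"
"""

if __name__ == "__main__":
    for (handler, key), n in count_requeues(SAMPLE.splitlines()).most_common():
        print(f"{n:3d}  {handler}  {key}")
```

A handler whose count keeps growing for the same key, as `eks-controller-remove` does here, is stuck retrying rather than making progress.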
To Reproduce
I’m using Terraform to create and destroy the cluster; the details are described here: https://github.com/rancher/terraform-provider-rancher2/issues/858
In short, this is a Terraform cluster definition that can be created but not deleted properly:
resource "rancher2_cluster" "cluster" {
  name = "test-eks"
  eks_config_v2 {
    cloud_credential_id = rancher2_cloud_credential.rancher_aws_account.id
    kubernetes_version  = "1.21"
    name                = "test-eks"
    private_access      = true
    public_access       = true
    region              = "eu-west-2"
    node_groups {
      instance_type = "t2.medium"
      desired_size  = 2
      max_size      = 2
      min_size      = 2
      name          = "test-eks-ng"
    }
  }
}
Result
An Amazon EKS cluster provisioned by Rancher cannot be deleted successfully.
Expected Result
It should be possible to create / delete EKS clusters…
About this issue
- State: closed
- Created 2 years ago
- Comments: 15 (6 by maintainers)
This is definitely a bug. It does not properly manage the resources that it creates.
This is right…
But I would still call this a “bug”, because Terraform should always “wait” for an object to be deleted; otherwise it is an issue/bug.
In this case Rancher should provide a “feature” that allows Terraform to query the state of the cluster deletion…
You are right, but it is not clear where the problem is.
When I deleted the Amazon EKS cluster from Rancher manually using the GUI, it “immediately” disappeared from Rancher while the deletion process kept running in the “background” (it usually takes about 10 minutes).
The same happens with Terraform: Terraform reported the cluster as deleted after 4 seconds, and the deletion process again continued in the background.
The question is where the problem lies: in the Rancher API or in the Terraform module.
The Terraform module should wait until the cluster is really deleted, but I’m not sure whether the Rancher API/GUI is written to “wait” for it…
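The “wait until it is really deleted” behaviour discussed above can be sketched as a plain polling loop. This is only an illustration in Python, not how Rancher or the Terraform provider actually works: `wait_for_deletion` and the `cluster_exists` callable are hypothetical names, and the real existence check (for example a lookup against the Rancher or AWS API) is deliberately left out.

```python
import time
from typing import Callable


def wait_for_deletion(
    cluster_exists: Callable[[], bool],
    timeout_s: float = 900.0,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until cluster_exists() returns False or the timeout expires.

    cluster_exists is a hypothetical callable supplied by the caller.
    Returns True if the cluster disappeared in time, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if not cluster_exists():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated backend: the cluster "exists" for the first three polls.
    polls = iter([True, True, True, False])
    gone = wait_for_deletion(lambda: next(polls), timeout_s=5, interval_s=0.01)
    print("deleted" if gone else "timed out")
```

The default timeout of 900 seconds is an assumption chosen to sit above the roughly 10 minutes mentioned above. A caller that gets `False` back should report a failed deletion instead of silently dropping the resource from its state.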