cluster-api-provider-aws: Machine deletion fails if there is no ELB

/kind bug

What steps did you take and what happened: I deleted the AWS resources manually and then initiated machine and cluster deletion from CAPA. As a result, the machine deployment deletions failed continuously with:

I0814 17:10:39.243788       1 instances.go:67] [machine-actuator]/cluster.k8s.io/v1alpha1/4d95faba9cb7ee388671ac3cef6ee79b39c25f15/bf038fa5/worker-bf038fa5-nodepool-bf038fa5default-6c86565b8rvqdh "level"=2 "msg"="Looking for existing machine instance by tags"  
I0814 17:10:39.288157       1 machine_controller.go:181] Deleting node "ip-10-0-0-20.us-west-2.compute.internal" for machine "worker-bf038fa5-nodepool-bf038fa5default-6c86565b8rvqdh"
E0814 17:10:39.301721       1 machine_controller.go:183] Error deleting node "ip-10-0-0-20.us-west-2.compute.internal" for machine "worker-bf038fa5-nodepool-bf038fa5default-6c86565b8rvqdh": Delete https://bf038fa5-apiserver-261840232.us-west-2.elb.amazonaws.com:6443/api/v1/nodes/ip-10-0-0-20.us-west-2.compute.internal: dial tcp: lookup bf038fa5-apiserver-261840232.us-west-2.elb.amazonaws.com on 10.96.0.10:53: no such host

What did you expect to happen: I expected the deletions to succeed, since the underlying AWS resources are NOT FOUND.

Anything else you would like to add: If I delete the ELB, it could be recreated, unless the Cluster itself is being deleted. That said, recreation would also mean a different DNS name, which implies that anything referencing the old DNS name would need to be updated, and that may not happen automatically. That includes the kubeconfig secret, but more importantly the client config/kubeadm config for all of the existing Machines in the cluster.

Environment:

  • Cluster-api-provider-aws version: v0.3.7
  • Kubernetes version: (use kubectl version): 1.14.1
  • OS (e.g. from /etc/os-release):

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 20 (17 by maintainers)

Most upvoted comments

Today probably not, but when we have control plane management, we'd need to ensure that we handle the etcd membership properly on deletion of a control-plane node, which, based on the model we are using with cluster-api-upgrade-tool, would require apiserver access.