cluster-api-provider-aws: Fails to create one of three NAT gateways

/kind bug

What steps did you take and what happened: Created a 3-AZ cluster as part of the conformance tests. Three NAT gateways were expected, but the AWSCluster events report four:

  Normal  SuccessfulCreateNATGateway        7m25s  aws-controller  Created new NAT Gateway "nat-0559e53ef0a233357"
  Normal  SuccessfulCreateNATGateway        5m54s  aws-controller  Created new NAT Gateway "nat-09baa58ba673f971d"
  Normal  SuccessfulCreateNATGateway        2m37s  aws-controller  Created new NAT Gateway "nat-080cd55b42083d439"
  Normal  SuccessfulCreateNATGateway        50s    aws-controller  Created new NAT Gateway "nat-000c63fc65239c676"

The VPC console shows a failed NAT gateway with the error “Elastic IP address [eipalloc-0bf842e00beae91a4] is already associated”

CAPA itself doesn’t “notice”, because it only checks that a NAT gateway exists, not whether it is in a Failed state:

I0618 19:06:09.751959       1 natgateways.go:190] controllers/AWSCluster "msg"="NAT gateway for subnet is now available" "awsCluster"="cluster-ohy1cth5s64jdrgefucu" "cluster"="cluster-ohy1cth5s64jdrgefucu" "namespace"="conformance-tests-hlpse1" "nat-gateway-id"="nat-0559e53ef0a233357" "subnet-id"="subnet-04660efa0e619433a"
I0618 19:11:13.494754       1 natgateways.go:190] controllers/AWSCluster "msg"="NAT gateway for subnet is now available" "awsCluster"="cluster-ohy1cth5s64jdrgefucu" "cluster"="cluster-ohy1cth5s64jdrgefucu" "namespace"="conformance-tests-hlpse1" "nat-gateway-id"="nat-080cd55b42083d439" "subnet-id"="subnet-0eccbea0870a3758d"
I0618 19:13:14.984888       1 natgateways.go:190] controllers/AWSCluster "msg"="NAT gateway for subnet is now available" "awsCluster"="cluster-ohy1cth5s64jdrgefucu" "cluster"="cluster-ohy1cth5s64jdrgefucu" "namespace"="conformance-tests-hlpse1" "nat-gateway-id"="nat-000c63fc65239c676" "subnet-id"="subnet-09fbbe2fc71eb29d3"
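
To illustrate the missing check, here is a minimal sketch in Go of sorting NAT gateways by state so that a failed one is surfaced instead of being silently ignored. The `NATGateway` type and `classifyGateways` function are hypothetical stand-ins written for this example and are not the CAPA or AWS SDK types; the state strings are the ones EC2 reports for NAT gateways, and the IDs are illustrative.

```go
package main

import "fmt"

// NATGateway is a hypothetical, trimmed-down stand-in for the EC2 NAT gateway
// description. It is not the CAPA or AWS SDK type.
type NATGateway struct {
	ID       string
	SubnetID string
	State    string // EC2 states: pending, failed, available, deleting, deleted
}

// classifyGateways splits gateways into those that can serve a subnet
// (pending or available, keyed by subnet ID) and those that failed.
// Gateways that are deleting or deleted are ignored.
func classifyGateways(gws []NATGateway) (map[string]NATGateway, []NATGateway) {
	usable := make(map[string]NATGateway)
	var failed []NATGateway
	for _, gw := range gws {
		switch gw.State {
		case "available", "pending":
			usable[gw.SubnetID] = gw
		case "failed":
			failed = append(failed, gw)
		}
	}
	return usable, failed
}

func main() {
	// Illustrative IDs only.
	gws := []NATGateway{
		{ID: "nat-a", SubnetID: "subnet-a", State: "available"},
		{ID: "nat-b", SubnetID: "subnet-b", State: "failed"},
		{ID: "nat-c", SubnetID: "subnet-b", State: "pending"},
	}
	usable, failed := classifyGateways(gws)
	for _, gw := range failed {
		fmt.Printf("NAT gateway %s in subnet %s failed\n", gw.ID, gw.SubnetID)
	}
	fmt.Printf("%d subnets have a usable NAT gateway\n", len(usable))
}
```

With a split like this, a controller could emit a warning event for each failed gateway rather than only logging the ones that became available.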

What did you expect to happen:

Anything else you would like to add:

Environment:

  • Cluster-api-provider-aws version: c718d8e8aaf2b5634cf7b233834818ff07d9d613
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 15 (15 by maintainers)

Most upvoted comments

This morning I had an idea for how to refactor the NAT gateway creation call with minimal effort and change to the code. Will investigate.

While the proof-of-concept fix above works, tagging the EIPs with a subnet ID is not very elegant.
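
For readers unfamiliar with the idea, the following is a rough sketch in Go of what looking up an EIP by a subnet tag could look like. It is an assumption about the general shape of such an approach, not the actual proof-of-concept code: the `ElasticIP` type, the `findEIPForSubnet` function, and the tag key `example/subnet-id` are all hypothetical.

```go
package main

import "fmt"

// subnetTagKey is a hypothetical tag key used only for this sketch.
const subnetTagKey = "example/subnet-id"

// ElasticIP is a hypothetical stand-in for an EC2 address description.
type ElasticIP struct {
	AllocationID string
	Tags         map[string]string
}

// findEIPForSubnet returns the EIP whose subnet tag matches subnetID, so a
// retried NAT gateway creation reuses the address reserved for that subnet
// instead of one already associated with another gateway.
func findEIPForSubnet(eips []ElasticIP, subnetID string) (ElasticIP, bool) {
	for _, eip := range eips {
		if eip.Tags[subnetTagKey] == subnetID {
			return eip, true
		}
	}
	return ElasticIP{}, false
}

func main() {
	// Illustrative IDs only.
	eips := []ElasticIP{
		{AllocationID: "eipalloc-a", Tags: map[string]string{subnetTagKey: "subnet-a"}},
		{AllocationID: "eipalloc-b", Tags: map[string]string{subnetTagKey: "subnet-b"}},
	}
	if eip, ok := findEIPForSubnet(eips, "subnet-b"); ok {
		fmt.Println("reuse", eip.AllocationID, "for subnet-b")
	}
}
```

The drawback noted in the comment is visible here: the EIP carries knowledge of a subnet purely through a tag, which couples two resources that AWS itself does not relate.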