cluster-api-provider-aws: SubnetSpec changes in `v1beta2` break existing use-cases

/kind bug /kind api-change

The changes introduced in https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/3748 (aimed at solving server-side apply issues with ClusterClass) break existing use-cases around subnets created by CAPA.

The change makes the ID property on SubnetSpec a required field, where previously it was optional. As a result, subnets for CAPA clusters can only be configured if the user creates the infrastructure themselves. CAPA is left supporting either the default subnet layout or a bring-your-own network, where the user must create everything (VPC, subnets, etc.) externally before creating the AWSCluster CR.
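
For illustration, a bring-your-own network under v1beta2 might look roughly like the sketch below. The resource name and the VPC and subnet IDs are hypothetical placeholders, and the resources they point to are assumed to have been created outside CAPA beforehand.

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: AWSCluster
metadata:
  name: example-cluster                # hypothetical name
spec:
  region: eu-west-1
  network:
    vpc:
      id: vpc-0123456789abcdef0        # hypothetical, created outside CAPA
    subnets:
      # id is required on every entry, so each one must point at an existing subnet
      - id: subnet-0aaaaaaaaaaaaaaa1   # hypothetical
      - id: subnet-0bbbbbbbbbbbbbbb2   # hypothetical
```

In this form every subnet already exists, so there is nothing left for CAPA to create from fields such as cidrBlock or tags.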

With v1beta1 and earlier it was possible to specify some of the subnet configuration (such as the CIDR block and the AWS tags) but still have CAPA create the resources for you.

For example, we have a situation where we need more control over what resources end up in what subnets and rely on AWS tags and subnet filters to achieve this.

spec:
  network:
    subnets:
      - cidrBlock: 10.0.0.0/24
        availabilityZone: eu-west-1a
        tags:
          subnet-role: control-plane
      - cidrBlock: 10.0.1.0/24
        availabilityZone: eu-west-1b
        tags:
          subnet-role: control-plane
... etc ...

With this approach we can configure CAPA to create all the subnets we need, with the tags we can then filter on, without having to use an external process to create these resources. This is no longer possible with v1beta2.

The PR acknowledged that this was a breaking change and that we’d need to come up with an alternative going forward, but unfortunately that was forgotten about. I’m opening this issue to have that discussion and try to find a solution that covers both the original issue the PR was fixing and the existing use cases that require configuring subnets in CAPA.

This also fits into the wider discussion around networking in CAPA and the improvements / changes desired.

Related:

Environment:

  • Cluster-api-provider-aws version: v2.0.2
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 5
  • Comments: 23 (17 by maintainers)

Most upvoted comments

I wouldn’t go as far as to completely rule out user input, but I sure as heck would recommend not touching unmanaged things AT ALL. There are only problems stemming from that route. I even created a proposal to remove setting any kind of tags on unmanaged content. We document everything that we require, then the user can set them up as they like. There are at least 3 issues because of how CAPA tries to juggle various tags on unmanaged entities.

I agree with this. If the user is providing their own infrastructure then CAPA should be “read-only”. But I do still think (and think you agree, if I understood correctly) that when CAPA is managing the infra it should be possible to configure it to some degree. My example of this is being able to set the CIDRs and AWS tags used for subnets.

Just a thought to add to the pile. It’s my opinion that CAPA should only offer two methods to get a cluster…

  1. You are new to this, have an AWS secret/key, and want a kube cluster. This is great for small/medium sized orgs that are OK having 1 VPC per cluster and will likely never reach the 5 VPC soft limit Amazon sets on new accounts any time soon. In this scenario CAPA maintains the entire infra lifecycle.
  2. You know everything and want to tell CAPA exactly what it should use. This is great for enterprise orgs that have different teams managing the different pieces (networking team, storage team) and that will need to be told where/when their cluster/nodes can exist. In this scenario CAPA manages nothing if IDs are provided…

I don’t feel there should be a middle ground where CAPA acts on its own to try and reconcile infrastructure based on partial user input. This will lead to bloating the CAPA API just to avoid using an existing tool specialized in solving this problem. To help people graduate from CAPA maintaining all infra to specifying their own, the CAPA repo could maintain a sample Terraform module for migrating off the default CAPA infra.

@Skarlso the idea of adding subnetMutation won’t stop there. There will always be something else to mutate, and the code will fill up with if/else blocks trying to figure out what the user wants; when CAPA assumes wrong, users will just come complaining. CAPA is a consumer of AWS infrastructure and shouldn’t be in the business of creating/maintaining it, but I can concede that method 1 above lowers the barrier to entry for new users and should be maintained.

/cc @anmolsachan @joeunlog as you both either commented on the PR or raised an issue about this problem

We are currently investigating the use of CAPA and this is a huge blocker. The possibility that it can alter resources it doesn’t manage with the permission set that is currently required is horrifying.