cluster-api-provider-azure: Enhancement Proposal: Break out cloud resource reconciliation

⚠️ Cluster API Azure maintainers can ask to turn an issue-proposal into a CAEP when necessary. This is to be expected for large changes that impact multiple components, breaking changes, or new large features.

Goals

  1. Separate the concern of cloud resource reconciliation from the transformation of Cluster API custom resource specs into cloud resource specs
  2. Create a parallel reconciliation path in CAPZ that uses an Azure cloud resource controller / reconciler with Azure CRDs for the cloud resources Cluster-API resources require, rather than direct calls to Azure from within CAPZ.

Non-Goals/Future Work

  1. Future work: possibly rely solely on Azure resource CRDs for reconciliation rather than maintaining parallel implementations.

User Story

As a CAPZ developer I would like to interact only with the API server and Custom Resource Definitions of Azure types rather than dealing with the impedance mismatch of calling the Azure APIs directly. This would enable me to focus on the core value of CAPZ: transforming Cluster-API specs into Azure resource specs.

Detailed Description

CAPZ currently has 2 major concerns.

  1. Transforming Cluster-API types to Azure types
  2. Imperatively interacting with the Azure APIs to create / manipulate / delete the Azure resources

I think it would help speed development in CAPZ if these two concerns were broken out into separate controllers / reconcilers.

  1. CAPZ: responsible for translating from Cluster-API CRDs to Azure CRDs
  2. YTBNC (yet to be named controller): responsible for defining Azure CRDs and reconciling them with Azure

With this enhancement, CAPZ could be verified to produce the correct Azure resource specifications for a given input without taking a dependency on the Azure API. The vast majority of verification could be done via the API server, checking whether a given input (a Cluster API model) is translated into the proper Azure resource specifications, as sketched below. At the same time, the concern of reconciling Azure resources could be isolated and tested within its own component.
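For illustration, here is a rough sketch of what such an API-server-only verification could look like, using controller-runtime's fake client. The resources.ResourceGroup type, the reconcileAzureCluster entry point, and the testAzureCluster helper are hypothetical placeholders for the YTBNC types and the CAPZ translation logic, not existing code.

    // Sketch only: resources.ResourceGroup, reconcileAzureCluster, and
    // testAzureCluster are hypothetical placeholders, not existing code.
    //
    // Assumed imports: "context", "testing",
    // "k8s.io/apimachinery/pkg/runtime", "k8s.io/apimachinery/pkg/types",
    // "sigs.k8s.io/controller-runtime/pkg/client/fake".
    func TestAzureClusterProducesResourceGroup(t *testing.T) {
        scheme := runtime.NewScheme()
        _ = infrav1.AddToScheme(scheme)
        _ = resources.AddToScheme(scheme) // hypothetical YTBNC API group

        c := fake.NewClientBuilder().WithScheme(scheme).Build()

        // Translate a Cluster API spec into Azure custom resources; no
        // Azure credentials or network access are needed for this test.
        if err := reconcileAzureCluster(context.Background(), c, testAzureCluster()); err != nil {
            t.Fatal(err)
        }

        var rg resources.ResourceGroup
        key := types.NamespacedName{Namespace: "default", Name: "my-cluster-rg"}
        if err := c.Get(context.Background(), key, &rg); err != nil {
            t.Fatalf("expected a ResourceGroup to be created: %v", err)
        }
        if rg.Spec.Location != "westus2" {
            t.Errorf("unexpected location %q", rg.Spec.Location)
        }
    }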

YTBNC (yet to be named controller)

Should have the following characteristics:

  • Define versioned CRDs which describe Azure resources across Azure API versions (a minimal sketch follows this list)
  • Be able to reconcile Azure resources and provide status changes / conditions
  • Resources should be able to be applied in any order and converge – goal-seeking behavior
    • All of the ordering logic is currently built into CAPZ and is somewhat brittle
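To make that concrete, a minimal sketch of such a CRD type might look like the following; the API group, field names, and condition semantics are assumptions rather than a committed design.

    // Hypothetical YTBNC CRD type; names are illustrative only.
    // Assumed import: metav1 "k8s.io/apimachinery/pkg/apis/meta/v1".
    //
    // ResourceGroupSpec declares the desired state of an Azure resource group.
    type ResourceGroupSpec struct {
        Location string            `json:"location"`
        Tags     map[string]string `json:"tags,omitempty"`
    }

    // ResourceGroupStatus reports observed state back to consumers like CAPZ.
    type ResourceGroupStatus struct {
        // ProvisioningState mirrors the state reported by the Azure API.
        ProvisioningState string `json:"provisioningState,omitempty"`
        // Conditions surface readiness so that resources can be applied in
        // any order and callers simply wait for convergence.
        Conditions []metav1.Condition `json:"conditions,omitempty"`
    }

    // +kubebuilder:object:root=true
    // +kubebuilder:subresource:status
    type ResourceGroup struct {
        metav1.TypeMeta   `json:",inline"`
        metav1.ObjectMeta `json:"metadata,omitempty"`

        Spec   ResourceGroupSpec   `json:"spec,omitempty"`
        Status ResourceGroupStatus `json:"status,omitempty"`
    }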

Example of a ResourceGroup

Currently

    group := resources.Group{
        Location: to.StringPtr(s.Scope.Location()),
        Tags: converters.TagsToMap(infrav1.Build(infrav1.BuildParams{
            ClusterName: s.Scope.Name(),
            Lifecycle:   infrav1.ResourceLifecycleOwned,
            Name:        to.StringPtr(s.Scope.ResourceGroup()),
            Role:        to.StringPtr(infrav1.CommonRoleTagValue),
            Additional:  s.Scope.AdditionalTags(),
        })),
    }

    // Wait on Azure to be done: synchronous
    _, err := s.Client.CreateOrUpdate(ctx, s.Scope.ResourceGroup(), group)

Azure Custom Resource

    rg := resources.ResourceGroup{
        ObjectMeta: v1.ObjectMeta{
            Namespace: nn.Namespace,
            Name:      nn.Name,
        },
        Spec: resources.ResourceGroupSpec{
            Location: s.Scope.Location(),
            Tags: converters.TagsToMap(infrav1.Build(infrav1.BuildParams{
                ClusterName: s.Scope.Name(),
                Lifecycle:   infrav1.ResourceLifecycleOwned,
                Name:        to.StringPtr(s.Scope.ResourceGroup()),
                Role:        to.StringPtr(infrav1.CommonRoleTagValue),
                Additional:  s.Scope.AdditionalTags(),
            })),
        },
    }

    // PUT to the API server to create the resource, then just proceed
    // and check back for the resource's condition: asynchronous
    err := s.Scope.Client.Create(ctx, &rg)
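What "checking back" might look like on a later reconcile pass, assuming YTBNC surfaces a "Ready" condition (the condition name is an assumption about its status contract; the helper is from k8s.io/apimachinery/pkg/api/meta and ctrl is sigs.k8s.io/controller-runtime):

    // Sketch: on a subsequent reconcile, read status back instead of blocking.
    var observed resources.ResourceGroup
    if err := s.Scope.Client.Get(ctx, nn, &observed); err != nil {
        return ctrl.Result{}, err
    }

    // "Ready" is an assumed condition name for YTBNC's status contract.
    if !meta.IsStatusConditionTrue(observed.Status.Conditions, "Ready") {
        // Not converged yet; requeue and check back later.
        return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
    }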

Contract changes [optional]

None.

Data model changes [optional]

The CAPZ CRDs may want to keep a reference to the underlying Azure custom resources which they own.
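One way to record that ownership is an owner reference on the Azure custom resource, sketched here with controller-runtime's controllerutil; azureCluster and scheme are assumed to be in scope, and both objects are assumed to live in the same namespace.

    // Sketch: mark the Azure custom resource as owned by the CAPZ object so
    // it can be looked up later and garbage-collected with its owner.
    // Assumed import: "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil".
    if err := controllerutil.SetControllerReference(azureCluster, &rg, scheme); err != nil {
        return err
    }
    if err := s.Scope.Client.Create(ctx, &rg); err != nil {
        return err
    }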

/kind proposal

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 6
  • Comments: 22 (20 by maintainers)

Most upvoted comments

One worry is governance of such a project. Having CAPZ take a core / critical dependency on a project outside of K8s doesn’t feel right. As a prototype / optional thing, I think it would be ok, but not as a requirement. What do you think? Am I being too cautious?

I think it largely depends… I don’t think the dependency is really much greater than the one we already have on the various cloud provider SDKs.

That said, I do have concerns with Google’s k8s-config-connector in particular:

  • Not open source and there are no published distribution guidelines/guarantees that would make inclusion in a cluster-api release or downstream user/product consumption safe/allowable
  • No way to address potential bugs found, other than filing opaque issues

Of course it would be great if the project was underneath the k8s or cncf umbrella to help afford some assurances that we could resolve the above issues using the governance mechanisms in place, but I don’t think that is a strict requirement to solving the problem.

@devigned - I'd love to see an open source equivalent to Google's config connector but for Azure. This is something that people would use outside of CAPZ, especially with GitOps. Most of our customers ask for a way to treat their cloud infra in a declarative way, like their k8s resources.

AWS had something similar to config connector: https://github.com/amazon-archives/aws-service-operator that I believe is being resurrected: https://github.com/aws/aws-service-operator-k8s

It might be valuable for this YTBNC to be a separate project similar to: https://github.com/GoogleCloudPlatform/k8s-config-connector

This is moving forward; the proposal is merged 🙌, so closing this out. The larger epic for implementation work is here: #3402

You’ll need to PR github.com/kubernetes/test-infra to add the kind/proposal label for this repo, similar to the config we have in place for cluster-api here: https://github.com/kubernetes/test-infra/blob/2d574ef6978af8e03d0ac08503ad6af03d1830bc/label_sync/labels.yaml#L1282-L1289