crossplane: Way to block a managed resource from being deleted until its dependency is deleted
What problem are you facing?
I have two managed resources where one depends on the other. When I delete the two resources, one will fail to delete if the other is deleted too early. Right now, it looks like the general reconciler handles managed resource deletion independently, which is good. The question is whether it could allow provider owners to customize this behavior somehow, e.g. using a finalizer.
A couple of observations after investigating for a while:
- Some provider implementations take over the reconciliation logic entirely, fully managing the adding/removing of finalizers themselves, e.g. the storage/container controller in provider-azure. But this loses all the benefits provided by the general managed resource reconciler, e.g. its Connect/Observe/Create/Update/Delete lifecycle.
- Inside the provider, it looks like I can handle the finalizers of the managed resource passed in, but the current implementation of the general reconciler assumes there is only one finalizer, `finalizer.managedresource.crossplane.io`, which is added by the reconciler itself. This may lead to confusion even though a multi-finalizer approach is technically feasible, e.g. after the reconciler removes its finalizer from the managed resource it will report "Successfully deleted managed resource", even if a provider-owned finalizer is still blocking the actual deletion (a minimal sketch of the multi-finalizer idea is shown after this list).
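To make that multi-finalizer idea concrete, here is a minimal sketch (not an existing crossplane-runtime API) of a small side controller that keeps a provider-owned finalizer on the resource that must outlive another, and only releases it once the dependent resource is gone. The `OperandRegistry`/`OperandRequest` types, the finalizer name, and the 1:1 naming convention between the two resources are all illustrative assumptions:

```go
// Sketch of a side reconciler that blocks deletion of a resource until a
// dependent resource is gone, using an extra provider-owned finalizer.
package ordering

import (
	"context"

	"k8s.io/apimachinery/pkg/api/errors"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

	v1alpha1 "example.com/provider/apis/v1alpha1" // hypothetical provider API group
)

// Provider-owned finalizer, added alongside finalizer.managedresource.crossplane.io.
const dependencyFinalizer = "dependency.finalizer.example.com"

type DependencyReconciler struct {
	client.Client
}

func (r *DependencyReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// The resource that must be deleted *last* (e.g. an OperandRegistry).
	reg := &v1alpha1.OperandRegistry{}
	if err := r.Get(ctx, req.NamespacedName, reg); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	if reg.GetDeletionTimestamp().IsZero() {
		// Not being deleted: make sure our extra finalizer is present.
		if !controllerutil.ContainsFinalizer(reg, dependencyFinalizer) {
			controllerutil.AddFinalizer(reg, dependencyFinalizer)
			return ctrl.Result{}, r.Update(ctx, reg)
		}
		return ctrl.Result{}, nil
	}

	// Being deleted: only drop our finalizer once the dependent resource
	// (e.g. the OperandRequest using this registry) is gone. A 1:1 naming
	// convention is assumed here purely for illustration.
	dep := &v1alpha1.OperandRequest{}
	err := r.Get(ctx, client.ObjectKey{Namespace: req.Namespace, Name: req.Name}, dep)
	switch {
	case errors.IsNotFound(err):
		controllerutil.RemoveFinalizer(reg, dependencyFinalizer)
		return ctrl.Result{}, r.Update(ctx, reg)
	case err != nil:
		return ctrl.Result{}, err
	default:
		// Dependent still exists; keep blocking deletion and check again later.
		return ctrl.Result{Requeue: true}, nil
	}
}
```

The general managed reconciler would keep working as usual and remove `finalizer.managedresource.crossplane.io` once the external resource is deleted, while the extra finalizer above keeps the object around until the dependent is gone.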
How could Crossplane help solve your problem?
It looks like the scenario itself is fairly common, but maybe I'm too opinionated :-). It would be great if there were a nice way to handle it in Crossplane core or runtime, but if that's not generic enough, maybe it makes sense to keep it inside the particular provider, at least for now, in which case I think provider owners may need some guidance on this from the community.
About this issue
- State: closed
- Created 3 years ago
- Reactions: 10
- Comments: 25 (16 by maintainers)
To add to the consensus: if you delete a resource that created an AWS load balancer (e.g. a release) and an EKS cluster at the same time, and those two resources aren't deleted in the correct order, the load balancer becomes stuck and the VPC can't be deleted until you go into the AWS console and manually delete the LB.
/fresh we still have this problem
/fresh
@valorl I do like the idea that EKS should clean up after itself, but unfortunately this seems to be only the most common/painful instance of the problem. I note, for example, that per https://cloud.google.com/kubernetes-engine/docs/how-to/deleting-a-cluster GKE won't necessarily clean up LBs or volumes when you delete the cluster either.
Regarding the EKS LB issue, I think that’s painful regardless of whether the tool you use has the ability to enforce order/dependencies. It should really be handled by EKS. https://github.com/aws/containers-roadmap/issues/1348
@negz To answer your question: my scenario actually comes from the use of ODLM, which is sort of a workload launch abstraction on top of OLM and can be used to provision operators and operands (the actual workloads) in either a local or a remote cluster, depending on the kubeconfig. To use it, I need to create two resources called `OperandRegistry` and `OperandConfig` for some preparation, then a third resource called `OperandRequest` to request the actual operators and operands that I want to launch. For uninstall in my case, `OperandRegistry`, `OperandConfig`, and `OperandRequest` are all managed resources, which will be deleted independently. But since `OperandRequest` depends on the other two resources, sometimes it fails to be deleted because its dependencies are deleted too early.

Got it, thanks for the clarification. I did not realize that the provider-azure storage controllers implementation is out of date 😃
I can understand your point. I see there's an ongoing discussion at https://github.com/crossplane/crossplane/issues/2072, where people were talking about creation order; that issue still remains open. I think this issue is related, just from the opposite side, i.e. create/install vs. delete/uninstall. I see your suggestion to rely on `ExternalClient` to tolerate eventual consistency. Actually, I've opened an issue against the ODLM team on this. Before that lands, in order to unblock, I am planning to address it at the provider level; a rough sketch of that approach follows below. So it does not impact Crossplane core or runtime, which is good, and it also completes the e2e install/uninstall flow.
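For reference, here is a rough sketch of what that provider-level handling could look like, assuming the `ExternalClient.Delete` signature crossplane-runtime had at the time of this issue (returning only an error); `odlm.Client`, `IsStillInUse`, `DeleteRegistry`, and the `v1alpha1.OperandRegistry` type are hypothetical stand-ins for the provider's own code:

```go
// Sketch of an ExternalClient Delete that tolerates out-of-order deletion by
// refusing to delete while a dependent resource still exists.
package operandregistry

import (
	"context"
	"errors"

	"github.com/crossplane/crossplane-runtime/pkg/resource"

	"example.com/provider-odlm/apis/v1alpha1" // hypothetical API types
	"example.com/provider-odlm/internal/odlm" // hypothetical external client
)

// external implements the managed.ExternalClient methods for OperandRegistry;
// only Delete is shown here.
type external struct {
	odlm odlm.Client
}

func (e *external) Delete(ctx context.Context, mg resource.Managed) error {
	cr, ok := mg.(*v1alpha1.OperandRegistry)
	if !ok {
		return errors.New("managed resource is not an OperandRegistry")
	}

	// If an OperandRequest still references this registry, don't delete it yet.
	// Returning an error makes the managed reconciler record the failure and
	// requeue, so the delete is retried until the dependent is gone.
	inUse, err := e.odlm.IsStillInUse(ctx, cr.GetName())
	if err != nil {
		return err
	}
	if inUse {
		return errors.New("OperandRegistry is still referenced by an OperandRequest, will retry")
	}

	return e.odlm.DeleteRegistry(ctx, cr.GetName())
}
```

Because the managed reconciler keeps retrying `Delete` until it succeeds, deletion of the registry eventually goes through once the dependent `OperandRequest` has been removed, without any change to Crossplane core or runtime.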