kustomize-controller: Checksum label is missing when prune is false

When prune is true, child objects of Kustomizations have these labels:

labels:
  kustomize.toolkit.fluxcd.io/checksum: 21ff4959f5c2052e702115c585ea8cf694c67613
  kustomize.toolkit.fluxcd.io/name: flux-system
  kustomize.toolkit.fluxcd.io/namespace: flux-system

When you set prune to false, the checksum label is dropped:

labels:
  kustomize.toolkit.fluxcd.io/name: flux-system
  kustomize.toolkit.fluxcd.io/namespace: flux-system

Even with garbage collection disabled, the checksum labels are still very useful because they let you identify stale resources. For example, a flux tree view or a web interface like the Flux UI could use them to show which objects are managed and which are not.
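To illustrate how a tool could use these labels, here is a minimal sketch in Python. It assumes the objects have already been fetched from the cluster and parsed into dictionaries, and that the tool knows the checksum of the latest applied revision. The function name `classify` and its return values are hypothetical and are not part of Flux.

```python
CHECKSUM_LABEL = "kustomize.toolkit.fluxcd.io/checksum"
NAME_LABEL = "kustomize.toolkit.fluxcd.io/name"
NAMESPACE_LABEL = "kustomize.toolkit.fluxcd.io/namespace"


def classify(obj, name, namespace, current_checksum):
    """Classify one parsed Kubernetes object relative to a Kustomization.

    Hypothetical helper: returns "unmanaged", "unknown", "current" or "stale".
    """
    labels = (obj.get("metadata") or {}).get("labels") or {}
    if labels.get(NAME_LABEL) != name or labels.get(NAMESPACE_LABEL) != namespace:
        return "unmanaged"  # not owned by this Kustomization
    checksum = labels.get(CHECKSUM_LABEL)
    if checksum is None:
        # This is what happens today with prune set to false:
        # the object is owned, but staleness cannot be determined.
        return "unknown"
    return "current" if checksum == current_checksum else "stale"


def owner_labels(checksum=None):
    """Build the label set for the flux-system Kustomization (test data only)."""
    labels = {NAME_LABEL: "flux-system", NAMESPACE_LABEL: "flux-system"}
    if checksum is not None:
        labels[CHECKSUM_LABEL] = checksum
    return labels
```

With prune set to false, every owned object lands in the "unknown" bucket, which is why a tree view or UI cannot flag stale objects without the checksum.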

I don’t see any reason not to include these checksums. It should be a relatively safe change for existing Kustomizations when kustomize-controller is updated, but it will cause every object to be patched with new labels.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 22 (20 by maintainers)

Most upvoted comments

For users that have GC disabled this is what currently happens:

  • user: creates a Kubernetes Deployment, Kubernetes Service and a Flux Kustomization in Git
  • kustomize-controller: labels the Deployment and Service and applies them on the cluster
  • kustomize-controller: issues event with deployment created, service created
  • user: updates the Kubernetes Deployment in Git
  • kustomize-controller: issues event with deployment changed

If we merge Jonathan’s PR:

  • user: updates the Kubernetes Deployment in Git
  • kustomize-controller: annotates the Deployment and Service and applies them on the cluster
  • kustomize-controller: issues event with deployment changed, service changed

As you can see, with that PR we’ll be sending “bogus” events: the service didn’t change in Git or in the cluster, but kubectl apply reports a change because the annotation that we generate did change.

To resolve this issue, instead of relying on kubectl apply output, we should run kubectl diff first, parse the output, remove all objects where only the checksum annotation changed, and use this list for the events. Once we do that, we can drop the checksum label, and use the annotation for GC as well. This would be a major improvement for those that have GC enabled, as they will receive events only for the objects where there is an actual drift.
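The filtering step could look roughly like the sketch below. It is not the controller's implementation: it compares already-parsed live and desired objects rather than parsing the textual output of kubectl diff, and the annotation key and function names are assumptions made for illustration.

```python
import copy

# Assumed annotation key; the actual key used by the PR is not stated here.
CHECKSUM_ANNOTATION = "kustomize.toolkit.fluxcd.io/checksum"


def strip_checksum(obj):
    """Return a deep copy of obj without the checksum annotation."""
    cleaned = copy.deepcopy(obj)
    annotations = (cleaned.get("metadata") or {}).get("annotations")
    if annotations:
        annotations.pop(CHECKSUM_ANNOTATION, None)
        if not annotations:
            # Drop the now-empty map so it compares equal to "no annotations".
            del cleaned["metadata"]["annotations"]
    return cleaned


def really_changed(live, desired):
    """True if the objects differ in anything other than the checksum."""
    if live is None or desired is None:
        return True  # object is being created or removed
    return strip_checksum(live) != strip_checksum(desired)


def objects_for_events(pairs):
    """pairs: iterable of (identifier, live, desired) tuples.

    Returns the identifiers that should be reported in events.
    """
    return [ident for ident, live, desired in pairs if really_changed(live, desired)]
```

In the scenario above, the Service differs only in its checksum annotation and is filtered out, while the Deployment, whose spec changed in Git, is still reported.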