strimzi-kafka-operator: [Bug]: Deployment of Kafka resource via Flux bypasses health checking
Bug Description
When deploying Kafka through Flux via the CRD backed by the operator, I observe an issue with Flux's healthCheck capability. Flux can be instructed, through kstatus, to wait for a resource to be “Ready”, yet it marks the Flux Kustomization as applied within a few seconds.
Previously, when I deployed Kafka through a CD pipeline, I had a secondary step that ran:
kubectl wait -n kafka --timeout=15m --for=condition=Ready=True Kafka kafka
This ensured that even though my helm upgrade had been applied, the pipeline would wait for Kafka to actually be ready. This worked fine, but with Flux, kstatus passes instantly.
Steps to reproduce
- Deploy strimzi-kafka-operator
- Deploy a Kafka via Flux
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: kafka
  namespace: flux-system
spec:
  dependsOn:
    - name: strimzi-kafka-operator
  interval: 1h
  retryInterval: 10m
  timeout: 20m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/overlays/dev/kafka
  prune: true
  wait: true
  # This healthChecks block is the part that does not behave as expected
  healthChecks:
    - apiVersion: kafka.strimzi.io/v1beta2
      kind: Kafka
      name: kafka
      namespace: kafka
- Observe that the Flux deployment just moves past the kafka Kustomization, even though I expect it to take at least 5 minutes to spin up
Expected behavior
I expected Flux to wait for the Kafka resource to become Ready. However, because of the way kstatus evaluates the readiness of a resource, the healthCheck passed instantly due to a missing observedGeneration.
An example of a CRD that applies a default of -1 can be found here: https://github.com/fluxcd/source-controller/blob/f2a1814aea9f96262e3897c71ff0d97ee29603ab/config/crd/bases/source.toolkit.fluxcd.io_buckets.yaml#L132
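To make the kstatus behaviour concrete, here is a minimal Python sketch of how I understand its generic rules for custom resources. It is a simplified model and not the library's actual code (kstatus is written in Go); the function name compute_status and the sample objects are hypothetical and exist only for illustration. The point it shows is that the generation check only runs when status.observedGeneration is present, so a freshly created resource with no status falls through to "Current".

```python
from typing import Any, Dict


def compute_status(obj: Dict[str, Any]) -> str:
    """Simplified, assumed model of kstatus's generic rules (not the real API)."""
    metadata = obj.get("metadata") or {}
    status = obj.get("status") or {}

    # A resource that is being deleted is never considered ready.
    if metadata.get("deletionTimestamp"):
        return "Terminating"

    # The generation check is skipped entirely when observedGeneration is absent.
    if "observedGeneration" in status:
        if status["observedGeneration"] != metadata.get("generation", 0):
            return "InProgress"

    conditions = status.get("conditions") or []

    # Standard conditions take precedence over a Ready condition.
    for cond in conditions:
        if cond.get("type") == "Reconciling" and cond.get("status") == "True":
            return "InProgress"
        if cond.get("type") == "Stalled" and cond.get("status") == "True":
            return "Failed"

    # Fall back to a Ready condition if the resource exposes one.
    for cond in conditions:
        if cond.get("type") == "Ready":
            return "Current" if cond.get("status") == "True" else "InProgress"

    # Nothing to go on: the resource is assumed to be fully reconciled.
    return "Current"


# Hypothetical Kafka objects at different points in their lifecycle.
just_applied = {"metadata": {"generation": 1}}
defaulted = {"metadata": {"generation": 1}, "status": {"observedGeneration": -1}}
reconciled = {
    "metadata": {"generation": 1},
    "status": {
        "observedGeneration": 1,
        "conditions": [{"type": "Ready", "status": "True"}],
    },
}

print(compute_status(just_applied))  # Current (no status, so the check passes at once)
print(compute_status(defaulted))     # InProgress (-1 never matches a real generation)
print(compute_status(reconciled))    # Current
```

Under this model, any initial status whose observedGeneration differs from metadata.generation (a default of -1, for example) would keep the health check waiting until the operator publishes a status for the current generation.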
Strimzi version
main
Kubernetes version
1.27
Installation method
helm chart
Infrastructure
No response
Configuration files and logs
No response
Additional context
No response
About this issue
- State: closed
- Created 8 months ago
- Comments: 18 (9 by maintainers)
Ok. But what about the Creating status with observedGeneration: 0? Is that fine for Flux? Or does that need to have -1?
In any case, if something changes and needs some change in Strimzi, feel free to reopen this. We want to make things work when possible. We just need to be careful not to fix it for one user and break it for another.
I don’t know. I’m still not sure I fully understand what it means, what changes it will require, how it will work, and what the risk of breaking something else will be. It’s not like you can just edit the CRD YAML.
But we do not create the resource, so we cannot set its status. When you do kubectl apply, you do not set it either. It is IMHO naturally empty.
Sorry, my mistake … it is kubectl edit kafka <cluster-name> --subresource=status
I think there is never any status by default.
I double-checked it … when the Kafka cluster is deployed, we publish this status update:
This would be the place where the -1 might show up, I think. If it helps, it might be feasible to change it and use the -1 here. But it would be great if you could double-check it first. A possible way to do it would be to change things manually?
kubectl edit/status kafka ... to
I think you have to provide more details about what (if anything) Strimzi can do about this. Also, keep in mind that we cannot break things for other users just to make Flux users happier.