argo-cd: Applications from ApplicationSet flip rapidly between "Unknown" and "Synchronised"
Checklist:
- I’ve searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
- I’ve included steps to reproduce the bug.
- I’ve pasted the output of argocd version.
Describe the bug
Since upgrading from ArgoCD 2.8.4 to 2.9.0, our applications that are generated via ApplicationSets flap multiple times a second between “Synchronised” and “Unknown” in the UI. From diffing the generated application as per https://argo-cd.readthedocs.io/en/stable/operator-manual/reconcile/#finding-resources-to-ignore, I can see that the sync status and the repoURL under status -> sync are constantly flapping between “” and the desired value.
The outcome is that argoCD essentially hammers itself with this constant and rapid flipping between states. I’ve included logs from the application controller further down that illustrate this behaviour.
We used argoCD autopilot to generate our applicationsets last year, and I have found that removing ignoreDifferences from the applicationset template spec stops the flapping. I’m not sure if this is expected behaviour, as creating an application directly with ignoreDifferences configured doesn’t seem to do this.
To Reproduce
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "0"
  creationTimestamp: null
  name: cluster-resources
  namespace: argocd
spec:
  generators:
  - git:
      files:
      - path: kubernetes/bootstrap/cluster-resources/*.json
      repoURL: github.com/***
      requeueAfterSeconds: 20
      revision: main
      template:
        metadata: {}
        spec:
          destination: {}
          project: ""
          source:
            repoURL: ""
  syncPolicy:
    preserveResourcesOnDeletion: true
  template:
    metadata:
      labels:
        app.kubernetes.io/managed-by: argocd-autopilot
        app.kubernetes.io/name: cluster-resources-{{name}}
      name: cluster-resources-{{name}}
      namespace: argocd
    spec:
      destination:
        server: "{{server}}"
      # Removing this stops the flapping
      ignoreDifferences:
      - group: argoproj.io
        jsonPointers:
        - /status
        kind: Application
      project: default
      source:
        path: kubernetes/bootstrap/cluster-resources/{{name}}
        repoURL: https://github.com/***
        targetRevision: main
      syncPolicy:
        automated:
          allowEmpty: true
          selfHeal: true
status: {}
Where the cluster resources directory contains a file called “in-cluster.json”:
{"name":"in-cluster","server":"https://kubernetes.default.svc"}
and a folder called “in-cluster” that contains a namespace definition (argocd-ns.yaml):
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    argocd.argoproj.io/sync-options: Prune=false
  creationTimestamp: null
  name: argocd
This isn’t the only applicationset where this is happening, but it was the most straightforward reproduction case for us.
Expected behavior
We expect the applications to remain in “Synchronised” status.
Screenshots
Version
argocd version
argocd: v2.9.0+9cf0c69
BuildDate: 2023-11-06T04:43:50Z
GitCommit: 9cf0c69bbe70393db40e5755e34715f30179ee09
GitTreeState: clean
GoVersion: go1.21.3
Compiler: gc
Platform: linux/amd64
Logs
Application controller. This shows that within a single second it repeatedly logs Updated sync status: -> Synced:
time="2023-11-07T10:42:04Z" level=info msg="Updated sync status: -> Synced" application=cluster-resources-in-cluster dest-namespace= dest-server="https://kubernetes.default.svc" reason=ResourceUpdated type=Normal
time="2023-11-07T10:42:04Z" level=info msg="Update successful" application=argocd/cluster-resources-in-cluster
time="2023-11-07T10:42:04Z" level=debug msg="Requesting app refresh caused by object update" api-version=argoproj.io/v1alpha1 application=argocd/autopilot-bootstrap cluster-name= fields.level=0 kind=Application name=cluster-resources-in-cluster namespace=argocd server="https://kubernetes.default.svc"
time="2023-11-07T10:42:04Z" level=info msg="Reconciliation completed" application=argocd/cluster-resources-in-cluster dedup_ms=0 dest-name= dest-namespace= dest-server="https://kubernetes.default.svc" diff_ms=2 fields.level=3 git_ms=289 health_ms=0 live_ms=0 patch_ms=11 setop_ms=0 settings_ms=0 sync_ms=0 time_ms=316
time="2023-11-07T10:42:04Z" level=info msg="Refreshing app status (controller refresh requested), level (0)" application=argocd/autopilot-bootstrap
time="2023-11-07T10:42:04Z" level=info msg="No status changes. Skipping patch" application=argocd/autopilot-bootstrap
time="2023-11-07T10:42:04Z" level=info msg="Reconciliation completed" application=argocd/autopilot-bootstrap dest-name= dest-namespace=argocd dest-server="https://kubernetes.default.svc" fields.level=0 patch_ms=0 setop_ms=0 time_ms=6
time="2023-11-07T10:42:04Z" level=debug msg="Requesting app refresh caused by object update" api-version=argoproj.io/v1alpha1 application=argocd/autopilot-bootstrap cluster-name= fields.level=0 kind=Application name=cluster-resources-in-cluster namespace=argocd server="https://kubernetes.default.svc"
time="2023-11-07T10:42:04Z" level=info msg="Refreshing app status (spec.source differs), level (3)" application=argocd/cluster-resources-in-cluster
time="2023-11-07T10:42:04Z" level=info msg="Refreshing app status (controller refresh requested), level (0)" application=argocd/autopilot-bootstrap
time="2023-11-07T10:42:04Z" level=info msg="Comparing app state (cluster: https://kubernetes.default.svc, namespace: )" application=argocd/cluster-resources-in-cluster
time="2023-11-07T10:42:04Z" level=debug msg="Generating Manifest for source {https://github.com/*** kubernetes/bootstrap/cluster-resources/in-cluster 2.9-speculative-fix nil nil nil nil } revision 2.9-speculative-fix"
time="2023-11-07T10:42:04Z" level=info msg="No status changes. Skipping patch" application=argocd/autopilot-bootstrap
time="2023-11-07T10:42:04Z" level=info msg="Reconciliation completed" application=argocd/autopilot-bootstrap dest-name= dest-namespace=argocd dest-server="https://kubernetes.default.svc" fields.level=0 patch_ms=0 setop_ms=0 time_ms=7
time="2023-11-07T10:42:05Z" level=info msg="getRepoObjs stats" application=argocd/cluster-resources-in-cluster build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=298 unmarshal_ms=297 version_ms=0
time="2023-11-07T10:42:05Z" level=debug msg="Retrieved live manifests" application=argocd/cluster-resources-in-cluster
time="2023-11-07T10:42:05Z" level=info msg="Skipping auto-sync: application status is Synced" application=argocd/cluster-resources-in-cluster
time="2023-11-07T10:42:05Z" level=info msg="Updated sync status: -> Synced" application=cluster-resources-in-cluster dest-namespace= dest-server="https://kubernetes.default.svc" reason=ResourceUpdated type=Normal
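To put a number on the flapping, the log lines above can be parsed and the "Updated sync status" events counted per timestamp and application. The following is a minimal sketch, not an Argo CD tool: the helper names `parse_logfmt` and `sync_updates_per_second` are made up for this example, and it assumes each line is plain key=value text with optional double quotes, as in the excerpt. Lines containing unbalanced quotes would make `shlex` raise an error.

```python
import shlex
from collections import Counter


def parse_logfmt(line):
    """Parse one key=value log line into a dict (values may be double-quoted)."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            fields[key] = value
    return fields


def sync_updates_per_second(lines):
    """Count 'Updated sync status' events per (timestamp, application).

    The timestamps in these logs have one-second resolution, so each key
    corresponds to one second of controller activity.
    """
    counts = Counter()
    for line in lines:
        fields = parse_logfmt(line)
        if fields.get("msg", "").startswith("Updated sync status"):
            counts[(fields.get("time"), fields.get("application"))] += 1
    return counts


# Three lines copied from the excerpt above.
sample = [
    'time="2023-11-07T10:42:04Z" level=info msg="Updated sync status: -> Synced" application=cluster-resources-in-cluster dest-namespace= dest-server="https://kubernetes.default.svc" reason=ResourceUpdated type=Normal',
    'time="2023-11-07T10:42:04Z" level=info msg="Update successful" application=argocd/cluster-resources-in-cluster',
    'time="2023-11-07T10:42:05Z" level=info msg="Updated sync status: -> Synced" application=cluster-resources-in-cluster dest-namespace= dest-server="https://kubernetes.default.svc" reason=ResourceUpdated type=Normal',
]

for (second, app), n in sorted(sync_updates_per_second(sample).items()):
    print(second, app, n)
```

Run against a full controller log, a healthy application should show these events only occasionally, while a flapping one shows them in every consecutive second.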
Diff of the application CR. It seems to be rapidly switching between:
sync:
  comparedTo:
    destination:
      server: https://kubernetes.default.svc
    ignoreDifferences:
    - group: argoproj.io
      jsonPointers:
      - /status
      kind: Application
    source:
      path: kubernetes/bootstrap/cluster-resources/in-cluster
      repoURL: ""
      targetRevision: 2.9-speculative-fix
  revision: cda88b740ba847be6bb94172834e4b6971099956
  status: ""
and:
sync:
  comparedTo:
    destination:
      server: https://kubernetes.default.svc
    ignoreDifferences:
    - group: argoproj.io
      jsonPointers:
      - /status
      kind: Application
    source:
      path: kubernetes/bootstrap/cluster-resources/in-cluster
      repoURL: https://github.com/***
      targetRevision: 2.9-speculative-fix
  revision: cda88b740ba847be6bb94172834e4b6971099956
  status: Synced
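The two snapshots differ in only two fields, which is easy to miss by eye. As a rough sketch, a small recursive comparison can list the differing paths. The function `diff_paths` and the variable names are hypothetical, and the dictionaries below are hand-transcribed from the two snapshots in this report rather than read from a cluster.

```python
def diff_paths(a, b, prefix=""):
    """Yield (path, a_value, b_value) for every field that differs between two nested dicts."""
    for key in sorted(set(a) | set(b)):
        path = f"{prefix}/{key}"
        left, right = a.get(key), b.get(key)
        if isinstance(left, dict) and isinstance(right, dict):
            yield from diff_paths(left, right, path)
        elif left != right:
            yield path, left, right


def sync_snapshot(repo_url, status):
    """Build the status.sync structure shown above with the two varying fields filled in."""
    return {
        "sync": {
            "comparedTo": {
                "destination": {"server": "https://kubernetes.default.svc"},
                "ignoreDifferences": [
                    {"group": "argoproj.io", "jsonPointers": ["/status"], "kind": "Application"}
                ],
                "source": {
                    "path": "kubernetes/bootstrap/cluster-resources/in-cluster",
                    "repoURL": repo_url,
                    "targetRevision": "2.9-speculative-fix",
                },
            },
            "revision": "cda88b740ba847be6bb94172834e4b6971099956",
            "status": status,
        }
    }


state_a = sync_snapshot("", "")
state_b = sync_snapshot("https://github.com/***", "Synced")

for path, before, after in diff_paths(state_a, state_b):
    print(f"{path}: {before!r} -> {after!r}")
# /sync/comparedTo/source/repoURL: '' -> 'https://github.com/***'
# /sync/status: '' -> 'Synced'
```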
About this issue
- State: closed
- Created 8 months ago
- Reactions: 11
- Comments: 19 (7 by maintainers)
I think we’re seeing this at Intuit, too. Looking into it…
v2.9.1 seems to have fixed this issue for us! Thanks a lot @crenshaw-dev 👍 I also see that the Repo Server is back to normal; I did see some anomalies there that I initially failed to mention.
I can’t reproduce the issue on release-2.9 now that https://github.com/argoproj/argo-cd/pull/16299 is merged. In the interest of time, I’ll skip the deep-dive into why this bug exists and instead cut 2.9.1. If other weirdness appears, we’ll tackle that as it comes. 😃 Thanks everyone for your patience and help! I’ll post here again when 2.9.1 is out.
We’ve rolled this out too and I agree that it seems to be fixed in 2.9.1. Thanks a lot for picking this up so quickly!
This reproduces the issue in 2.9.0:
@crenshaw-dev do we have an ETA on 2.9.1? Just saw it got released 15 minutes ago, thanks a lot!
With the status ignored, the application’s .status.sync.status flicks from Synced to “”. The UI displays the change until the status is restored.
It looks like the applicationset controller is blowing away the application status because it’s ignored, then the application controller restores it. https://github.com/argoproj/argo-cd/pull/14743 is when the applicationset code was introduced but it starts in 2.9.0.
Interestingly enough, there was some refactoring to that code in https://github.com/argoproj/argo-cd/pull/15965 which was put in v2.9.1. I feel the issue will still be there from my glance but maybe worth a test.
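The hypothesis in the comments above (one controller writes the Application back without its ignored status, the other restores it) can be illustrated with a toy model. This is only a sketch of that hypothesis and is not Argo CD code: `appset_write`, `app_controller_write`, and `simulate` are invented names, and the assumption that the written-back object is the copy with ignored fields stripped is exactly the unconfirmed part.

```python
import copy


def remove_pointer(obj, pointer):
    """Return a copy of obj with the field at a JSON-pointer-like path removed (dict keys only)."""
    result = copy.deepcopy(obj)
    parts = [p for p in pointer.split("/") if p]
    if not parts:
        return result
    node = result
    for part in parts[:-1]:
        node = node.get(part)
        if not isinstance(node, dict):
            return result  # path does not exist, nothing to remove
    node.pop(parts[-1], None)
    return result


def appset_write(app, ignore_pointers):
    """Assumed behaviour: the object written back has the ignored fields stripped."""
    for pointer in ignore_pointers:
        app = remove_pointer(app, pointer)
    return app


def app_controller_write(app):
    """The application controller recomputes and restores the sync status."""
    updated = copy.deepcopy(app)
    updated["status"] = {"sync": {"status": "Synced"}}
    return updated


def simulate(app, ignore_pointers, rounds):
    """Alternate the two writers and record the sync status an observer would see."""
    history = []
    for _ in range(rounds):
        app = appset_write(app, ignore_pointers)
        history.append(app.get("status", {}).get("sync", {}).get("status", ""))
        app = app_controller_write(app)
        history.append(app.get("status", {}).get("sync", {}).get("status", ""))
    return history


app = {"spec": {}, "status": {"sync": {"status": "Synced"}}}
print(simulate(app, ["/status"], rounds=2))  # ['', 'Synced', '', 'Synced']
print(simulate(app, [], rounds=2))           # ['Synced', 'Synced', 'Synced', 'Synced']
```

In this model the status only alternates when /status is in the ignored set, which matches the observation that removing ignoreDifferences from the template stops the flapping. It says nothing about where in the real controllers the overwrite happens.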