kubernetes: Controller manager stuck in "the object has been modified; please apply your changes to the latest version and try again" even after restart
What happened: In the controller manager we sometimes observe logs like the one below. This happens for many resource types, such as endpoints and replicasets. We thought it was a stale informer cache, but after bouncing the controller manager we still see the same logs. The only way to mitigate this is to bounce the api-server.
Event(v1.ObjectReference{Kind:"HorizontalPodAutoscaler", Namespace:"cig-prod-apps", Name:"<omitted>", UID:"4593f854-b824-4a9e-8e10-c16d558797b9", APIVersion:"autoscaling/v2beta2", ResourceVersion:"71905040", FieldPath:""}): type: 'Warning' reason: 'FailedUpdateStatus' Operation cannot be fulfilled on horizontalpodautoscalers.autoscaling "<omitted>": the object has been modified; please apply your changes to the latest version and try again
kubectl get hpa returns resources with proper resource versions, though.
What you expected to happen: There seems to be some kind of caching in the api-server? I think it should not return stale information.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
- Kubernetes version (use kubectl version): observed on 1.15, 1.17, 1.18
- Cloud provider or hardware configuration: AKS
- OS (e.g: cat /etc/os-release):
- Kernel (e.g. uname -a):
- Install tools:
- Network plugin and version (if this is a network-related bug):
- Others:
About this issue
- State: closed
- Created 4 years ago
- Reactions: 1
- Comments: 16 (16 by maintainers)
No wait…
71905040 > 71902025
So that seems super strange - it suggests that the watchcache is in front of kube-apiserver. That generally can’t happen, because we update watchcache only based on list/watch from etcd…
Did you do something strange to etcd? Like restoring its state or something?
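To make the comparison above concrete, here is a toy model of the optimistic-concurrency check behind the "object has been modified" error. It is only a sketch and not kube-apiserver code: `Store`, `Cache`, `Conflict`, and `update_with_retry` are hypothetical names, and it assumes 71902025 is the version the authoritative store holds while 71905040 is what the cache serves, which the thread implies but does not state outright. It also compares resource versions as integers, as the comment above does, although clients are normally expected to treat them as opaque.

```python
class Conflict(Exception):
    """Raised when an update carries a resourceVersion that is not current."""


class Store:
    """Toy stand-in for the authoritative store (etcd in this thread)."""

    def __init__(self, resource_version):
        self.resource_version = resource_version

    def update(self, sent_version):
        # Optimistic concurrency: accept the write only if the caller
        # saw the current version; otherwise reject it.
        if sent_version != self.resource_version:
            raise Conflict("the object has been modified")
        self.resource_version += 1
        return self.resource_version


class Cache:
    """Toy stand-in for a cache that serves reads (the watchcache)."""

    def __init__(self, resource_version):
        self.resource_version = resource_version

    def get(self):
        return self.resource_version


def update_with_retry(store, cache, attempts=3):
    """Read from the cache, write to the store, count conflicts."""
    conflicts = 0
    for _ in range(attempts):
        try:
            store.update(cache.get())
            return conflicts
        except Conflict:
            conflicts += 1
    return conflicts


# Assumed values: the store holds 71902025, the cache serves 71905040.
store = Store(71902025)
diverged = Cache(71905040)

# Every attempt conflicts, and a "restarted" client reading the same
# cache would behave identically.
stuck = update_with_retry(store, diverged)

# A cache rebuilt from the store's current state succeeds immediately.
fresh = update_with_retry(store, Cache(store.resource_version))
```

In this model, restarting the client changes nothing because the divergence lives in the cache it reads from. Only rebuilding that cache clears the conflicts, which would be consistent with bouncing the api-server being the only mitigation reported.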