kubernetes: Controller manager stuck in "the object has been modified; please apply your changes to the latest version and try again" even after restart

What happened: In the controller manager we sometimes observe logs like the one below. This happens for many resource types, such as endpoints and replicasets. We thought it was a stale informer cache, but after bouncing the controller manager we still see the same logs. The only way to mitigate this is to bounce the api-server.

Event(v1.ObjectReference{Kind:"HorizontalPodAutoscaler", Namespace:"cig-prod-apps", Name:"<omitted>", UID:"4593f854-b824-4a9e-8e10-c16d558797b9", APIVersion:"autoscaling/v2beta2", ResourceVersion:"71905040", FieldPath:""}): type: 'Warning' reason: 'FailedUpdateStatus' Operation cannot be fulfilled on horizontalpodautoscalers.autoscaling "<omitted>": the object has been modified; please apply your changes to the latest version and try again

kubectl get hpa returns resources with proper resource versions, though.
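For readers unfamiliar with this error, here is a minimal sketch of the optimistic-concurrency check behind it. It is a toy model, not Kubernetes code: `ToyStore`, `Conflict`, and `update_with_retry` are hypothetical names invented for illustration. It only shows why re-reading and retrying normally clears the conflict, and why a writer that keeps reading the same out-of-date version stays stuck.

```python
class Conflict(Exception):
    """Raised when an update carries an out-of-date resource version."""


class ToyStore:
    """Hypothetical in-memory stand-in for versioned object storage."""

    def __init__(self):
        self._objects = {}  # name -> (resource_version, status)
        self._next_rv = 1

    def _write(self, name, status):
        rv = self._next_rv
        self._next_rv += 1
        self._objects[name] = (rv, status)
        return rv

    def create(self, name, status):
        return self._write(name, status)

    def get(self, name):
        return self._objects[name]

    def update(self, name, expected_rv, status):
        current_rv, _ = self._objects[name]
        if expected_rv != current_rv:
            # Same wording as the error in the log above.
            raise Conflict("the object has been modified")
        return self._write(name, status)


def update_with_retry(store, read, name, status, attempts=3):
    """Re-read via `read` and retry on conflict; return new version or None."""
    for _ in range(attempts):
        rv, _ = read(name)
        try:
            return store.update(name, rv, status)
        except Conflict:
            continue
    return None


store = ToyStore()
stale_rv = store.create("hpa", "replicas=1")   # version 1
store.update("hpa", stale_rv, "replicas=2")    # version 2

# A writer that reads the latest version succeeds on the first try.
fresh = update_with_retry(store, store.get, "hpa", "replicas=3")

# A writer whose reads always return the old version never succeeds.
stuck = update_with_retry(
    store, lambda name: (stale_rv, "replicas=1"), "hpa", "replicas=4"
)
```

In this model the second writer fails on every attempt, which matches the symptom reported here only if something between the controller and storage keeps serving a version that storage does not accept.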

What you expected to happen: There seems to be some kind of caching in api-server? I think it should not return stale information.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

Kubernetes version (use kubectl version): observed on 1.15, 1.17, 1.18
Cloud provider or hardware configuration: AKS
OS (e.g: cat /etc/os-release):
Kernel (e.g. uname -a):
Install tools:
Network plugin and version (if this is a network-related bug):
Others:

About this issue

State: closed
Created 4 years ago
Reactions: 1
Comments: 16 (16 by maintainers)

Most upvoted comments

No wait…

71905040 > 71902025

So that seems super strange - it suggests that the watchcache is in front of kube-apiserver. That generally can’t happen, because we update watchcache only based on list/watch from etcd…

Did you do something strange to etcd? Like restore its state or something?

wojtek-t on Oct 30, 2020
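The comparison in the comment above can be sketched as a small helper. This is a debugging heuristic only: `compare_versions` is a hypothetical function, and it assumes resource versions are numeric strings, as they appear in this issue. Clients are generally told to treat resource versions as opaque, so this is not something to rely on in controller code. The first argument is the version from the event in the report; treating the second number as the version held by storage is an assumption, since this thread does not say where it came from.

```python
def compare_versions(cached_rv: str, stored_rv: str) -> str:
    """Classify a cached version against a stored one (numeric heuristic)."""
    cached, stored = int(cached_rv), int(stored_rv)
    if cached > stored:
        return "cache ahead of storage"
    if cached < stored:
        return "cache behind storage"
    return "in sync"


# Values taken from the comment above; which side each belongs to is assumed.
result = compare_versions("71905040", "71902025")
```

Under that assumption the helper reports the cache as ahead of storage, which is the situation the comment describes as something that generally cannot happen.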