kubernetes: FailedToUpdateEndpointSlices Error updating Endpoint Slices for Service

What happened: Hi folks, after every deployment we see the event below for about an hour. It seems to be harmless, but we are wondering whether this is a bug in v1.17.3. Output of kubectl describe svc my-svc:

Events:
  Type     Reason                        Age   From                       Message
  ----     ------                        ----  ----                       -------
  Warning  FailedToUpdateEndpointSlices  35m   endpoint-slice-controller  Error updating Endpoint Slices for Service my-svc/my-app: Error updating my-app-h7q6v EndpointSlice for Service my-svc/my-app: Operation cannot be fulfilled on endpointslices.discovery.k8s.io "my-app-h7q6v": the object has been modified; please apply your changes to the latest version and try again

What you expected to happen: Events: <none>

How to reproduce it (as minimally and precisely as possible): kubectl rollout restart deployment my-deploy
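The transient events can then be observed during the rollout with kubectl get events --field-selector reason=FailedToUpdateEndpointSlices --watch (assuming the reason string from the event above).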

Anything else we need to know?:

Environment: Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"archive", BuildDate:"2020-03-20T16:41:14Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 6
  • Comments: 29 (16 by maintainers)

Most upvoted comments

Hey @jijojv, thanks for reporting this! This is not actually anything to worry about and I think the best solution will be for us to stop publishing that event if the error is related to an out of date cache like this. Due to the nature of the controller reacting to changes in Services and attempting to update related EndpointSlices, it can run into problems if the locally cached copy of EndpointSlices it has is out of date. It will naturally retry and resolve the issue when the cache updates. I’ll work on a fix here to lower the logging and see if there are some ways to reduce the probability of this happening.
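
For context, the message in the event ("the object has been modified; please apply your changes to the latest version and try again") is the API server's standard optimistic-concurrency conflict: the update carried a stale resourceVersion, so it was rejected and has to be retried against the latest copy. Below is a minimal sketch of that refetch-and-retry pattern using client-go's retry helper, with the namespace and EndpointSlice name taken from the event above; the kubeconfig loading is assumed and the label mutation is purely illustrative.

  package main

  import (
      "context"

      metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
      "k8s.io/client-go/kubernetes"
      "k8s.io/client-go/tools/clientcmd"
      "k8s.io/client-go/util/retry"
  )

  func main() {
      // Assumes a reachable cluster via the default kubeconfig.
      config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
      if err != nil {
          panic(err)
      }
      client, err := kubernetes.NewForConfig(config)
      if err != nil {
          panic(err)
      }

      // RetryOnConflict re-runs the closure whenever Update fails with a
      // 409 Conflict, refetching the latest resourceVersion first; this is
      // the same resolution the endpoint-slice controller performs on its
      // next sync once its cache catches up.
      err = retry.RetryOnConflict(retry.DefaultRetry, func() error {
          slice, err := client.DiscoveryV1().EndpointSlices("my-svc").
              Get(context.TODO(), "my-app-h7q6v", metav1.GetOptions{})
          if err != nil {
              return err
          }
          if slice.Labels == nil {
              slice.Labels = map[string]string{}
          }
          slice.Labels["example.com/touched"] = "true" // illustrative change only
          _, err = client.DiscoveryV1().EndpointSlices("my-svc").
              Update(context.TODO(), slice, metav1.UpdateOptions{})
          return err
      })
      if err != nil {
          panic(err)
      }
  }

As the comment above explains, the controller already retries and converges on its own; the later patch simply stops surfacing the transient conflict as a warning event.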

/remove-triage unresolved

I’m currently facing the same error, but not after every deployment: it suddenly shows up hours after a deployment has finished. Some pods emit DNS resolution errors within 10 seconds after the FailedToUpdateEndpointSlices event is emitted. Example log:

caused by: Post "https://sts.ap-southeast-1.amazonaws.com/": dial tcp: lookup sts.ap-southeast-1.amazonaws.com: i/o timeout

Is this related, or is it a different issue? This is on EKS v1.23.

I am still getting this error.

Hey @ltagliamonte-dd, unfortunately we can no longer patch v1.18, so the mitigation for this only made it back as far as v1.19.

@drawn4427 what version of Kubernetes are you using? For reference, the oldest version of Kubernetes that got this patch was v1.19.9.