kubernetes: Service endpoints status is wrong / not updated
What happened: We had a major network issue on our cluster where nodes were not able to reach each other or the master (API server, scheduler). Service endpoints were cleared, but when the network came back online some service endpoints remained blank even though pods for those services were running. I would expect the controller to find these pods and update the service endpoints accordingly, but that did not happen; our workaround was to restart the pods, after which the service endpoints were updated (a sketch of that restart is shown below). We are running 1.15.0.
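For reference, the restart workaround was roughly the following (the deployment name is inferred from the pod name shown further down, so treat these commands as an example rather than an exact transcript):

kubectl rollout restart deployment/islanding-redisha-slave -n ee

or, equivalently, delete the affected pod and let the ReplicaSet recreate it:

kubectl delete pod islanding-redisha-slave-869b4c64c9-7ng49 -n ee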
What you expected to happen: Service endpoints are repopulated with the IPs of the pods matching the selector when a node comes back online after being unreachable.
Anything else we need to know?:
kubectl describe ep islanding-redisha-slave -n ee
Name: islanding-redisha-slave
Namespace: ee
Labels: app=islanding-redisha
chart=islanding-redisha-3.6.0
heritage=Tiller
release=islanding-redisha
Annotations: endpoints.kubernetes.io/last-change-trigger-time: 2019-09-09T11:08:28Z
Subsets:
Addresses: <none>
NotReadyAddresses: 10.233.26.26
Ports:
Name Port Protocol
---- ---- --------
redis 6379 TCP
Events: <none>
kubectl get po -n ee -owide|grep slave
islanding-redisha-slave-869b4c64c9-7ng49 1/1 Running 0 4d15h 10.233.26.26 dzr-k8s-10 <none> <none>
The address is listed under NotReadyAddresses while the pod is running fine. I still have that pod and service in this state for investigation if you need logs. Thanks.
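One more data point that may help: the endpoints controller parks an address under NotReadyAddresses whenever the pod's Ready condition is False, so inspecting the pod conditions directly should show where the stale state lives. A sketch (what I would expect to see, not a capture from the cluster):

kubectl get pod islanding-redisha-slave-869b4c64c9-7ng49 -n ee \
  -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\n"}{end}'

Here Ready should come back as False even though the container is running, which is exactly why the address stays in NotReadyAddresses.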
Environment:
- Kubernetes version (use kubectl version): 1.15.0
- Cloud provider or hardware configuration: bare metal
- OS (e.g.: cat /etc/os-release): Debian stretch
- Kernel (e.g. uname -a): Linux 4.9.0-7-amd64 #1 SMP Debian 4.9.110-1 (2018-07-05) x86_64 GNU/Linux
- Install tools:
- Network plugin and version (if this is a network-related bug): flannel 0.11
- Others:
About this issue
- State: closed
- Created 5 years ago
- Reactions: 7
- Comments: 25 (17 by maintainers)
We just hit this because of a short period of network instability and it took down a zone. I think it's related to the issue mentioned in https://github.com/kubernetes/kubernetes/pull/17741#issuecomment-161129024, when the feature to mark pods as not ready once the node is no longer ready was introduced.
Basically, the kubelet expects pod conditions to be managed by itself, but the NodeController in the control plane marks all pods on a node as not ready once the node times out (by default after 40s). If the network is down long enough, all pods get rescheduled and their conditions no longer matter. But if it is not, the conditions stay broken: the kubelet would need to resync the state, yet it doesn't know about the external change. This was supposedly fixed in https://github.com/kubernetes/kubernetes/pull/18410 back in '16, but I suspect that fix was not complete.
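To make the mismatch visible (using the pod and node from the original report as an example; this is a sketch of what I would expect, not actual output): once the network is back, the node's Ready condition recovers, while the pod's Ready condition, which was flipped externally by the control plane, keeps the stale value and transition time from the outage. The 40s mentioned above corresponds to kube-controller-manager's --node-monitor-grace-period default.

kubectl get node dzr-k8s-10 -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
# expected: True again once connectivity is restored

kubectl get pod islanding-redisha-slave-869b4c64c9-7ng49 -n ee \
  -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}{" "}{.status.conditions[?(@.type=="Ready")].lastTransitionTime}'
# expected: False, with a lastTransitionTime stuck at the time of the outage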
It’s very hard to follow the path that Pod updates take in kubelet, so for the time being I had to stop debugging there.
There is https://github.com/kubernetes/kubernetes/pull/83455 as a possible solution, but I think resyncing just because the last state update is too old is a bit ugly. IMO it would be better to have the kubelet use its watcher on its own pods to be notified about state changes, and have the reconciliation evaluate the difference between the watcher cache and the expected state instead of using its internal view (which doesn't appear to receive the state update from the control plane). This would incur minimal overhead and react quickly.