kubernetes: Timeouts when draining a node while using external-traffic: OnlyLocal annotation on loadbalancer

/kind bug

What happened:

While trying to achieve zero downtime, the current practice seems to be adding a preStop sleep hook (see https://github.com/kubernetes/ingress/issues/322 & https://github.com/kubernetes/kubernetes/issues/43576) to prevent pods from being terminated before their endpoints have been removed. This works well, except when it is combined with a load balancer service carrying the service.beta.kubernetes.io/external-traffic: OnlyLocal annotation.
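For reference, the hook in question looks roughly like this (a minimal sketch; the container name, image, and the 15-second value are placeholders to tune to your endpoint-propagation latency):

```yaml
# Fragment of a Deployment pod template: delay SIGTERM so the endpoint
# has time to be removed from rotation before the pod stops serving.
spec:
  containers:
    - name: backend            # hypothetical container name
      image: example/backend   # hypothetical image
      lifecycle:
        preStop:
          exec:
            command: ["sleep", "15"]  # arbitrary delay; tune to your setup
```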

What you expected to happen:

Doing a rolling update of the ingress controller or draining a node should not cause timeouts.

How to reproduce it (as minimally and precisely as possible):

A minimal setup is something like an ingress-controller deployment and a backend deployment, each with at least 2 replicas and a preStop hook executing a short sleep; PodDisruptionBudgets with minAvailable: 1; and finally a LoadBalancer service (sketched below).
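A sketch of one of the PodDisruptionBudgets (names and labels are placeholders; on v1.6 PDBs live in policy/v1beta1):

```yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: ingress-controller-pdb
spec:
  minAvailable: 1              # keep at least one replica up during a drain
  selector:
    matchLabels:
      app: ingress-controller  # hypothetical label on the controller pods
```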

Run some form of stress test while you drain a node that hosts an ingress-controller pod; there shouldn't be any disruptions (see the commands below).
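Something along these lines, with hypothetical names (any HTTP load generator works; hey is just an example):

```sh
# Terminal 1: sustained load against the load balancer for two minutes
hey -z 120s http://<loadbalancer-ip>/

# Terminal 2: drain a node that hosts an ingress-controller pod
kubectl drain <node-name> --ignore-daemonsets --delete-local-data
```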

Now add the OnlyLocal annotation (sketched below) and repeat the test: timeouts occur.
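The annotated service, sketched (names are placeholders; from v1.7 onward the annotation became the spec.externalTrafficPolicy: Local field):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ingress-lb                # hypothetical service name
  annotations:
    # Preserve the client source IP by routing only to local endpoints
    service.beta.kubernetes.io/external-traffic: OnlyLocal
spec:
  type: LoadBalancer
  selector:
    app: ingress-controller      # hypothetical label
  ports:
    - port: 80
      targetPort: 80
```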

Environment:

  • Kubernetes version (use kubectl version): v1.6.4
  • Cloud provider or hardware configuration: Azure
  • OS (e.g. from /etc/os-release): CoreOS 1465.6.0
  • Kernel (e.g. uname -a): 4.12.7-coreos

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 9
  • Comments: 15 (6 by maintainers)

Most upvoted comments

We are seeing the same issue. Based on our debugging, the sequence looks like this:

  1. The old pod receives a Delete request.
  2. The pod enters the Terminating state and the process receives SIGTERM. We catch TERM in our application and start reporting Not Healthy on the configured healthcheck endpoint, but the application is still capable of serving traffic.
  3. Around the same time as step 2, kube-proxy removes all routing for the service from the node. From this point on, any traffic the external load balancer sends to this node is silently dropped (see the probe sketch after this list).
  4. Some time later, the external load balancer realises that the node is no longer healthy and removes it from circulation.
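The window between steps 3 and 4 can be watched directly: for OnlyLocal services, kube-proxy serves a per-node health check on a dedicated node port, which the cloud LB probes and which starts failing once the last local endpoint is gone. A sketch, assuming the v1.6 beta annotation holds the allocated port and that /healthz is the probe path (node IP and port are placeholders):

```sh
# Read the health check node port allocated for the OnlyLocal service
kubectl get svc ingress-lb -o \
  jsonpath='{.metadata.annotations.service\.beta\.kubernetes\.io/healthcheck-nodeport}'

# Poll it on the node being drained: 200 while local endpoints exist,
# 503 once they are gone, which is the signal the LB acts on in step 4.
while true; do
  curl -s -o /dev/null -w '%{http_code}\n' http://<node-ip>:<healthcheck-port>/healthz
  sleep 1
done
```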

A nice fix for us would be the ability to configure the delay between TERM and kube-proxy removing its routing rules. After TERM, the LB would then have enough time to realise that the node is not healthy before kube-proxy starts blackholing traffic. @thockin, I saw you were in a number of discussions around non-disruptive rollouts; I was wondering what your opinion is on this.

Still a valid issue. /remove-lifecycle rotten

Any news on this?