kops: DrainAndValidate rolling-update hangs if pods won't evict

I have noticed a few instances where, if a pod is stuck in the ContainerCreating state (or some other state) and never reaches the Evicted state, kops hangs forever waiting for it during a rolling-update.

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 2
  • Comments: 33 (20 by maintainers)

Most upvoted comments

I had the same issue here, also with nginx-ingress, but the cause was revealed by adding verbosity to the rolling update: kops rolling-update cluster --yes -v 10

I then saw:

I0622 10:20:46.881051   15660 request.go:873] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Cannot evict pod as it would violate the pod's disruption budget.","reason":"TooManyRequests","details":{"causes":[{"reason":"DisruptionBudget","message":"The disruption budget qa-nginx-ingress-controller needs 1 healthy pods and has 1 currently"}]},"code":429}

So really kops was doing the right thing, just not being very chatty about it. I just had to run more than one replica and decrease minAvailable in the disruption budget, and the rolling-update carried on as soon as the additional pod was healthy.
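For reference, this is roughly what that setup looks like as raw manifests (the names, labels, and image below are made up for illustration): a Deployment with two replicas plus a PodDisruptionBudget with minAvailable: 1, so one controller pod is always free to be evicted during a drain while the other keeps serving.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: qa-nginx-ingress-controller
spec:
  replicas: 2                      # more than one replica...
  selector:
    matchLabels:
      app: nginx-ingress
  template:
    metadata:
      labels:
        app: nginx-ingress
    spec:
      containers:
        - name: nginx-ingress-controller
          image: example.registry/nginx-ingress-controller:0.0.0   # placeholder image
---
apiVersion: policy/v1beta1         # policy/v1 on newer clusters
kind: PodDisruptionBudget
metadata:
  name: qa-nginx-ingress-controller
spec:
  minAvailable: 1                  # ...so the budget still leaves one pod free to evict
  selector:
    matchLabels:
      app: nginx-ingress

With minAvailable lower than the replica count, the eviction API can make progress and kops no longer appears to hang.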

I agree with prioritizing this; I spent some time wondering why my node replacement was stuck. Someone in Slack said it was likely waiting for pods with disruption budgets, and that was the case.

Same issue here; this is the second time kops rolling-update --yes has waited on the drain until I manually killed the nginx-ingress default-backend pod.

I’ve noticed similar behaviour; I’m using nginx-ingress deployed with an HPA.

I had to manually kill all nginx-ingress-related pods, and additionally I had to kill the kube-flannel pod on the drained node.

Having thought about this a little more, one of the gotchas of that chart (and many Helm charts) is that the ReplicaCount defaults to 1 and the PDB MinAvailable also defaults to 1. This is perfectly reasonable, but it means that it is impossible to evict the pod.

I think (but haven’t tested) that it would be better to set minAvailable to 0 when ReplicaCount is 1, which I believe is equivalent to not having a PDB at all and makes more sense with a single replica.

EDIT: Ah, I see someone has already brought this up: https://github.com/helm/charts/issues/7127
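Something along these lines in the chart template would do it (a hypothetical sketch, not the actual chart code — the template name, values keys, and labels are assumptions): fall back to minAvailable: 0 when only a single replica is requested, which is effectively no budget at all.

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: {{ .Release.Name }}-nginx-ingress-controller
spec:
  # with a single replica a budget of 1 can never be satisfied during a drain,
  # so fall back to 0, which behaves the same as having no PDB
  minAvailable: {{ if gt (int .Values.controller.replicaCount) 1 }}{{ .Values.controller.minAvailable }}{{ else }}0{{ end }}
  selector:
    matchLabels:
      app: nginx-ingress
      release: {{ .Release.Name }}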

@inodb That helm chart has what you need:

  • Either set the replica count with this value: https://github.com/helm/charts/blob/master/stable/nginx-ingress/templates/controller-deployment.yaml#L13
  • Or, if you want to use HPA, set it with these values: https://github.com/helm/charts/blob/master/stable/nginx-ingress/templates/controller-hpa.yaml#L18-L19

And then make sure your PDB is set to a value appropriately lower than the Replica count: https://github.com/helm/charts/blob/master/stable/nginx-ingress/templates/controller-poddisruptionbudget.yaml#L17
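For example, values along these lines (assuming the chart exposes these keys under controller — double-check the exact key names against the values.yaml of your chart version):

controller:
  replicaCount: 2        # static replica count for the controller Deployment
  autoscaling:           # HPA settings; key names may differ between chart versions
    enabled: true
    minReplicas: 2
    maxReplicas: 5
  minAvailable: 1        # PDB threshold, kept below the (minimum) replica count

With the PDB threshold below the replica count, the drain triggered by kops rolling-update cluster --yes can evict one controller pod at a time instead of waiting forever.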

It would be very helpful if kops could log when it is waiting for pods with disruption budgets.

It would be useful to have a flag that ignores PDBs after a certain amount of time. This will trip up anyone who has a PDB minAvailable of 1 with a replica count of 1: the rolling-update will just sit for a long time waiting for something that will never come.