ingress-nginx: [nginx] Rejected requests during update of ingress deployment
We are using nginx-ingress and hope to achieve zero-downtime deployments. This means that nginx itself should also be upgraded without any downtime.
Currently we have specified some things in our deployment to ensure that there is always an nginx pod running.
```yaml
spec:
  ...
  replicas: 2
  minReadySeconds: 10
  template:
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: nginx-ingress
        ...
        readinessProbe:
          httpGet:
            path: /healthz
            port: 10254
            scheme: HTTP
        livenessProbe:
          httpGet:
            path: /healthz
            port: 10254
            scheme: HTTP
          initialDelaySeconds: 10
          timeoutSeconds: 1
```
Kubernetes now ensures that there is always at least one nginx-ingress pod running.
But when using tools like siege or jmeter, we still see a small window (about half a second) where requests get rejected:
```
Date & Time,         Trans, Elap Time, Data Trans, Resp Time, Trans Rate, Throughput, Concurrent, OKAY, Failed
2017-02-22 15:06:36,  1178,     45.18,          0,      0.10,      26.07,       0.00,       2.52, 1178,      2
```
Is this caused by the ingress controller, or do we have something misconfigured in our Kubernetes cluster?
About this issue
- State: closed
- Created 7 years ago
- Comments: 31 (13 by maintainers)
So here is a final update for everyone who is wondering how to achieve zero-downtime deployments with the nginx ingress controller.
You have to delay the termination process of a pod for a significant amount of time.
In the worst case the nginx configuration is only updated every 10 seconds. This means it can take up to ten seconds until a terminating pod is removed from the upstreams in nginx.
As mentioned above, this is pretty easy to achieve: simply sleep for 15 seconds in a preStop hook:
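(The snippet itself didn't make it into this archive; a minimal sketch of such a hook, assuming the container image ships a shell with `sleep`, would be:)

```yaml
# Inside the nginx-ingress container spec of the Deployment.
# Delays SIGTERM by 15s so nginx drops the terminating pod from its
# upstreams before the pod stops accepting traffic.
lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "sleep 15"]
```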
Additionally, your backend has to shut down gracefully. This means it must process all in-flight requests and close keep-alive connections correctly (nginx, Apache httpd, and Tomcat, for example, handle this properly out of the box).
@metral it’s either the preStop sleep hack or a termination-delaying SIGTERM handler built into the containers in question.
And yes, it applies not only to Ingress controllers but to all Pods in general that need to make sure requests are drained before completing the shutdown procedure. Besides the rolling-upgrade use case, this might also be required due to other shutdown-inducing events, including auto-scaling, pod evictions, and node draining.
Hi guys, thx for having this discussion. Here's my setup, described below (sorry for not strictly sticking to ingress, although this issue is about it).
With other attempts, I always had ~500 requests failing (out of 200k; minimal, but still a UX impact in a real-world scenario).
My assumption (which differs from the docs): due to the distributed nature of the involved k8s components (kube-proxy, service endpoints, controller manager, API server, etcd, etc.), I'm hitting async issues that cause the draining "flow" to break. If you could try to replicate this with an HTTP hammer, that would be great.
My setup might be different because I directly expose a Go http.Server with Shutdown instead of using nginx.
[UPDATE] I just played with the (pre-stop/ready) handlers as well as the timing (at least 10s) and ran several tests (rolling update to a new image, scale out, scale in, rollback). As you already said, it works in all scenarios (hey with 200k requests, 0% loss) by just delaying the preStop hook (10s in my case, to be safe) without having to fail the readiness probe. A sample load-test invocation is sketched below.
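For reference, a run of that shape with rakyll/hey would look like the following (the URL is a placeholder for the service under test, not from the original comment):

```
# fire 200k requests at the service while the rolling update is in progress
hey -n 200000 -c 50 http://my-service.example.com/
```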
@timoreimann I was also a little bit confused by the various pod states during my tests. But the docs are right, and this is the flow:
1. The pod is marked as Terminating; in parallel, the endpoint controller removes it from the endpoints of its services (this happens asynchronously, which is exactly why the delay matters).
2. The preStop hook runs and delays the next step while it does.
3. SIGTERM is sent to the containers.
4. If the containers are still running after terminationGracePeriodSeconds, they are killed with SIGKILL.
@caseylucas In my demo lab, I had to handle the delay (10s) in the handler since I'm running `FROM scratch` 😃
Thx again for your feedback!
@embano1 @timoreimann Back then, when I implemented zero downtime in our deployments, I aimed for a minimal solution. The "sleep 15" preStop hook is the most minimalistic solution I found. When a pod is being terminated it is flagged as "Terminating". This flag leads to the removal of the endpoint in the ingress controller. So there is really no readiness probe required.
In my opinion the readiness probe should be used to indicate if the application is ready to serve requests or if it temporarily can’t serve new requests.
@embano1 appreciate the in-depth double-check. 👏 I’ll go out on a limb and say this could be one of the most detailed and recent verifications we have on the subject as of today. 🎉
The thing I was kinda hoping Kubernetes would provide for me, though, was that you could have a single container inside a pod responsible for postponing the TERM signal through a pre-stop hook for all other containers inside the pod. That way, you wouldn't need to insert the delay/sleep in the primary application container but could move the responsibility to an auxiliary sidecar, making things further decoupled. I ran a few tests myself to double-check, and unfortunately it really doesn't seem to be possible.
For containers coming with some minimalistic shell environment, this might not be a biggie as you can always do a `sleep <seconds>`. `FROM scratch` containers (i.e., most Go images) don't have this (rather undesirable) luxury, however.

@simonklb Great to hear that others are also using the sleep solution successfully. 😃
@embano1 agree with you that the docs are misleading: the part you are citing should be focused on containers instead of pods. This misalignment is exactly what sparked my hope yesterday for a pod-centric pre-stop handler, only to be crushed by reality. 😄
I’ll try to get around to filing a PR that makes the docs more specific.
Thanks! 👏
@aledbf can we mention the info provided by @foxylion somewhere in the docs?
The current Nginx Ingress Controller docs have the section "Why endpoints and not services", but it's not obvious (especially for Kubernetes noobs) that this leads to rolling restarts of the Pods not working as expected out of the box.
Okay, I found a solution for this particular problem. Delaying the termination of the controller pod for a second will not result in any lost requests. But I think this may be a general thing in Kubernetes. If someone wants to try this too, add this to the deployment configuration:
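(The original snippet was lost in this archive; a minimal reconstruction, using the same preStop pattern shown earlier but with the one-second delay described above, would be:)

```yaml
spec:
  template:
    spec:
      containers:
      - name: nginx-ingress
        lifecycle:
          preStop:
            exec:
              # give nginx time to drop this pod from its upstreams
              command: ["/bin/sh", "-c", "sleep 1"]
```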
@philipbjorge i implemented this (also in go) without the sidecar just by using an http pre-stop hook and an http readiness probe. of course, the server process must handle SIGTERM gracefully so it does not drop connections. easy these days with http.Server.Shutdown.
this combination will definitely work (if you correctly set the timings for pre-stop and probes) for scale-ins (scaling down the replicaset) as well as rolling updates (even with maxUnavailable != 0).
the flow goes like this:
1. kubernetes calls the http pre-stop hook, which makes the server start failing its readiness probe
2. the endpoint is removed from the service, so no new requests are routed to the pod
3. once the pre-stop hook returns, kubernetes sends SIGTERM
4. the server calls Shutdown, drains all in-flight requests, and exits
basic code example below:
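(The example itself didn't survive in this archive; below is a minimal sketch of the combination described above. The endpoint paths, port, and timings are illustrative assumptions, not values from the original comment.)

```go
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"
	"time"
)

func main() {
	// ready flips to 0 when the pre-stop hook fires, failing the readiness probe.
	var ready int32 = 1

	mux := http.NewServeMux()

	// Readiness probe endpoint: 200 while serving, 503 once shutdown has begun.
	mux.HandleFunc("/ready", func(w http.ResponseWriter, r *http.Request) {
		if atomic.LoadInt32(&ready) == 1 {
			w.WriteHeader(http.StatusOK)
			return
		}
		w.WriteHeader(http.StatusServiceUnavailable)
	})

	// HTTP pre-stop hook: mark the server as not ready so the endpoint
	// gets removed from the service before SIGTERM arrives.
	mux.HandleFunc("/prestop", func(w http.ResponseWriter, r *http.Request) {
		atomic.StoreInt32(&ready, 0)
		w.WriteHeader(http.StatusOK)
	})

	// The actual application endpoint.
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("hello\n"))
	})

	srv := &http.Server{Addr: ":8080", Handler: mux}

	go func() {
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatalf("listen: %v", err)
		}
	}()

	// Block until Kubernetes sends SIGTERM (after the pre-stop hook completed).
	stop := make(chan os.Signal, 1)
	signal.Notify(stop, syscall.SIGTERM, os.Interrupt)
	<-stop

	// Shutdown stops accepting new connections and drains in-flight requests,
	// bounded by a timeout that should stay below terminationGracePeriodSeconds.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Printf("graceful shutdown failed: %v", err)
	}
}
```

On the Kubernetes side, the deployment would point `lifecycle.preStop.httpGet` at `/prestop` and the readiness probe at `/ready`, with the probe period and failure threshold tuned so the endpoint is removed before SIGTERM arrives.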
cc/ @timoreimann