istio: Envoy shutting down before the thing it's wrapping can cause failed requests
Hey folks, we're seeing some brief issues when deploying applications. We've narrowed it down to the fact that, under a normal shutdown process, both envoy and the application it is wrapping get their SIGTERM at the same time.
Envoy will drain and shut down, but if the other application is still doing anything as part of its shutdown process and attempts to make an outbound connection AFTER envoy has shut down, it'll fail because the iptables rules are sending traffic to a non-existent envoy. For example:
- envoy and application1 get SIGTERM
- envoy drains, there are no connections at the moment so this is almost instant
- application1 tries to tell application2 that it's going offline, because that's what it does, and that fails as there's now no envoy
I've a rather hacky workaround: we add a lifecycle hook to istio-proxy which blocks the shutdown until there are no other TCP listeners (thus, half-heartedly ensuring our other service has terminated before envoy starts to terminate).
containers:
- name: istio-proxy
  lifecycle:
    preStop:
      exec:
        # block the proxy's shutdown until no non-envoy TCP listeners remain in the pod
        command: ["/bin/sh", "-c", "while [ $(netstat -plunt | grep tcp | grep -v envoy | wc -l | xargs) -ne 0 ]; do sleep 1; done"]
I guess this kind of falls into the lifecycle conversations we've been having recently (in https://github.com/istio/istio/issues/6642), and it highlights how we need to give consideration to both startup and shutdown in order to have non-service-impacting deployments when using the default Kubernetes RollingUpdate with Istio.
I'm wondering if you can think of a nicer way that this could be accommodated. Basically envoy needs to stay up until the thing it's wrapping has exited. A process check would be better still, because the above hack only works when application1 is exposing a TCP listener - but istio-proxy cannot see the process list of the other containers (unless you can think of some way to do this?).
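A process check could in principle be built on Kubernetes' shareProcessNamespace (a sketch only, assuming a cluster version that supports it and that pgrep is available in the proxy image; application1 is an illustrative process name):
spec:
  shareProcessNamespace: true   # lets containers in the pod see each other's processes
  containers:
  - name: istio-proxy
    lifecycle:
      preStop:
        exec:
          # wait until the wrapped application's process has exited
          command: ["/bin/sh", "-c", "while pgrep -x application1 >/dev/null; do sleep 1; done"]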
Karl
About this issue
- State: closed
- Created 6 years ago
- Reactions: 43
- Comments: 74 (50 by maintainers)
Commits related to this issue
- Attempt to remove the lifecycle hook. https://github.com/istio/istio/issues/7136#issuecomment-533033699 seems to indicate that graceful shutdown is supported now. — committed to tcnghia/serving by deleted user 5 years ago
- Start testing using Istio 1.3. (#5604) * Start testing using Istio 1.3. * Attempt to remove the lifecycle hook. https://github.com/istio/istio/issues/7136#issuecomment-533033699 seems to indicate t... — committed to knative/serving by deleted user 5 years ago
if k8s supported container dependencies, it would be a lot easier…
I feel like envoy (or rather the whole proxy) should stay up until all connections it proxies are drained (or the terminationGrace expires). That'd automatically keep it up until the downstream application has shut itself down.
Also related #10112
We handle this in the most hacky way possible… Our istio-proxy has a preStop hook which fires this script. Basically it waits for any non-envoy ports to stop listening before envoy itself shuts down.
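The script is presumably the same netstat loop as in the original post, roughly:
#!/bin/sh
# keep blocking while any non-envoy TCP listener is still up,
# i.e. while the wrapped application has not yet exited
while [ "$(netstat -plunt | grep tcp | grep -v envoy | wc -l | xargs)" -ne 0 ]; do
  sleep 1
done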
Posting a solution that works for me: I adapted @Stono's solution by manually draining inbound connections.
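One way to drain inbound connections manually is via Envoy's admin API (a sketch, assuming an Envoy version that has the drain_listeners endpoint, the default admin port 15000, and curl available in the proxy image):
#!/bin/sh
# ask envoy to stop accepting new inbound connections, then wait for the
# application's own listeners to disappear before the proxy is allowed to stop
curl -s -X POST "http://localhost:15000/drain_listeners?inboundonly" || true
while [ "$(netstat -plunt | grep tcp | grep -v envoy | wc -l | xargs)" -ne 0 ]; do
  sleep 1
done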
I have some feedback to share when using a preStop hook along the lines of the one above. My k8s deployment has:
- terminationGracePeriodSeconds set to 600s
- readinessProbe set to some HTTP path
When deploying new pods, old pods receive the SIGTERM signal and start to shut down gracefully. The terminationGracePeriodSeconds will keep the pods around in a Terminating state, where we can see that the wrapped container has shut down but istio-proxy has not.
The wrapped container (httpbin here) is now shut down, but the readiness probe will continue to send HTTP requests through the istio-proxy, resulting in 503 response codes.
I had enough time to ssh into the running istio-proxy and run its preStop cmd: it looks like the pilot-agent connection stays open forever, which prevents the preStop hook configured on the istio-proxy container from shutting the container down properly.
I'm currently not sure what the best approach to solve this issue is.
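A rough sketch of the kind of setup described (the probe path, port, and image are illustrative, and the hook is a netstat loop like the one from the original post):
terminationGracePeriodSeconds: 600
containers:
- name: httpbin
  image: docker.io/kennethreitz/httpbin   # illustrative image
  readinessProbe:
    httpGet:
      path: /status/200                   # "some HTTP path" - illustrative
      port: 80
- name: istio-proxy
  lifecycle:
    preStop:
      exec:
        # block shutdown until no non-envoy TCP listeners remain
        command: ["/bin/sh", "-c", "while [ $(netstat -plunt | grep tcp | grep -v envoy | wc -l | xargs) -ne 0 ]; do sleep 1; done"]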
So I think the pod lifecycle has to change to this:
and back
To be clear, my original issue is not really an istio problem; it's more of a kubernetes container-ordering problem that's just exacerbated by istio. As two containers within the same pod will receive the SIGTERM at the same time, there is no logic guaranteeing that pilot-agent doesn't exit (and stop accepting new outbound connections) before your app does, which means that if your application has some shutdown logic which involves creating new connections, they can fail if pilot-agent has already exited (because the iptables rules will be routing them to nothing).
My workaround is more of a kubernetes workaround than an istio one, in that I have a preStop hook block the SIGTERM from getting to pilot-agent until (via a relatively crude check) there are no other ports (owned by my application) listening in the pod, as a signal that the app has exited. It is by no means perfect and is designed to be a stop gap until a better way to handle the problem surfaces.
As a side note, the second a pod is terminated it enters the Terminating state, which removes it from the service endpoints, and thus eventually-consistent downstream envoys stop sending traffic to it. If you're getting 10 minutes of 503s, there's something wrong here that is different to my original post, as that pod shouldn't be getting any traffic at all.
Honestly, so long as pilot-agent and envoy drain their connections correctly (which I believe they do) and envoy continues to accept new outbound connections during its shutdown phase (which I'm unsure about), there isn't much more that istio can do.
Hi @Stono
Thank you very much for your fix - it works as needed for us.
After reading the discussion I do, however, not fully understand why this shouldn't be added as a default to Istio. I can follow the argument that it should be Kubernetes that has a hierarchy for its containers - however, currently it doesn't.
It is nice that the service mesh is a complete abstraction from the application layer. But in the cases I can think of I would never want to have my applications' connections forcefully closed at shutdown - which is what happens when the sidecar gets killed. Hence I actually think your fix puts me better off than having nothing. So why isn't this the general behaviour?
Maybe I'm missing something, and if so I would learn a lot from hearing your thoughts 😃
Thanks for the reply @chadlwilson. Right, that makes more sense; I will just create a new issue 👍
Perhaps there is some missing information, but your issue sounds rather different from the one described here? This was about a lack of graceful-shutdown coordination between the app container and the proxy, causing issues on outgoing calls from the app through the proxy, which would naturally limit the errors to the k8s-level terminationGracePeriod - not 17 minutes? Side note, but outlier detection is not really intended as a mechanism to remove pods that are undergoing a shutdown, so this seems a bit strange.
If a source pod is still routing requests to a target pod and proxy long after that pod has been destroyed (and its Endpoints removed), which is what it looks like based on your graph from the ingress gateway's perspective, it sounds like you have a different problem to me. Nevertheless, this issue is rather old, and my understanding is that approaches to Envoy draining have changed in this time, so I'm not sure this is the right place for the conversation. Perhaps you can raise a new issue and properly report your setup, Istio version, and expected vs observed behaviour in the standard way.
Agreed, this issue has overstayed its welcome 😃.
Karl, do we already have separate issues tracking the specific problems you see?
I actually propose closing off this issue and creating new ones for the specific separate issues that people are reporting here, as it's become a bit of a dumping ground for any old shutdown/drain/eventual-consistency issue 😂
It’s over a year old!
@nmittler drain_listeners was added by @ramaraochavali very recently (a week or so ago, maybe?). I think what we can probably do now is use that draining rather than our original approach of hot restarting with an empty config. So we get SIGTERM, call /drain_listeners, wait X seconds, then exit.
If preStop doesn't send any signal to the app, then the preStop hook would do that
We could also just ship a default preStop hook that does that and then not do anything within pilot-agent?
Would have to test all of this out though, this is just based on my understanding which may not be how it actually behaves
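A rough sketch of that sequence as a default preStop hook (the admin port and the wait time are assumptions, not settled values):
#!/bin/sh
# stop accepting new connections on envoy's listeners, give in-flight
# requests time to finish, then let the SIGTERM shutdown proceed
curl -s -X POST "http://localhost:15000/drain_listeners?graceful" || true
sleep 30   # the "wait X seconds" - tune to the workload's shutdown time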
@andraxylia why was this moved to 1.4? As far as I know there are no short term fixes for this
This is nice. Although we would need some extra work to support this with Istio: this would have to drive pilot-agent instead of Envoy in Istio's case, and we don't yet have an endpoint to enable this.