caddy: lb_try_interval/lb_try_duration do not pick up new backends on config reload
Given the following Caddyfile:
{
    auto_https off
}
:80 {
    reverse_proxy localhost:8003 {
        lb_try_duration 30s
        lb_try_interval 1s
    }
}
I use caddy run and run a separate server on port 8003 (I’m using datasette -p 8003 here) and it proxies correctly. If I shut down my 8003 server and try to hit http://localhost/ I get the desired behaviour - my browser spins for up to 30s, and if I restart my 8003 server during that time the request is proxied through and returned from the backend.
What I’d really like to be able to do though is to start up a new server on another port (actually in production on another IP/port combination) and have traffic resume against the new server.
So I tried editing the Caddyfile to use localhost:8004 instead, started up my backend on port 8004 then used caddy reload to load in the new configuration… and my request to port 80 continued to spin. It appears Caddy didn’t notice that there was now a new configuration for the backend for this reverse_proxy.
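For reference, a sketch of that edited Caddyfile (the only change from the original is the upstream address, here the port 8004 backend from above), applied with caddy reload:
{
    auto_https off
}
:80 {
    # same retry settings as before, only the backend address changed
    reverse_proxy localhost:8004 {
        lb_try_duration 30s
        lb_try_interval 1s
    }
}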
It would be really cool if that lb_try_interval/lb_try_duration feature could respond to updated configurations and seamlessly forward paused traffic to the new backend.
(This started as a Twitter conversation: https://twitter.com/mholt6/status/1463656086360051714)
About this issue
- State: closed
- Created 3 years ago
- Reactions: 7
- Comments: 18 (7 by maintainers)
Commits related to this issue
- Implement SRV and A/AAA upstream sources Also get upstreams at every retry loop iteration instead of just once before the loop. See #4442. — committed to caddyserver/caddy by mholt 3 years ago
- reverseproxy: Dynamic upstreams (with SRV and A/AAAA support) (#4470) * reverseproxy: Begin refactor to enable dynamic upstreams Streamed here: https://www.youtube.com/watch?v=hj7yzXb11jU * Imp... — committed to caddyserver/caddy by mholt 2 years ago
@simonw does the Dynamic Upstreams feature solve your use case? https://caddyserver.com/docs/caddyfile/directives/reverse_proxy#dynamic-upstreams
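For example, a minimal sketch using the A/AAAA dynamic upstream source (the hostname, port, and refresh interval below are placeholders; see the linked docs for the full set of options):
:80 {
    reverse_proxy {
        # resolve upstreams from DNS A/AAAA records instead of a static list
        dynamic a {
            name backend.example.com
            port 8003
            refresh 1s
        }
        lb_try_duration 30s
        lb_try_interval 1s
    }
}
With this, the upstream list is resolved on the fly rather than fixed at config load, so pointing the DNS name at a new backend does not require a config reload.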
Also worth noting, we just merged https://github.com/caddyserver/caddy/pull/4756 which adds lb_retries, i.e. a number of retries to perform. Probably not necessarily useful for you here, but I wanted to mention it because this issue is related to retries.
I think we can probably close this issue now.
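For reference, a rough sketch of how it sits alongside the existing retry options (the backend addresses are placeholders; as I understand it, if lb_try_duration is also set, retrying stops when either limit is reached):
:80 {
    reverse_proxy localhost:8003 localhost:8004 {
        # give up after 5 retries or 30s, whichever comes first
        lb_retries 5
        lb_try_duration 30s
        lb_try_interval 1s
    }
}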
I’ve implemented the getting of upstreams “per retry” in #4470. The actual API endpoint to adjust the upstreams specifically will have to come in a future PR.
I wonder if an API endpoint that just adds/removes backends without a config reload could be helpful.
The other piece of this would be that we'd have to get the list of upstreams in each iteration of the for loop instead of just once before the for loop. Or maybe we'd only have to do that if the list of upstreams changed since that iteration started. Hmmm.
If you feel like watching Matt talk about it for 50 minutes 😅, he streamed the beginning of the work on refactoring the upstreams logic: https://youtu.be/hj7yzXb11jU
tl;dw, right now the upstreams are a static list, but the plan is to make it possible to have a dynamic list of upstreams, and the source could be whatever (you could write a custom module to provide the list on the fly, via SRV, or maybe fetch from HTTP and cache it for a few seconds, I dunno, whatever you like).
Point of note @mholt: for this to work though, it would need to fetch the list of upstreams on every retry iteration and not just once before the loop.
Hmm, I don’t think so, because Caddy still needs to load a new config, and the new one won’t take effect while there are still pending requests. Any config change replaces the entire server’s config; it’s not targeted.
The issue is still valid though. You’d want to proxy directly to pods in many scenarios, so you can take advantage of lb_policy and others.
Oh I see what you mean! Yes, that’s a fantastic idea, I shall try that.
You can still do the same if you use a service on top of your pods. The only difference is that K8s would be the one resolving the IP & port for you. Caddy can simply proxy to the service address without knowing which pod it lands on.
When you bring down your only pod, the service is effectively down, and Caddy can do its normal retries until a new pod is online, i.e. the service is back up.
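For instance, a sketch of proxying to a Service address instead of individual pods (the service DNS name and port are placeholders for whatever your cluster exposes):
:80 {
    # K8s resolves the Service to whichever pods are healthy;
    # Caddy just retries against the Service address
    reverse_proxy my-app.my-namespace.svc.cluster.local:8080 {
        lb_try_duration 30s
        lb_try_interval 1s
    }
}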