fasthttp: fasthttp behind AWS load balancer. Keepalive connections are causing trouble
Hi!
We’re using a light, fast fasthttp server as a proxy in our services infrastructure. However, we’ve been experiencing some issues when we put it behind an Amazon load balancer. Intermittently, and seemingly at random, the ALB returns a 502 because the request can’t reach the fasthttp service. Note that the ALB uses keepalive connections by default and that can’t be changed.
After some research, we suspected that fasthttp was closing the keepalive connections at some point, so the ALB couldn’t reuse them and returned a 502.
If we set Server.DisableKeepalive = true, everything works as expected (with a lot more load, of course).
We reduced our implementation to the minimum to test:
	s := &fasthttp.Server{
		Handler:     OurHandler,
		Concurrency: fasthttp.DefaultConcurrency,
	}
	s.DisableKeepalive = true // If this is false, we see the error randomly.
	log.Fatal(s.ListenAndServe(":" + strconv.Itoa(port)))
The handler basically does this:
	// h is an instance of *fasthttp.HostClient configured with some parameters
	if err := h.proxy.Do(req, resp); err != nil {
		log.Error("error when proxying the request: ", err)
	}
Is there any chance someone has experienced this? I’m not sure how we should proceed with the keepalive connections in the fasthttp.Server, as we are using pretty much all the default parameters.
Thanks in advance!
About this issue
- State: closed
- Created 6 years ago
- Comments: 42 (1 by maintainers)
You might also want to try my fork of fasthttp which is actually being maintained (this original version is not maintained anymore): https://github.com/erikdubbelboer/fasthttp
@erikdubbelboer thanks for the quick response. Yes, I remove the Connection header before sending the request to the proxied service, and again before sending the response back to the client (the ELB in this case).
Like this:
That’s going to be my next test: using this code behind the ELB. Thanks for the heads-up.
@Hemant-Mann autoscale might be at 60%, but individual machines might still be at 100%. We had the same issue in the past. Can you check whether any machines are at 100% around the time of the errors?
@erikdubbelboer have a good trip 😉