ingress-nginx: Nginx Ingress Timeout Configuration and Persistent 499 and 504 Errors
What happened:
Hey there,
When using Nginx Ingress, I’m experiencing a timeout issue with long response times, causing a 499 error on the Nginx side and a 504 gateway timeout on the client side.
Timeout values configured for Nginx Ingress:
- proxy-send-timeout: 300
- proxy-read-timeout: 300
- proxy-connect-timeout: 300
- client-body-timeout: 300
- client-header-timeout: 300
- upstream-keepalive-timeout: 300
- keep-alive: 300
Despite setting these timeout values, the following issues persist:
- Error on Nginx side: 499
- Error on the client side: 504 Gateway Timeout
I have verified in the controller shell that the timeout values were applied correctly.
I have also restarted the controller pods associated with Nginx Ingress.
What you expected to happen:
Upstream responses that take longer than 60 seconds should succeed.
NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):
NGINX Ingress controller
Release: v1.4.0
Build: 50be2bf95fd1ef480420e2aa1d6c5c7c138c95ea
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.19.10
Kubernetes version (use kubectl version): 1.23
Environment: EKS
How was the ingress-nginx-controller installed: Helm, chart ingress-nginx-4.3.0 (app version 1.4.0)
Ingress configuration
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: sleep
  annotations:
    nginx.ingress.kubernetes.io/force-ssl-redirect: 'true'
    nginx.ingress.kubernetes.io/proxy-read-timeout: '3600'
spec:
  ingressClassName: nginx
  rules:
    - host: sleep.domain.xyz
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: sleep
                port:
                  number: 3000
How to reproduce this issue:
Install nginx-ingress using Helm chart “ingress-nginx/ingress-nginx” version “4.3.0” with the following configuration:
config:
  use-forwarded-headers: 'true'
  large-client-header-buffers: '4 16k'
  proxy-body-size: '20m'
  proxy-send-timeout: 300
  proxy-read-timeout: 300
  proxy-connect-timeout: 300
  client-body-timeout: 300
  client-header-timeout: 300
  upstream-keepalive-timeout: 300
  keep-alive: 300
Create an ingress rule pointing to an HTTP server that simulates long response times, for example, bryanlatten/docker-dangling-http.
Request the HTTP server with a response delay of more than 60 seconds:
❯ curl https://sleep.domain.xyz/\?d\=65 -v
* Trying X.X.X.X:443...
* Connected to sleep.domain.xyz (X.X.X.X) port 443 (#0)
* ALPN: offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
* CAfile: /etc/ssl/cert.pem
* CApath: none
* (304) (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN: server did not agree on a protocol. Uses default.
* Server certificate:
* subject: CN=domain.xyz
* start date: Feb 13 00:00:00 2023 GMT
* expire date: Oct 3 23:59:59 2023 GMT
* subjectAltName: host "sleep.domain.xyz" matched cert's "*.domain.xyz"
* issuer: C=US; O=Amazon; CN=Amazon RSA 2048 M02
* SSL certificate verify ok.
* using HTTP/1.x
> GET /?d=65 HTTP/1.1
> Host: sleep.domain.xyz
> User-Agent: curl/7.88.1
> Accept: */*
>
< HTTP/1.1 504 GATEWAY_TIMEOUT
< Content-Length: 0
< Connection: keep-alive
<
* Connection #0 to host sleep.domain.xyz left intact
Nginx controller logs should print the following line:
2023-06-19 15:26:11 | log="X.X.X.X - - [19/Jun/2023:06:26:11 +0000] \"GET /?d=65 HTTP/1.1\" 499 0 \"-\" \"curl/7.88.1\" 192 59.075 [namespace-sleep-3000] [] 192.168.211.183:3000 0 59.076 - 4090286b322a6e0ab4df376dd61681ab\n"
About this issue
- State: open
- Created a year ago
- Reactions: 3
- Comments: 21 (4 by maintainers)
Following up on this ticket: https://github.com/kubernetes/ingress-nginx/issues/2007#issuecomment-374856607
Before the change,
I made the change;
now I can see the difference.
Let me run the test.
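"499 error on the Nginx side and a 504 gateway timeout on the client side"
=> That means there is a hidden proxy layer between Nginx and the client: client <-> hidden proxy <-> nginx proxy. Nginx logs 499 when the client side of its connection is closed before a response is sent, so here the hidden proxy gave up first and returned the 504 itself. Editing the Nginx proxy config will therefore not help in this case. I found that we need to edit the ELB attribute "Idle timeout" (default 60s); the 59.075-second request time in the controller log is consistent with that default. The idle timeout can be changed through the annotations in the .yml file of ingress-nginx, and other distributions of the ingress-nginx deployment have an equivalent setting.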
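The snippets from the original comment are not preserved here, so the following is only a sketch of the kind of annotation it describes. It assumes the hidden proxy is an AWS Classic ELB created by the ingress-nginx controller Service of type LoadBalancer, in which case the idle timeout can be set with the service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout annotation. The Service name, namespace, and the 3600-second value are assumptions chosen for illustration.

```yaml
# Sketch only: Service name and namespace are assumed, adjust to your install.
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
  annotations:
    # Idle timeout of the Classic ELB in seconds (the ELB default is 60).
    service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "3600"
spec:
  type: LoadBalancer
  # ... rest of the existing Service spec unchanged
```

When the controller is installed with the ingress-nginx Helm chart, the same annotation would typically go under controller.service.annotations in the values file:

```yaml
# Sketch only: Helm values fragment for the ingress-nginx chart.
controller:
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "3600"
```

Whatever value is chosen, it probably makes sense to keep the load balancer timeout at least as high as the Nginx proxy-read-timeout, so that Nginx rather than the outer proxy decides when a slow request is abandoned.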
PS: This config takes a few minutes, sometimes up to 20 minutes, to be applied. I'm not sure why, or how to apply it immediately.
Experiencing the same issue.
Experiencing the same issue. Can’t extend the timeout past 1.7m using the same configuration as @Adonisbruno deployed using Helm.
I’m experiencing the same issue. Can’t extend keepalive timeouts beyond 60 seconds.
The problem here was an edge HAProxy whose existence I was not even aware of. After increasing the client and server timeouts in HAProxy, the tweaks made in Nginx started to work smoothly. I keep the HAProxy timeout values higher than the Nginx ones.
@strongjz Yes.