ingress-nginx: Nginx Ingress Timeout Configuration and Persistent 499 and 504 Errors

What happened:

Hey there,

When using Nginx Ingress, I'm experiencing a timeout issue with long-running requests: Nginx logs a 499 error on its side, and the client receives a 504 Gateway Timeout.

Timeout values configured for Nginx Ingress (via the controller ConfigMap; a sketch follows the list):

  • proxy-send-timeout: 300
  • proxy-read-timeout: 300
  • proxy-connect-timeout: 300
  • client-body-timeout: 300
  • client-header-timeout: 300
  • upstream-keepalive-timeout: 300
  • keep-alive: 300
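
For reference, a minimal sketch of how these keys land in the controller ConfigMap; the name and namespace below are the Helm chart defaults and are an assumption, and ConfigMap values must be strings:

apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller   # default for a release named "ingress-nginx" (assumption)
  namespace: ingress-nginx
data:
  proxy-send-timeout: '300'
  proxy-read-timeout: '300'
  proxy-connect-timeout: '300'
  client-body-timeout: '300'
  client-header-timeout: '300'
  upstream-keepalive-timeout: '300'
  keep-alive: '300'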

Despite setting these timeout values, the following issues persist:

  • Error on Nginx side: 499
  • Error on the client side: 504 Gateway Timeout

I have verified in the controller shell that the timeout values were applied correctly.
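
For reference, the check can be done along these lines (deployment and namespace names are the chart defaults, an assumption):

# grep the rendered nginx.conf inside the controller pod
kubectl -n ingress-nginx exec deploy/ingress-nginx-controller -- \
    grep "timeout" /etc/nginx/nginx.conf | sort -u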

I have also restarted the controller pods associated with Nginx Ingress.

What you expected to happen:

Upstream responses taking longer than 60 seconds should complete successfully.

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):

NGINX Ingress controller
Release:       v1.4.0
Build:         50be2bf95fd1ef480420e2aa1d6c5c7c138c95ea
Repository:    https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.19.10

Kubernetes version (use kubectl version): 1.23

Environment: EKS

How was the ingress-nginx-controller installed: Helm, chart ingress-nginx 4.3.0 (controller v1.4.0)

Ingress configuration

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
    name: sleep
    annotations:
        nginx.ingress.kubernetes.io/force-ssl-redirect: 'true'
        nginx.ingress.kubernetes.io/proxy-read-timeout: '3600'
spec:
    ingressClassName: nginx
    rules:
        - host: sleep.domain.xyz
          http:
              paths:
                  - path: /
                    pathType: Prefix
                    backend:
                        service:
                            name: sleep
                            port:
                                number: 3000

How to reproduce this issue:

Install nginx-ingress using Helm chart “ingress-nginx/ingress-nginx” version “4.3.0” with the following configuration (an example install command follows the block):

    config:
        use-forwarded-headers: 'true'
        large-client-header-buffers: '4 16k'
        proxy-body-size: '20m'
        proxy-send-timeout: '300'
        proxy-read-timeout: '300'
        proxy-connect-timeout: '300'
        client-body-timeout: '300'
        client-header-timeout: '300'
        upstream-keepalive-timeout: '300'
        keep-alive: '300'
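
For reference, a minimal install sketch; the release name, namespace, and values file name are illustrative, and in the chart's values the config block above sits under controller.config:

# Add the official chart repo, then install pinned to chart version 4.3.0
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ingress-nginx ingress-nginx/ingress-nginx \
    --version 4.3.0 \
    --namespace ingress-nginx --create-namespace \
    -f values.yaml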

Create an ingress rule pointing to an HTTP server that simulates long response times, for example bryanlatten/docker-dangling-http (a backend sketch follows).
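
For reference, a minimal backend sketch matching the Ingress above. The container port and the d= delay parameter are assumptions inferred from the Ingress backend port and the curl call below; the image name is taken verbatim from the reference and may differ on your registry:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sleep
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sleep
  template:
    metadata:
      labels:
        app: sleep
    spec:
      containers:
        - name: sleep
          image: bryanlatten/docker-dangling-http   # assumed to listen on port 3000
          ports:
            - containerPort: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: sleep   # matches the Ingress backend service name
spec:
  selector:
    app: sleep
  ports:
    - port: 3000
      targetPort: 3000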

Try reaching the HTTP server with a response time of over 60 seconds:

❯ curl https://sleep.domain.xyz/\?d\=65 -v
*   Trying X.X.X.X:443...
* Connected to sleep.domain.xyz (X.X.X.X) port 443 (#0)
* ALPN: offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN: server did not agree on a protocol. Uses default.
* Server certificate:
*  subject: CN=domain.xyz
*  start date: Feb 13 00:00:00 2023 GMT
*  expire date: Oct  3 23:59:59 2023 GMT
*  subjectAltName: host "sleep.domain.xyz" matched cert's "*.domain.xyz"
*  issuer: C=US; O=Amazon; CN=Amazon RSA 2048 M02
*  SSL certificate verify ok.
* using HTTP/1.x
> GET /?d=65 HTTP/1.1
> Host: sleep.domain.xyz
> User-Agent: curl/7.88.1
> Accept: */*
>
< HTTP/1.1 504 GATEWAY_TIMEOUT
< Content-Length: 0
< Connection: keep-alive
<
* Connection #0 to host sleep.domain.xyz left intact

Nginx controller logs should print the following line (note the request time of ~59 s, just under a 60 s limit somewhere along the path):

X.X.X.X - - [19/Jun/2023:06:26:11 +0000] "GET /?d=65 HTTP/1.1" 499 0 "-" "curl/7.88.1" 192 59.075 [namespace-sleep-3000] [] 192.168.211.183:3000 0 59.076 - 4090286b322a6e0ab4df376dd61681ab

About this issue

  • State: open
  • Created a year ago
  • Reactions: 3
  • Comments: 21 (4 by maintainers)

Most upvoted comments

Following up from this ticket: https://github.com/kubernetes/ingress-nginx/issues/2007#issuecomment-374856607

Before the change:

kubectl exec -ti ingress-nginx-controller-7d5fb757db-f66kp -- bash
bash-5.1$ cat /etc/nginx/nginx.conf | grep "proxy.*timeout" | sort -u
			proxy_connect_timeout                   5s;
			proxy_next_upstream_timeout             0;
			proxy_read_timeout                      60s;
			proxy_send_timeout                      60s;

Make the change:

  annotations:
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "1800"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "1800"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "1800"

Now I can see the difference (the annotations add per-location overrides alongside the 60s/5s global defaults):

bash-5.1$ cat /etc/nginx/nginx.conf | grep "proxy.*timeout" | sort -u
			proxy_connect_timeout                   1800s;
			proxy_connect_timeout                   5s;
			proxy_next_upstream                     error timeout;
			proxy_next_upstream_timeout             0;
			proxy_read_timeout                      1800s;
			proxy_read_timeout                      60s;
			proxy_send_timeout                      1800s;
			proxy_send_timeout                      60s;

Let me run the test.

A 499 error on the Nginx side and a 504 Gateway Timeout on the client side means there is a hidden proxy layer between the client and Nginx: client <-> hidden proxy <-> nginx proxy. So editing the Nginx proxy config alone will not help in this case. I found out that we also need to edit the ELB attribute "Idle timeout" (default 60s), by editing the annotations on the ingress-nginx Service manifest, like:

apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "600"
    service.beta.kubernetes.io/....

Or, for other flavors of deployment where an AWS ALB sits in front (managed by the AWS Load Balancer Controller), it looked like:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    ...
    alb.ingress.kubernetes.io/load-balancer-attributes: idle_timeout.timeout_seconds=600
    alb.ingress.kubernetes.io/scheme: ...
    alb.ingress.kubernetes.io/target-type: ...

PS: This config takes several minutes, sometimes up to 20, to be applied. I'm not sure why, or how to apply it immediately.
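
One way to confirm whether the attribute has propagated (a sketch for a classic ELB; the load balancer name is a hypothetical placeholder):

# Prints ConnectionSettings.IdleTimeout among other attributes
aws elb describe-load-balancer-attributes --load-balancer-name my-ingress-elb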

Experiencing the same issue.

Experiencing the same issue. Can't extend the timeout past 1.7 minutes using the same configuration as @Adonisbruno, deployed using Helm.

I’m experiencing the same issue. Can’t extend keepalive timeouts beyond 60 seconds.

nginx/1.21.6
Client Version: v1.26.2
Kustomize Version: v4.5.7
Server Version: v1.24.14-gke.1200

I have the same problem. I used the same config to try to solve it, but changed 300 to 600:

  • proxy-send-timeout: 600
  • proxy-read-timeout: 600
  • proxy-connect-timeout: 600
  • client-body-timeout: 600
  • client-header-timeout: 600
  • upstream-keepalive-timeout: 600
  • keep-alive: 600
  • use-proxy-protocol: “true”
  • proxy-body-size: “300m”

And using the browser inspector, I notice that the connection always closes after 1.7 min (see attached screenshot).

The nginx log shows the same 499, this time as HTTP/2.0" 499.

My additional app-side configs, set via annotations, are:

  • nginx.ingress.kubernetes.io/affinity: “cookie”
  • nginx.ingress.kubernetes.io/session-cookie-secure: “true”
  • nginx.ingress.kubernetes.io/session-cookie-samesite: “Lax”
  • nginx.ingress.kubernetes.io/session-cookie-name: “sigscookie”

I'm using Helm chart v4.2.5. @theophileds did you solve it?

The problem here was an edge HAProxy that I was not even aware existed. After increasing the client and server timeouts in HAProxy, the tweaks made in nginx started to work smoothly. I keep the HAProxy values higher than the nginx ones.
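
For reference, a minimal haproxy.cfg sketch of that fix; the 1900s values are illustrative, chosen to stay above the nginx proxy timeouts discussed earlier in the thread:

defaults
    mode http
    timeout connect 5s
    # keep client/server timeouts above the nginx proxy-read/send timeouts
    timeout client  1900s
    timeout server  1900s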