ingress-nginx: Very slow response time

nginx-ingress-controller 0.9.0-beta.8 @ 10 req/s

[screenshot: latency distribution]

traefik:latest @ 80 req/s

[screenshot: latency distribution]

I'm not sure why, but running the same service in Kubernetes, traefik blows the nginx ingress controller's performance out of the water. Is this a normal response curve, where 90+% of requests are slower than 1.2s? Something seems off.

  • Kubernetes 1.6.4 with Weave Net 1.9.7 (cluster created with kops)

Below is the YAML config for the nginx ingress controller:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ingress-nginx
  namespace: kube-ingress

---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: ingress-nginx
rules:
- apiGroups: ["*"]
  resources: ["*"]
  verbs: ["*"]

---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: ingress-nginx
subjects:
- kind: ServiceAccount
  name: ingress-nginx
  namespace: kube-ingress
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ingress-nginx

---
kind: ConfigMap
apiVersion: v1
metadata:
  name: ingress-nginx
  namespace: kube-ingress
  labels:
    k8s-addon: ingress-nginx.addons.k8s.io
data:
  use-proxy-protocol: "true"
  server-tokens: "false"
  secure-backends: "false"
  proxy-body-size: "1m"
  use-http2: "false"
  ssl-redirect: "true"

---
kind: Service
apiVersion: v1
metadata:
  name: ingress-nginx
  namespace: kube-ingress
  labels:
    k8s-addon: ingress-nginx.addons.k8s.io
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
spec:
  type: LoadBalancer
  selector:
    app: ingress-nginx
  ports:
  - name: http
    port: 80
    targetPort: http
  - name: https
    port: 443
    targetPort: https
---
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2alpha1
metadata:
  name: ingress-nginx
  namespace: kube-ingress
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: ingress-nginx
  minReplicas: 4
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: 50
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: ingress-nginx
  namespace: kube-ingress
  labels:
    k8s-addon: ingress-nginx.addons.k8s.io
spec:
  replicas: 4
  template:
    metadata:
      labels:
        app: ingress-nginx
        k8s-addon: ingress-nginx.addons.k8s.io
    spec:
      terminationGracePeriodSeconds: 60
      serviceAccountName: ingress-nginx
      containers:
      - image: gcr.io/google_containers/nginx-ingress-controller:0.9.0-beta.8
        name: ingress-nginx
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 300m
            memory: 300Mi
        ports:
          - name: http
            containerPort: 80
            protocol: TCP
          - name: https
            containerPort: 443
            protocol: TCP
        readinessProbe:
          httpGet:
            path: /healthz
            port: 10254
            scheme: HTTP
          initialDelaySeconds: 5
          timeoutSeconds: 5
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /healthz
            port: 10254
            scheme: HTTP
          initialDelaySeconds: 5
          timeoutSeconds: 5
          periodSeconds: 120
        env:
          - name: POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
        args:
        - /nginx-ingress-controller
        - --default-backend-service=$(POD_NAMESPACE)/nginx-default-backend
        - --configmap=$(POD_NAMESPACE)/ingress-nginx
        - --publish-service=$(POD_NAMESPACE)/ingress-nginx
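
Note that the args above point at a nginx-default-backend service in the same namespace, but no such service is defined in this manifest, and the controller expects it to exist at startup. A minimal sketch of the missing piece (the image tag is illustrative; the stock defaultbackend serves a 404 on port 8080 plus a /healthz endpoint):

---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: nginx-default-backend
  namespace: kube-ingress
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx-default-backend
    spec:
      containers:
      - name: default-backend
        # stock 404 backend; pin whatever tag matches your controller version
        image: gcr.io/google_containers/defaultbackend:1.4
        ports:
        - containerPort: 8080

---
kind: Service
apiVersion: v1
metadata:
  name: nginx-default-backend
  namespace: kube-ingress
spec:
  selector:
    app: nginx-default-backend
  ports:
  - port: 80
    targetPort: 8080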

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Reactions: 3
  • Comments: 19 (9 by maintainers)

Most upvoted comments

Just an update on this: we found a problem with Weave registering/unregistering node IP addresses when scaling (both up and down), which meant traffic was still being routed to nodes that no longer existed, leading to the timeouts and poor performance. I am not sure why Traefik didn't suffer the same issue, but with flannel I have not been able to reproduce these problems and nginx-ingress-controller has been working very well.
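
For anyone hitting the same symptom: a quick way to check whether Weave is still advertising departed nodes is to compare its peer list against the current node list. A minimal sketch, assuming the standard weave-net DaemonSet in kube-system (label and script path may differ per install):

$ WEAVE_POD=$(kubectl -n kube-system get pods -l name=weave-net -o jsonpath='{.items[0].metadata.name}')
$ kubectl -n kube-system exec $WEAVE_POD -c weave -- /home/weave/weave --local status peers
$ kubectl get nodes    # any peer above that isn't in this list is stale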

Create a cluster using kops in us-west-2

export MASTER_ZONES=us-west-2a
export WORKER_ZONES=us-west-2a,us-west-2b
export KOPS_STATE_STORE=s3://k8s-xxxxxx-01
export AWS_DEFAULT_REGION=us-west-2

kops create cluster \
 --name uswest2-01.xxxxxxx.io \
 --cloud aws \
 --master-zones $MASTER_ZONES \
 --node-count 2 \
 --zones $WORKER_ZONES \
 --master-size m3.medium \
 --node-size m4.large \
 --ssh-public-key ~/.ssh/id_rsa.pub \
 --image coreos.com/CoreOS-stable-1409.5.0-hvm \
 --yes
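
Wait for the cluster to converge before deploying anything; something like:

$ kops validate cluster --name uswest2-01.xxxxxxx.io
$ kubectl get nodes    # all nodes should report Ready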

Create the echoheaders deployment

echo "
apiVersion: v1
kind: Service
metadata:
  name: echoheaders
  labels:
    app: echoheaders
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
    name: http
  selector:
    app: echoheaders

---

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: echoheaders
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: echoheaders
    spec:
      containers:
      - name: echoheaders
        image: gcr.io/google_containers/echoserver:1.4
        ports:
        - containerPort: 8080

---

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: echoheaders-nginx
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - host: echoheaders.uswest2-01.xxxxx-xxxx.io
    http:
      paths:
      - backend:
          serviceName: echoheaders
          servicePort: 80
" | kubectl create -f -

Create the nginx ingress controller

$ kubectl create -f https://gist.githubusercontent.com/gugahoi/9f6fbb497899a1a4bca0973be0aad88d/raw/e5943bd71b1b36a68f7f44b7058e3c8f5d3938ab/nginx.yml
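
Then confirm the controller pods are up and the ELB has an address (DNS for the test host should point at it):

$ kubectl -n kube-ingress get pods -l app=ingress-nginx
$ kubectl -n kube-ingress get svc ingress-nginx    # EXTERNAL-IP shows the ELB hostname once provisioned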

Run the tests*

$ echo "GET http://echoheaders.uswest2-01.xxxxxxxx.io" | vegeta attack -duration=10s | tee results.bin | vegeta report
Requests      [total, rate]            500, 50.10
Duration      [total, attack, wait]    10.166760623s, 9.979999939s, 186.760684ms
Latencies     [mean, 50, 95, 99, max]  195.616624ms, 184.568905ms, 263.178239ms, 392.353651ms, 496.3604ms
Bytes In      [total, mean]            283000, 566.00
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:500  
Error Set:
$ echo "GET http://echoheaders.uswest2-01.xxxxxxxx.io" | vegeta attack -duration=10s | tee results.bin | vegeta report
Requests      [total, rate]            500, 50.10
Duration      [total, attack, wait]    10.166276105s, 9.979999857s, 186.276248ms
Latencies     [mean, 50, 95, 99, max]  215.832082ms, 187.310956ms, 376.512348ms, 441.538481ms, 618.168085ms
Bytes In      [total, mean]            283000, 566.00
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:500  
Error Set:
$ kubectl scale deployment echoheaders --replicas=2
deployment "echoheaders" scaled
$ echo "GET http://echoheaders.uswest2-01.xxxxxxxx.io" | vegeta attack -duration=10s | tee results.bin | vegeta report
Requests      [total, rate]            500, 50.10
Duration      [total, attack, wait]    10.161263296s, 9.979999715s, 181.263581ms
Latencies     [mean, 50, 95, 99, max]  212.086374ms, 190.977764ms, 375.135607ms, 549.585284ms, 705.742581ms
Bytes In      [total, mean]            283000, 566.00
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:500  
Error Set:
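
To see the full latency distribution rather than just the percentiles (which is what the screenshots at the top of the issue show), vegeta can render a plot or a histogram from the same results.bin; the flags below are for the vegeta versions of that era:

$ cat results.bin | vegeta report -reporter=plot > plot.html
$ cat results.bin | vegeta report -reporter='hist[0,200ms,400ms,800ms,1200ms]'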

Delete the cluster

kops delete cluster  --name uswest2-01.xxxxxxxx.io --yes

Important:

  • The sizes of the nodes are smaller than yours
  • m4.xlarge provides 4 vCPUs vs 2 in m4.large (this affects the number of nginx workers; see the check after this list)
  • Keep in mind that I’m running vegeta from Chile 😉
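
To verify the worker-count point above, you can grep the generated config inside a controller pod (the pod lookup is illustrative; /etc/nginx/nginx.conf is the path the controller image uses):

$ POD=$(kubectl -n kube-ingress get pods -l app=ingress-nginx -o jsonpath='{.items[0].metadata.name}')
$ kubectl -n kube-ingress exec $POD -- grep worker_processes /etc/nginx/nginx.conf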

@gugahoi can we close this issue?

@treacher the default is kubenet
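
(kops picks the CNI at cluster-creation time via the --networking flag; for example, to get Weave instead of kubenet:)

$ kops create cluster --networking weave ...    # other flags as in the example above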

$ kops version
Version 1.6.2 (git-98ae12a)

@gugahoi what size are the nodes, and how many? Can you post information about the application you are testing (or repeat the test using the echoheaders image so I can replicate it)? Also, please post the traefik deployment so I can use the same configuration.

What application are you using to execute the test?