ingress-nginx: Controller uses wrong labels during shutdown to determine whether multiple pods are running

During an ingress-nginx Helm upgrade with multiple replicas, the controller sometimes removes the loadBalancerIP status from all ingress resources; after a few seconds the loadBalancerIPs are added back. This happens because the leader controller thinks it is the last pod left and therefore cleans up the ingress status, and it only occurs when the last pod of the old version is the leader. On shutdown, the controller checks whether any pods with the same labels are left, but Helm adds the helm.sh/chart=ingress-nginx-4.8.0 and app.kubernetes.io/version=1.9.0 labels, and those change when the Helm chart is upgraded. Because it finds no other controller pods, it concludes it is the last one and cleans up the ingress statuses.
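Concretely, the version-scoped labels differ between the old and new pods during the rollout (values taken from this report), so an exact-match selector built from the old leader's labels matches none of the new pods:

old pod (chart 4.8.0)                 new pod (chart 4.8.1)
helm.sh/chart=ingress-nginx-4.8.0     helm.sh/chart=ingress-nginx-4.8.1
app.kubernetes.io/version=1.9.0       app.kubernetes.io/version=1.9.1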

What happened:

  1. We have 2 pods of chart version 4.8.0 and all ingresses have a status with a loadBalancerIP.
  2. We run the helm upgrade command to upgrade the pods to chart version 4.8.1.
  3. This creates a new ReplicaSet and performs a rolling upgrade of the pods.
  4. A new pod (chart version 4.8.1) is created and becomes healthy.
  5. One of the old pods (chart version 4.8.0) is removed by Kubernetes; the leader is still one of the old pods.
  6. The second new pod (chart version 4.8.1) is created and becomes healthy.
  7. The last old pod (chart version 4.8.0) is removed by Kubernetes. Because there are no pods with the same labels anymore, it removes the ingress statuses (this can be seen by running kubectl get ingress: there will be no address on the ingress).
  8. One of the new pods (chart version 4.8.1) is elected as leader and updates the ingress statuses with the loadBalancerIP.

Between the last two steps there is no loadBalancerIP on the ingresses. Tools that rely on this information, such as external-dns, therefore remove the DNS entry, and the ingress domain is not resolvable for a while.

This results in the following (no address on the ingress):

kubectl get ingress
NAME           CLASS   HOSTS         ADDRESS   PORTS     AGE
test-ingress   nginx   example.com             80, 443   90d

What you expected to happen:

We expect the leader controller to see the pods from the new version and know it is not the last pod, which means it should not clean up the ingress statuses.

  1. We have 2 pods of chart version 4.8.0 and all ingresses have a status with a loadBalancerIP.
  2. We run the helm upgrade command to upgrade the pods to chart version 4.8.1.
  3. This creates a new ReplicaSet and performs a rolling upgrade of the pods.
  4. A new pod (chart version 4.8.1) is created and becomes healthy.
  5. One of the old pods (chart version 4.8.0) is removed by Kubernetes; the leader is still one of the old pods.
  6. The second new pod (chart version 4.8.1) is created and becomes healthy.
  7. The last old pod (chart version 4.8.0) is removed by Kubernetes. It sees the new pods and knows it is not the last one.
  8. One of the new pods (chart version 4.8.1) is elected as leader and updates the ingress statuses with the loadBalancerIP.

Throughout the entire upgrade we would expect:

NAME           CLASS   HOSTS   ADDRESS       PORTS     AGE
test-ingress   nginx   *       zz.zz.zz.zz   80, 443   90d

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version): v1.9.1

Kubernetes version (use kubectl version): v1.27.3

Environment:

  • Cloud provider or hardware configuration: AKS
  • OS (e.g. from /etc/os-release): AKSCBLMariner-V2gen2-202309.06.0
  • Kernel (e.g. uname -a): 5.15.126.1-1.cm2
  • Install tools: AKS
  • Basic cluster related info:
    • kubectl version
    • kubectl get nodes -o wide
kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.4", GitCommit:"fa3d7990104d7c1f16943a67f11b154b71f6a132", GitTreeState:"clean", BuildDate:"2023-07-19T12:20:54Z", GoVersion:"go1.20.6", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.3", GitCommit:"8b6cfe2c7c54ae110e0c2dbcc52b468bc08bf5f6", GitTreeState:"clean", BuildDate:"2023-07-28T22:18:46Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"linux/amd64"}

kubectl get nodes -o wide
NAME                                 STATUS   ROLES   AGE     VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE            KERNEL-VERSION     CONTAINER-RUNTIME
aks-d2adsweu1-38256127-vmss000000    Ready    agent   7d22h   v1.27.3   10.110.17.100   <none>        CBL-Mariner/Linux   5.15.126.1-1.cm2   containerd://1.6.22
aks-d2adsweu1-38256127-vmss000001    Ready    agent   7d22h   v1.27.3   10.110.18.39    <none>        CBL-Mariner/Linux   5.15.126.1-1.cm2   containerd://1.6.22
aks-d2adsweu1-38256127-vmss000002    Ready    agent   5d20h   v1.27.3   10.110.16.10    <none>        CBL-Mariner/Linux   5.15.126.1-1.cm2   containerd://1.6.22
aks-systemweu1-22699295-vmss000000   Ready    agent   8d      v1.27.3   10.110.16.114   <none>        CBL-Mariner/Linux   5.15.126.1-1.cm2   containerd://1.6.22
aks-systemweu1-22699295-vmss000001   Ready    agent   8d      v1.27.3   10.110.16.223   <none>        CBL-Mariner/Linux   5.15.126.1-1.cm2   containerd://1.6.22

How was the ingress-nginx-controller installed:

We use ArgoCD with the ingress-nginx helm chart version 4.8.1 (or 4.8.0 before upgrade).

The following values are used:

controller:
  resources:
    requests:
      memory: 150Mi
    limits:
      memory: 300Mi
  replicaCount: 2
  allowSnippetAnnotations: true
  service:
    externalTrafficPolicy: 'Local'
    external:
      enabled: true

Current State of the controller:

kubectl describe ingressclasses
Name:         nginx
Labels:       app.kubernetes.io/component=controller
              app.kubernetes.io/instance=zzz-zzz-ingress
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=ingress-nginx
              app.kubernetes.io/part-of=ingress-nginx
              app.kubernetes.io/version=1.9.1
              helm.sh/chart=ingress-nginx-4.8.1
Annotations:  argocd.argoproj.io/tracking-id: zzz-zzz-ingress:networking.k8s.io/IngressClass:ingresscontroller/nginx
              ingressclass.kubernetes.io/is-default-class: true
Controller:   k8s.io/ingress-nginx
Events:       <none>

kubectl -n ingresscontroller get all
NAME                                                                  READY   STATUS    RESTARTS   AGE     IP              NODE                                 NOMINATED NODE   READINESS GATES
pod/zzz-zzz-ingress-ingress-nginx-controller-7849497db8gmxmz   1/1     Running   0          5d20h   10.110.16.227   aks-systemweu1-22699295-vmss000001   <none>           <none>
pod/zzz-zzz-ingress-ingress-nginx-controller-7849497db8pjwsf   1/1     Running   0          5d20h   10.110.16.188   aks-systemweu1-22699295-vmss000000   <none>           <none>

NAME                                                                TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)                      AGE    SELECTOR
service/zzz-zzz-ingress-ingress-nginx-controller             LoadBalancer   10.0.18.29     zz.zz.zz.zz 80:31057/TCP,443:31848/TCP   386d   app.kubernetes.io/component=controller,app.kubernetes.io/instance=zzz-zzz-ingress,app.kubernetes.io/name=ingress-nginx
service/zzz-zzz-ingress-ingress-nginx-controller-admission   ClusterIP      10.0.49.26     <none>          443/TCP                      388d   app.kubernetes.io/component=controller,app.kubernetes.io/instance=zzz-zzz-ingress,app.kubernetes.io/name=ingress-nginx
service/zzz-zzz-ingress-ingress-nginx-controller-metrics     ClusterIP      10.0.155.249   <none>          10254/TCP                    388d   app.kubernetes.io/component=controller,app.kubernetes.io/instance=zzz-zzz-ingress,app.kubernetes.io/name=ingress-nginx

NAME                                                       READY   UP-TO-DATE   AVAILABLE   AGE    CONTAINERS   IMAGES                                                     SELECTOR
deployment.apps/zzz-zzz-ingress-ingress-nginx-controller   2/2     2            2           388d   controller   registry/registry.k8s.io/ingress-nginx/controller:v1.9.1   app.kubernetes.io/component=controller,app.kubernetes.io/instance=zzz-zzz-ingress,app.kubernetes.io/name=ingress-nginx

NAME                                                                   DESIRED   CURRENT   READY   AGE     CONTAINERS   IMAGES                                                     SELECTOR
replicaset.apps/zzz-zzz-ingress-ingress-nginx-controller-7849497db8   2         2         2       5d20h   controller   registry/registry.k8s.io/ingress-nginx/controller:v1.9.1   app.kubernetes.io/component=controller,app.kubernetes.io/instance=zzz-zzz-ingress,app.kubernetes.io/name=ingress-nginx,pod-template-hash=7849497db8

kubectl -n ingresscontroller describe po zzz-zzz-ingress-ingress-nginx-controller-7849497db8
Name:                 zzz-zzz-ingress-ingress-nginx-controller-7849497db8gmxmz
Namespace:            ingresscontroller
Priority:             2000001000
Priority Class Name:  system-node-critical
Service Account:      zzz-zzz-ingress-ingress-nginx
Node:                 aks-systemweu1-22699295-vmss000001/10.110.16.223
Start Time:           Wed, 04 Oct 2023 16:48:00 +0200
Labels:               app.kubernetes.io/component=controller
                      app.kubernetes.io/instance=zzz-zzz-ingress
                      app.kubernetes.io/managed-by=Helm
                      app.kubernetes.io/name=ingress-nginx
                      app.kubernetes.io/part-of=ingress-nginx
                      app.kubernetes.io/version=1.9.1
                      helm.sh/chart=ingress-nginx-4.8.1
                      pod-template-hash=7849497db8
Annotations:          kubectl.kubernetes.io/restartedAt: 2023-07-11T13:59:40Z
                      zzz/logging-module: nginx
Status:               Running
SeccompProfile:       RuntimeDefault
IP:                   10.110.16.227
IPs:
  IP:           10.110.16.227
Controlled By:  ReplicaSet/zzz-zzz-ingress-ingress-nginx-controller-7849497db8
Containers:
  controller:
    Container ID:  containerd://be51dd263b59a482b9ce53843d956b1d9ebb2d4a88678fa845cd426f19132c3c
    Image:         registry/registry.k8s.io/ingress-nginx/controller:v1.9.1
    Image ID:      registry/registry.k8s.io/ingress-nginx/controller@sha256:65c804ad254ac378d316919687b782850dd36c1f677f1115a1db29da59376f18
    Ports:         80/TCP, 443/TCP, 10254/TCP, 8443/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP
    Args:
      /nginx-ingress-controller
      --publish-service=$(POD_NAMESPACE)/zzz-zzz-ingress-ingress-nginx-controller
      --election-id=zzz-zzz-ingress-ingress-nginx-leader
      --controller-class=k8s.io/ingress-nginx
      --ingress-class=nginx
      --configmap=$(POD_NAMESPACE)/zzz-zzz-ingress-ingress-nginx-controller
      --validating-webhook=:8443
      --validating-webhook-certificate=/usr/local/certificates/cert
      --validating-webhook-key=/usr/local/certificates/key
      --ingress-class-by-name=true
      --default-ssl-certificate=certmanager/default-public-wildcard-tls-secret
      --enable-ssl-passthrough=false
    State:          Running
      Started:      Wed, 04 Oct 2023 16:48:01 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      memory:  350Mi
    Requests:
      cpu:      100m
      memory:   175Mi
    Liveness:   http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
    Readiness:  http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       zzz-zzz-ingress-ingress-nginx-controller-7849497db8gmxmz (v1:metadata.name)
      POD_NAMESPACE:  ingresscontroller (v1:metadata.namespace)
      LD_PRELOAD:     /usr/local/lib/libmimalloc.so
    Mounts:
      /usr/local/certificates/ from webhook-cert (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-stjv4 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  webhook-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  zzz-zzz-ingress-ingress-nginx-admission
    Optional:    false
  kube-api-access-stjv4:
    Type:                     Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:   3607
    ConfigMapName:            kube-root-ca.crt
    ConfigMapOptional:        <nil>
    DownwardAPI:              true
QoS Class:                    Burstable
Node-Selectors:               zzz/workload=zzz
                              kubernetes.io/os=linux
Tolerations:                  CriticalAddonsOnly op=Exists
                              node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                              node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                              node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Topology Spread Constraints:  kubernetes.io/hostname:ScheduleAnyway when max skew 1 is exceeded for selector app.kubernetes.io/component=controller,app.kubernetes.io/instance=zzz-zzz-ingress,app.kubernetes.io/name=ingress-nginx
                              topology.kubernetes.io/zone:DoNotSchedule when max skew 1 is exceeded for selector app.kubernetes.io/component=controller,app.kubernetes.io/instance=zzz-zzz-ingress,app.kubernetes.io/name=ingress-nginx
Events:                       <none>

Name:                 zzz-zzz-ingress-ingress-nginx-controller-7849497db8pjwsf
Namespace:            ingresscontroller
Priority:             2000001000
Priority Class Name:  system-node-critical
Service Account:      zzz-zzz-ingress-ingress-nginx
Node:                 aks-systemweu1-22699295-vmss000000/10.110.16.114
Start Time:           Wed, 04 Oct 2023 16:48:11 +0200
Labels:               app.kubernetes.io/component=controller
                      app.kubernetes.io/instance=zzz-zzz-ingress
                      app.kubernetes.io/managed-by=Helm
                      app.kubernetes.io/name=ingress-nginx
                      app.kubernetes.io/part-of=ingress-nginx
                      app.kubernetes.io/version=1.9.1
                      helm.sh/chart=ingress-nginx-4.8.1
                      pod-template-hash=7849497db8
Annotations:          kubectl.kubernetes.io/restartedAt: 2023-07-11T13:59:40Z
                      zzz/logging-module: nginx
Status:               Running
SeccompProfile:       RuntimeDefault
IP:                   10.110.16.188
IPs:
  IP:           10.110.16.188
Controlled By:  ReplicaSet/zzz-zzz-ingress-ingress-nginx-controller-7849497db8
Containers:
  controller:
    Container ID:  containerd://f931f28240cabc99c4cd417125d6330af42d0d49d96188ccf5048181648bf404
    Image:         registry/registry.k8s.io/ingress-nginx/controller:v1.9.1
    Image ID:      registry/registry.k8s.io/ingress-nginx/controller@sha256:65c804ad254ac378d316919687b782850dd36c1f677f1115a1db29da59376f18
    Ports:         80/TCP, 443/TCP, 10254/TCP, 8443/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP
    Args:
      /nginx-ingress-controller
      --publish-service=$(POD_NAMESPACE)/zzz-zzz-ingress-ingress-nginx-controller
      --election-id=zzz-zzz-ingress-ingress-nginx-leader
      --controller-class=k8s.io/ingress-nginx
      --ingress-class=nginx
      --configmap=$(POD_NAMESPACE)/zzz-zzz-ingress-ingress-nginx-controller
      --validating-webhook=:8443
      --validating-webhook-certificate=/usr/local/certificates/cert
      --validating-webhook-key=/usr/local/certificates/key
      --ingress-class-by-name=true
      --default-ssl-certificate=certmanager/default-public-wildcard-tls-secret
      --enable-ssl-passthrough=false
    State:          Running
      Started:      Wed, 04 Oct 2023 16:48:12 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      memory:  350Mi
    Requests:
      cpu:      100m
      memory:   175Mi
    Liveness:   http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
    Readiness:  http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       zzz-zzz-ingress-ingress-nginx-controller-7849497db8pjwsf (v1:metadata.name)
      POD_NAMESPACE:  ingresscontroller (v1:metadata.namespace)
      LD_PRELOAD:     /usr/local/lib/libmimalloc.so
    Mounts:
      /usr/local/certificates/ from webhook-cert (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-qlpls (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  webhook-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  zzz-zzz-ingress-ingress-nginx-admission
    Optional:    false
  kube-api-access-qlpls:
    Type:                     Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:   3607
    ConfigMapName:            kube-root-ca.crt
    ConfigMapOptional:        <nil>
    DownwardAPI:              true
QoS Class:                    Burstable
Node-Selectors:               zzz/workload=zzz
                              kubernetes.io/os=linux
Tolerations:                  CriticalAddonsOnly op=Exists
                              node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                              node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                              node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Topology Spread Constraints:  kubernetes.io/hostname:ScheduleAnyway when max skew 1 is exceeded for selector app.kubernetes.io/component=controller,app.kubernetes.io/instance=zzz-zzz-ingress,app.kubernetes.io/name=ingress-nginx
                              topology.kubernetes.io/zone:DoNotSchedule when max skew 1 is exceeded for selector app.kubernetes.io/component=controller,app.kubernetes.io/instance=zzz-zzz-ingress,app.kubernetes.io/name=ingress-nginx
Events:                       <none>

kubectl -n ingresscontroller describe svc zzz-zzz-ingress-ingress-nginx-controller
Name:                     zzz-zzz-ingress-ingress-nginx-controller
Namespace:                ingresscontroller
Labels:                   app.kubernetes.io/component=controller
                          app.kubernetes.io/instance=zzz-zzz-ingress
                          app.kubernetes.io/managed-by=Helm
                          app.kubernetes.io/name=ingress-nginx
                          app.kubernetes.io/part-of=ingress-nginx
                          app.kubernetes.io/version=1.9.1
                          helm.sh/chart=ingress-nginx-4.8.1
Annotations:              argocd.argoproj.io/tracking-id: zzz-zzz-ingress:/Service:ingresscontroller/zzz-zzz-ingress-ingress-nginx-controller
                          service.beta.kubernetes.io/azure-load-balancer-ipv4: zz.zz.zz.zz
                          service.beta.kubernetes.io/azure-load-balancer-resource-group: zzz
Selector:                 app.kubernetes.io/component=controller,app.kubernetes.io/instance=zzz-zzz-ingress,app.kubernetes.io/name=ingress-nginx
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.0.18.29
IPs:                      10.0.18.29
LoadBalancer Ingress:     zz.zz.zz.zz
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  31057/TCP
Endpoints:                10.110.16.188:80,10.110.16.227:80
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  31848/TCP
Endpoints:                10.110.16.188:443,10.110.16.227:443
Session Affinity:         None
External Traffic Policy:  Local
HealthCheck NodePort:     31796
Events:                   <none>

How to reproduce this issue:

Install the ingress-nginx helm chart:

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ingress-nginx ingress-nginx/ingress-nginx --set controller.replicaCount=2 --version 4.8.0

Create an ingress and wait until it has a loadBalancerIP:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: test-ingress
spec:
  defaultBackend:
    service:
      name: test
      port:
        number: 80
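Save the manifest as, say, test-ingress.yaml and apply it with kubectl apply -f test-ingress.yaml; the ADDRESS column of kubectl get ingress is populated once the leader has written the status.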

Upgrade the helm chart:

helm upgrade ingress-nginx ingress-nginx/ingress-nginx --set controller.replicaCount=2 --version 4.8.1

Check whether the ingress resource still has a loadBalancerIP during the upgrade:

kubectl get ingress
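Because the window is only a few seconds, it is easiest to keep a watch running during the upgrade, e.g. kubectl get ingress --watch, and look for an update in which the ADDRESS column goes empty.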

We would expect:

NAME           CLASS   HOSTS   ADDRESS       PORTS     AGE
test-ingress   nginx   *       zz.zz.zz.zz   80, 443   90d

We get:

NAME           CLASS   HOSTS   ADDRESS   PORTS     AGE
test-ingress   nginx   *                 80, 443   90d

Since the issue does not trigger on every upgrade, these steps may need to be repeated, possibly in reverse order; it can also be reproduced by downgrading the version.

Anything else we need to know:

The logs of the old leader pod show the following line:

I1004 12:29:58.913037       7 status.go:135] "removing value from ingress status" address=[{"ip":"xx.xx.xx.xx"}]

During shutdown this check should evaluate to true, but it does not: https://github.com/kubernetes/ingress-nginx/blob/8ce61bdc6761f04d0ce617b9125255e9a147a20c/internal/ingress/status/status.go#L130

In that function the controller reads the labels of its own pod and lists all pods carrying exactly these labels:

app.kubernetes.io/component: controller
app.kubernetes.io/instance: xxx-ingress
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.0
helm.sh/chart: ingress-nginx-4.8.0

But when the version is changed, the new pods carry different labels, so the check fails. It should only match on labels that do not change during version updates; the selector labels would probably work here, since they are also used by the controller Service to route traffic to the controller pods. The pod labels in the Helm chart were changed in https://github.com/kubernetes/ingress-nginx/pull/9732. Because the bug does not trigger on every upgrade, and only affects tooling that consumes the loadBalancerIP from the ingress resources, not everyone will notice it.
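To make the failure concrete, here is a minimal client-go sketch approximating the check (it is not the verbatim ingress-nginx code; names are ours). Building the selector from the pod's full label set pins helm.sh/chart and app.kubernetes.io/version, so during a rollout the departing leader matches only itself; restricting the selector to the stable labels would also match the new pods:

package status

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/client-go/kubernetes"
)

// isRunningMultiplePods approximates the linked check: list pods that carry
// the same labels as this pod and report whether more than one exists.
func isRunningMultiplePods(ctx context.Context, client kubernetes.Interface, pod *corev1.Pod) (bool, error) {
	// Bug: pod.Labels contains helm.sh/chart and app.kubernetes.io/version,
	// which change on every chart upgrade, so pods from the new ReplicaSet
	// never match and the old leader believes it is the last pod.
	sel := labels.SelectorFromSet(labels.Set(pod.Labels)).String()

	// Suggested direction: select only on labels that stay constant across
	// upgrades, i.e. the same subset the controller Service uses:
	//   sel = labels.SelectorFromSet(labels.Set{
	//       "app.kubernetes.io/component": pod.Labels["app.kubernetes.io/component"],
	//       "app.kubernetes.io/instance":  pod.Labels["app.kubernetes.io/instance"],
	//       "app.kubernetes.io/name":      pod.Labels["app.kubernetes.io/name"],
	//   }).String()

	pods, err := client.CoreV1().Pods(pod.Namespace).List(ctx, metav1.ListOptions{LabelSelector: sel})
	if err != nil {
		return false, err
	}
	return len(pods.Items) > 1, nil
}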

About this issue

  • State: open
  • Created 9 months ago
  • Comments: 20 (7 by maintainers)

Most upvoted comments

Ehm, random shower thought: why not use the endpoints of the service configured to publish to the Ingress resources?

Like: The load balancer address published to the Ingress resources reconciled by an Ingress NGINX Controller is configured via the --publish-service flag. So in the same way we could just check the endpoints of this service. If the leader is the last one remaining, it’s fine to remove this load balancer address from the reconciled Ingress resources.

No labels, no exceptions involved, just plain Kubernetes concepts. Or am I missing something?
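Sketching that idea with client-go (the function name and wiring are hypothetical, not an existing ingress-nginx API): look up the Endpoints object behind the --publish-service Service and only clear the Ingress status when the departing leader is the sole remaining ready address:

package status

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// publishServiceHasOtherEndpoints reports whether the Service named by
// --publish-service still has more than one ready endpoint address. If it
// does, another controller pod is still serving traffic and the shutting-down
// leader must not remove the load balancer address from the Ingress statuses.
func publishServiceHasOtherEndpoints(ctx context.Context, client kubernetes.Interface, namespace, service string) (bool, error) {
	ep, err := client.CoreV1().Endpoints(namespace).Get(ctx, service, metav1.GetOptions{})
	if err != nil {
		return false, err
	}
	ready := 0
	for _, subset := range ep.Subsets {
		ready += len(subset.Addresses) // NotReadyAddresses are deliberately excluded
	}
	return ready > 1, nil
}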

@longwuyuan @strongjz would it be acceptable in the immediate term to modify isRunningMultiplePods to use the following labels:

app.kubernetes.io/component: controller
app.kubernetes.io/instance: xxxx-ingress
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx

And in the long term to implement the change suggested by @Gacko?

The current implementation interrupts externaldns integration regularly and I’m surprised more people haven’t noticed this.

If it’s acceptable we can implement this change.

Note that this causes downtime when used with external-dns, because external-dns removes the DNS record as soon as the old leader clears the status.