ingress-nginx: Ingress not working in AKS version 1.24.6 due to health probe change and conflicting server snippet configuration in config map
Description
I recently updated my AKS cluster to version 1.24.6 and noticed that my ingress has stopped working. Upon further investigation, I discovered that the health probe has changed from TCP to HTTP/HTTPS. I added the healthz annotation to my ingress configuration, but it didn't resolve the issue. I also tried adding a location /healthz to allow all traffic, but this also had no effect.
After some more digging, I realized that the problem was the server snippet in the ingress-nginx ConfigMap, which only allows traffic from Azure Front Door. Removing the server snippet resolved the issue, but I cannot leave it removed, as that creates a security vulnerability.
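One possible way to keep the Front Door restriction while still answering the load balancer's probe is to exempt the probe path inside the same server snippet. This is only a sketch, assuming the probe path is /healthz and that the ConfigMap server-snippet is also applied to the server block that answers the probe; the ####### placeholder stands for the Front Door ID as in the values below:
controller:
  config:
    server-snippet: |
      set $err 0;
      if ($http_x_azure_fdid !~* "#######") {
        set $err 1;
      }
      # assumption: the Azure LB probe hits /healthz without the X-Azure-FDID header,
      # so let that one path through instead of returning 403
      if ($uri = "/healthz") {
        set $err 0;
      }
      if ($err = 1) {
        return 403;
      }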
Expected Behavior
Ingress should be functioning properly after updating the AKS cluster and adding the healthz annotation.
Actual Behavior
Ingress is still not working and traffic is not being allowed through due to a conflicting server snippet configuration.
Kubernetes version
1.24.6
Environment
- Cloud provider or hardware configuration: Azure
- OS: Ubuntu 18.04.6 LTS
- Kernel: 5.4.0-1098-azure
Basic cluster related info:
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
aks-nodepool-####-vmss### Ready agent 23d v1.24.6 ### <none> Ubuntu 18.04.6 LTS 5.4.0-1098-azure containerd://1.6.4+azure-4
How was the ingress-nginx-controller installed
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
ingress-nginx ingress-nginx 1 2023-02-08 10:01:42.549208714 +0000 UTC deployed ingress-nginx-4.4.0 1.5.1
Helm values
helm -n ingress-nginx get values ingress-nginx
USER-SUPPLIED VALUES:
controller:
  admissionWebhooks:
    enabled: false
    patch:
      enabled: false
  config:
    custom-http-errors: 500,503
    server-snippet: |
      set $err 0;
      set $var "";
      if ($http_x_azure_fdid !~* "#######" ) {
        set $err 1;
      }
      if ($var ~ "#######") {
        set $err 0;
      }
      if ($host ~* "########") {
        set $err 0;
      }
      if ($err = 1){
        return 403;
      }
  containerSecurityContext:
    allowPrivilegeEscalation: false
  extraArgs:
    default-ssl-certificate: ingress-nginx/default-cert
  nodeSelector:
    agentpool: nodepool
  replicaCount: 2
  service:
    external:
      enabled: false
    internal:
      annotations:
        service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz
        service.beta.kubernetes.io/azure-load-balancer-internal: "true"
        service.beta.kubernetes.io/azure-load-balancer-internal-subnet: ########
      enabled: true
defaultBackend:
  enabled: true
  image:
    image: #######
    registry: ########
    tag: 1.0.2
  nodeSelector:
    agentpool: nodepool
imagePullSecrets:
- name: ##########
Current State of the controller
kubectl describe ingressclasses
Name: nginx
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.5.1
helm.sh/chart=ingress-nginx-4.4.0
Annotations: meta.helm.sh/release-name: ingress-nginx
meta.helm.sh/release-namespace: ingress-nginx
Controller: k8s.io/ingress-nginx
Events: <none>
kubectl -n ingress-nginx describe po ingress-nginx-controller-655b8c4c8c-4cbv8
Name: ingress-nginx-controller-655b8c4c8c-4cbv8
Namespace: ingress-nginx
Priority: 0
Service Account: ingress-nginx
Node: aks-nodepool-21640858-vmss000027/192.168.2.62
Start Time: Fri, 10 Feb 2023 12:03:34 +0530
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/name=ingress-nginx
linkerd.io/control-plane-ns=linkerd
linkerd.io/proxy-deployment=ingress-nginx-controller
linkerd.io/workload-ns=ingress-nginx
pod-template-hash=655b8c4c8c
Annotations: linkerd.io/created-by: linkerd/proxy-injector stable-2.11.1
linkerd.io/identity-mode: default
linkerd.io/proxy-version:
Status: Running
IP: 192.168.2.73
IPs:
IP: 192.168.2.73
Controlled By: ReplicaSet/ingress-nginx-controller-655b8c4c8c
Init Containers:
linkerd-init:
Container ID: containerd://5d4018c056b715df5a224bd224c613a6af55c0093b3ec8f65cbe0d4709fc4807
Image: cr.l5d.io/linkerd/proxy-init:v1.4.0
Image ID: cr.l5d.io/linkerd/proxy-init@sha256:60d12fbb0b4a53962a5c2a59b496b3ee20052d26c0c56fd2ee38fd7fae62146e
Port: <none>
Host Port: <none>
Args:
--incoming-proxy-port
4143
--outgoing-proxy-port
4140
--proxy-uid
2102
--inbound-ports-to-ignore
4190,4191,4567,4568
--outbound-ports-to-ignore
4567,4568
State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 10 Feb 2023 12:03:35 +0530
Finished: Fri, 10 Feb 2023 12:03:35 +0530
Ready: True
Restart Count: 0
Limits:
cpu: 100m
memory: 50Mi
Requests:
cpu: 10m
memory: 10Mi
Environment: <none>
Mounts:
/run from linkerd-proxy-init-xtables-lock (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rh8x9 (ro)
Containers:
linkerd-proxy:
Container ID: containerd://0223d738f7151fc4ea922940103fc6b61f587839ea7d2d7721bf3f2bff209ad6
Image: cr.l5d.io/linkerd/proxy:stable-2.11.1
Image ID: cr.l5d.io/linkerd/proxy@sha256:91b53d4b39e4c058e5fc63b72dd7ab6fe7f7051869ec5251dc9c0d8287b2771f
Ports: 4143/TCP, 4191/TCP
Host Ports: 0/TCP, 0/TCP
State: Running
Started: Fri, 10 Feb 2023 12:03:37 +0530
Ready: True
Restart Count: 0
Liveness: http-get http://:4191/live delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:4191/ready delay=2s timeout=1s period=10s #success=1 #failure=3
Environment:
_pod_name: ingress-nginx-controller-655b8c4c8c-4cbv8 (v1:metadata.name)
_pod_ns: ingress-nginx (v1:metadata.namespace)
_pod_nodeName: (v1:spec.nodeName)
LINKERD2_PROXY_LOG: warn,linkerd=info
LINKERD2_PROXY_LOG_FORMAT: plain
LINKERD2_PROXY_DESTINATION_SVC_ADDR: linkerd-dst-headless.linkerd.svc.cluster.local.:8086
LINKERD2_PROXY_DESTINATION_PROFILE_NETWORKS: 10.0.0.0/8,100.64.0.0/10,172.16.0.0/12,192.168.0.0/16
LINKERD2_PROXY_POLICY_SVC_ADDR: linkerd-policy.linkerd.svc.cluster.local.:8090
LINKERD2_PROXY_POLICY_WORKLOAD: $(_pod_ns):$(_pod_name)
LINKERD2_PROXY_INBOUND_DEFAULT_POLICY: all-unauthenticated
LINKERD2_PROXY_POLICY_CLUSTER_NETWORKS: 10.0.0.0/8,100.64.0.0/10,172.16.0.0/12,192.168.0.0/16
LINKERD2_PROXY_INBOUND_CONNECT_TIMEOUT: 100ms
LINKERD2_PROXY_OUTBOUND_CONNECT_TIMEOUT: 1000ms
LINKERD2_PROXY_CONTROL_LISTEN_ADDR: 0.0.0.0:4190
LINKERD2_PROXY_ADMIN_LISTEN_ADDR: 0.0.0.0:4191
LINKERD2_PROXY_OUTBOUND_LISTEN_ADDR: 127.0.0.1:4140
LINKERD2_PROXY_INBOUND_LISTEN_ADDR: 0.0.0.0:4143
LINKERD2_PROXY_INBOUND_IPS: (v1:status.podIPs)
LINKERD2_PROXY_INBOUND_PORTS: 80,443
LINKERD2_PROXY_DESTINATION_PROFILE_SUFFIXES: svc.cluster.local.
LINKERD2_PROXY_INBOUND_ACCEPT_KEEPALIVE: 10000ms
LINKERD2_PROXY_OUTBOUND_CONNECT_KEEPALIVE: 10000ms
LINKERD2_PROXY_INBOUND_PORTS_DISABLE_PROTOCOL_DETECTION: 25,587,3306,4444,5432,6379,9300,11211
LINKERD2_PROXY_DESTINATION_CONTEXT: {"ns":"$(_pod_ns)", "nodeName":"$(_pod_nodeName)"}
_pod_sa: (v1:spec.serviceAccountName)
_l5d_ns: linkerd
_l5d_trustdomain: cluster.local
LINKERD2_PROXY_IDENTITY_DIR: /var/run/linkerd/identity/end-entity
LINKERD2_PROXY_IDENTITY_TRUST_ANCHORS: -----BEGIN CERTIFICATE-----
MIIBqTCCAU+gAwIBAgIQbEXWvQo/UYg26tTsXjtzWDAKBggqhkjOPQQDAjAlMSMw
IQYDVQQDExpyb290LmxpbmtlcmQuY2x1c3Rlci5sb2NhbDAeFw0yMjEwMjAxMzQ2
MDBaFw0yNDEwMTkxMzQ2MDBaMCUxIzAhBgNVBAMTGnJvb3QubGlua2VyZC5jbHVz
dGVyLmxvY2FsMFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE1vIFdRBEMjySROyb
JtW63F3G4i6uEePITfbXZD52dumVn+Em5blpw8R28iK1XGJzTwOEgFOfoIZQPcD6
ry108KNhMF8wDgYDVR0PAQH/BAQDAgEGMB0GA1UdJQQWMBQGCCsGAQUFBwMBBggr
BgEFBQcDAjAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBRc+hwJnxDbIR8RdJi7
7SpJ5bEPhTAKBggqhkjOPQQDAgNIADBFAiEA7asOWitb2KH+UxUIcYzIAfFz8pIv
tX6FGY29MF1f/poCIGes3svghf2eLDZ3UuTt1q0a4D4QCRxX50EVmXWpiH4y
-----END CERTIFICATE-----
LINKERD2_PROXY_IDENTITY_TOKEN_FILE: /var/run/secrets/kubernetes.io/serviceaccount/token
LINKERD2_PROXY_IDENTITY_SVC_ADDR: linkerd-identity-headless.linkerd.svc.cluster.local.:8080
LINKERD2_PROXY_IDENTITY_LOCAL_NAME: $(_pod_sa).$(_pod_ns).serviceaccount.identity.linkerd.cluster.local
LINKERD2_PROXY_IDENTITY_SVC_NAME: linkerd-identity.linkerd.serviceaccount.identity.linkerd.cluster.local
LINKERD2_PROXY_DESTINATION_SVC_NAME: linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local
LINKERD2_PROXY_POLICY_SVC_NAME: linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local
Mounts:
/var/run/linkerd/identity/end-entity from linkerd-identity-end-entity (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rh8x9 (ro)
controller:
Container ID: containerd://990651569f52eb4564621f528a12a61d8ee1f98793d619961d10f2a020a72534
Image: registry.k8s.io/ingress-nginx/controller:v1.5.1@sha256:4ba73c697770664c1e00e9f968de14e08f606ff961c76e5d7033a4a9c593c629
Image ID: registry.k8s.io/ingress-nginx/controller@sha256:4ba73c697770664c1e00e9f968de14e08f606ff961c76e5d7033a4a9c593c629
Ports: 80/TCP, 443/TCP
Host Ports: 0/TCP, 0/TCP
Args:
/nginx-ingress-controller
--default-backend-service=$(POD_NAMESPACE)/ingress-nginx-defaultbackend
--publish-service=$(POD_NAMESPACE)/ingress-nginx-controller-internal
--election-id=ingress-nginx-leader
--controller-class=k8s.io/ingress-nginx
--ingress-class=nginx
--configmap=$(POD_NAMESPACE)/ingress-nginx-controller
--default-ssl-certificate=ingress-nginx/default-cert
State: Running
Started: Fri, 10 Feb 2023 12:03:37 +0530
Ready: True
Restart Count: 0
Requests:
cpu: 100m
memory: 90Mi
Liveness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
Readiness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
Environment:
POD_NAME: ingress-nginx-controller-655b8c4c8c-4cbv8 (v1:metadata.name)
POD_NAMESPACE: ingress-nginx (v1:metadata.namespace)
LD_PRELOAD: /usr/local/lib/libmimalloc.so
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rh8x9 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-rh8x9:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
linkerd-proxy-init-xtables-lock:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
linkerd-identity-end-entity:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium: Memory
SizeLimit: <unset>
QoS Class: Burstable
Node-Selectors: agentpool=nodepool
kubernetes.io/os=linux
Tolerations: node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 10m (x3 over 56m) kubelet Readiness probe failed: Get "http://192.168.2.73:10254/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 2m51s (x6 over 78m) kubelet Liveness probe failed: Get "http://192.168.2.73:10254/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
kubectl -n ingress-nginx describe svc ingress-nginx-controller-internal
Name: ingress-nginx-controller-internal
Namespace: ingress-nginx
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.5.1
helm.sh/chart=ingress-nginx-4.4.0
Annotations: meta.helm.sh/release-name: ingress-nginx
meta.helm.sh/release-namespace: ingress-nginx
service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz
service.beta.kubernetes.io/azure-load-balancer-internal: true
service.beta.kubernetes.io/azure-load-balancer-internal-subnet: IngressControllerSubnet
Selector: app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.0.69.40
IPs: 10.0.69.40
LoadBalancer Ingress: 192.168.3.68
Port: http 80/TCP
TargetPort: http/TCP
NodePort: http 31179/TCP
Endpoints: 192.168.2.12:80,192.168.2.73:80
Port: https 443/TCP
TargetPort: https/TCP
NodePort: https 32741/TCP
Endpoints: 192.168.2.12:443,192.168.2.73:443
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal DeletingLoadBalancer 2m47s (x13795 over 4d8h) service-controller Deleting load balancer
Current state of ingress object:
kubectl -n jenkins get all,ing -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/jenkins-0 2/2 Running 0 4d 192.168.2.15 aks-nodepool-21640858-vmss000024 <none> <none>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/jenkins ClusterIP 10.0.216.93 <none> 8080/TCP 115d app.kubernetes.io/component=jenkins-controller,app.kubernetes.io/instance=jenkins
service/jenkins-agent ClusterIP 10.0.92.33 <none> 50000/TCP 115d app.kubernetes.io/component=jenkins-controller,app.kubernetes.io/instance=jenkins
service/jenkins-jenkins-azure-front-door ExternalName <none> ####### <none> 115d <none>
NAME READY AGE CONTAINERS IMAGES
statefulset.apps/jenkins 1/1 115d jenkins,config-reload jenkins/jenkins:2.375.1-jdk11,kiwigrid/k8s-sidecar:1.15.0
NAME CLASS HOSTS ADDRESS PORTS AGE
ingress.networking.k8s.io/jenkins <none> ###### 192.168.3.68 80, 443 115d
kubectl -n jenkins describe ing jenkins
Name: jenkins
Labels: app.kubernetes.io/component=jenkins-controller
app.kubernetes.io/instance=jenkins
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=jenkins
helm.sh/chart=jenkins-4.2.20
Namespace: jenkins
Address: 192.168.3.68
Ingress Class: <none>
Default backend: <default>
TLS:
jenkins-tls terminates jenkins.dev.##### ,jenkins.######
Rules:
Host Path Backends
---- ---- --------
jenkins.dev.shs.saas.temenos.cloud
jenkins:8080 (192.168.2.15:8080)
Annotations: cert-manager.io/cluster-issuer: letsencrypt-prod
external-dns.alpha.kubernetes.io/hostname: #################
external-dns.alpha.kubernetes.io/ingress-hostname-source: annotation-only
external-dns.alpha.kubernetes.io/target: 20.31.21.59
kubernetes.io/ingress.allow-http: false
kubernetes.io/ingress.class: nginx
meta.helm.sh/release-name: jenkins
meta.helm.sh/release-namespace: jenkins
nginx.ingress.kubernetes.io/force-ssl-redirect: true
nginx.ingress.kubernetes.io/modsecurity-snippet: SecRuleEngine On
nginx.ingress.kubernetes.io/proxy-body-size: 50m
Events: <none>
About this issue
- State: closed
- Created a year ago
- Reactions: 1
- Comments: 30 (10 by maintainers)
Facing the same issue when I upgraded my cluster to 1.24.10. Simply uninstalling and reinstalling nginx-ingress with the health check probe line didn't work for me. After checking the full story (https://github.com/Azure/AKS/issues/2907#issuecomment-1115721052), I set externalTrafficPolicy=Local in my cluster, and that finally fixed the issue.
For those who have an environment similar to mine, try reinstalling with externalTrafficPolicy=Local. This applies to the default namespace; if you're using a different namespace, add -n YOUR_NAMESPACE.
Although healthStatus defaults to true, I still set it explicitly to be sure of my settings. The key points are setting externalTrafficPolicy=Local and keeping the health probe path annotation in place.
After these changes, the Azure Kubernetes load balancer has only one health probe (originally there were two, when my externalTrafficPolicy was Cluster). The port is chosen by the Azure load balancer, so there is no need to set it manually.
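For anyone who wants to see where these settings live, here is a minimal sketch of the corresponding Helm values, assuming the ingress-nginx 4.x chart layout and the default (external) controller service:
controller:
  service:
    externalTrafficPolicy: Local
    annotations:
      service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz
With externalTrafficPolicy: Local, the cloud load balancer probes kube-proxy's healthCheckNodePort rather than the service data ports, which is why only one health probe remains.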
I also added the healthz annotation to my ingress configuration, and the ingress still didn't work properly. But ingress-nginx worked once I fixed my ingress.yaml: I removed spec.defaultBackend and added a rule with no host instead (reference: https://kubernetes.io/docs/concepts/services-networking/ingress/#name-based-virtual-hosting). This was probably because the health probe was not being routed to the web server; I added the route instead of spec.defaultBackend and now it works.
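A sketch of the shape of that change (the names, host, and port here are placeholders, not the actual manifest): drop spec.defaultBackend and add a rule with no host, so requests that don't match any named host, the health probe included, still reach a backend:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress        # placeholder name
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - host: app.example.com      # placeholder host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web          # placeholder service
            port:
              number: 8080
  - http:                      # no host: matches requests that fit no other rule
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web
            port:
              number: 8080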
Facing the same issue in our environment.
@strongjz: Closing this issue.
In response to this:
If someone wants to write a static manifest like we have for AWS and others, we can accept that to help get past the issues with AKS > 1.23.
Or they can add Helm notes and notices pointing to this issue for folks.
Also, an update to the docs.
For now, it seems resolved by the @markmcgookin fix with the proper annotations.
/close
Full story: https://github.com/Azure/AKS/issues/2907#issuecomment-1115721052
The Azure Load Balancer health check needs to change to HTTP with the request path /healthz.
This can be accomplished by setting the annotation mentioned in the comment above.
I got this fixed in the end… partly ignorance on my part, as I had read about this setting but was unable to figure out where/how to set it.
So I uninstalled nginx-ingress with Helm and re-installed it with the health check probe line, and it's working fine again.
I am also using Application Gateway, and I was able to point its health probe at my ingress controller IP with the path /healthz, which works great.
We just upgraded one of our 4 clusters to 1.24.9 and we are seeing this too now.
Leaving a breadcrumb for this issue that I found on the Microsoft website… the ingress popped up instantly once this was applied.
You need to add this to your Service YAML, under metadata.annotations:
"service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz"
Here is the link to the Microsoft Q&A item: Microsoft Q&A
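To make the placement concrete, a sketch of a hand-written controller Service with that annotation (names are illustrative; with the Helm chart the same annotation is set through the controller.service annotations values, as in the values shown earlier in this issue):
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller          # illustrative name
  namespace: ingress-nginx
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/component: controller
  ports:
  - name: http
    port: 80
    targetPort: http
  - name: https
    port: 443
    targetPort: https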
I have the same issue in an AKS cluster, version 1.23.8, Helm chart 4.0.18, nginx version 1.19.10, with multiple ingress controllers, but the issue is present only in the internal ingress.