ingress-nginx: Chart Option `bind-address` is not honored at startup and causes nginx to fail
NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.): v1.0.4
Kubernetes version (use kubectl version): v1.21.5+k3s2
Environment:
- Cloud provider or hardware configuration: bare metal, AMD Ryzen Embedded V1605B
- OS (e.g. from /etc/os-release): Debian GNU/Linux 10 (buster)
- Kernel (e.g. uname -a): 4.19.194-3
- Install tools: Rancher k3s automated installation
Please mention how/where the cluster was created, e.g. kubeadm/kops/minikube/kind.
- Basic cluster related info:
kubectl version: v1.21.5+k3s2
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
jupiter Ready control-plane,master 480d v1.21.5+k3s2 192.168.2.203 <none> Debian GNU/Linux 10 (buster) 4.19.0-17-amd64 containerd://1.4.11-k3s1
- How was the ingress-nginx-controller installed: Helm Chart
- If helm was used then please show output of
helm ls -A
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
applications dev-services 9 2021-10-29 10:43:43.89215712 +1100 AEDT deployed gitlab-runner-0.34.0 14.4.0
authelia menfin-services 4 2021-10-28 17:42:59.726584947 +1100 AEDT deployed authelia-0.6.3 4.31.0
containers dev-services 13 2021-10-29 10:43:41.59080221 +1100 AEDT deployed gitlab-runner-0.34.0 14.4.0
default dev-services 14 2021-10-29 10:43:39.315032902 +1100 AEDT deployed gitlab-runner-0.34.0 14.4.0
harbor-menfin-system menfin-system 9 2021-10-14 17:26:31.857105883 +1100 AEDT deployed harbor-1.7.3 2.3.3
influxdb menfin-services 5 2020-07-14 14:09:15.570129099 +1000 AEST deployed influxdb-4.7.1 1.8.0
ingress-nginx-menfin-services menfin-services 2 2021-10-30 14:23:11.670595939 +1100 AEDT deployed ingress-nginx-4.0.6 1.0.4
ingress-nginx-menfin-system menfin-system 1 2021-10-30 14:20:58.730357517 +1100 AEDT deployed ingress-nginx-4.0.6 1.0.4
minio menfin-system 17 2021-04-08 12:38:38.83367455 +1000 AEST deployed minio-8.0.10 master
openfaas openfaas 1 2021-06-19 15:15:41.565615584 +1000 AEST deployed openfaas-7.3.2
registry menfin-system 13 2021-01-17 05:08:37.146361067 +1100 AEDT deployed docker-registry-1.10.0 2.7.1
sleep-companion dev-services 9 2021-10-29 10:43:46.12323137 +1100 AEDT deployed gitlab-runner-0.34.0 14.4.0
- If helm was used then please show output of
helm get values --namespace menfin-system ingress-nginx-menfin-system (primary deployment - working fine)
USER-SUPPLIED VALUES:
controller:
  admissionWebhooks:
    port: "10843"
  config:
    bind-address: 192.168.2.180
    disable-ipv6: "true"
    use-forwarded-headers: "true"
  containerPort:
    http: "10080"
    https: "10443"
  dnsPolicy: ClusterFirstWithHostNet
  electionID: ingress-controller-leader-menfin-system
  extraArgs:
    default-server-port: "10081"
    healthz-port: "10254"
    profiler-port: "10245"
    status-port: "10246"
    stream-port: "10247"
  hostNetwork: true
  ingressClassResource:
    name: ingress-nginx-menfin-system
  kind: DaemonSet
  reportNodeInternalIp: true
  service:
    type: ClusterIP
- If helm was used then please show output of
helm get values --namespace menfin-services ingress-nginx-menfin-services (secondary deployment - not working)
USER-SUPPLIED VALUES:
controller:
  admissionWebhooks:
    port: "11843"
  config:
    bind-address: 192.168.2.181
    disable-ipv6: "true"
  containerPort:
    http: "11080"
    https: "11443"
  dnsPolicy: ClusterFirstWithHostNet
  electionID: ingress-controller-leader-menfin-services
  extraArgs:
    default-server-port: "11081"
    healthz-port: "11254"
    profiler-port: "11245"
    status-port: "11246"
    stream-port: "11247"
  hostNetwork: true
  ingressClassResource:
    name: ingress-nginx-menfin-services
  kind: DaemonSet
  reportNodeInternalIp: true
  service:
    type: ClusterIP
What happened:
Please read the "Anything else we need to know" section, as it contains valuable context and helpful information.
The second instance of the ingress-nginx controller fails to start with the following error message:
-------------------------------------------------------------------------------
NGINX Ingress controller
Release: v1.0.4
Build: 9b78b6c197b48116243922170875af4aa752ee59
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.19.9
-------------------------------------------------------------------------------
F1030 01:34:44.558408 8 main.go:67] port 80 is already in use. Please check the flag --http-port
goroutine 1 [running]:
k8s.io/klog/v2.stacks(0x1)
k8s.io/klog/v2@v2.10.0/klog.go:1026 +0x8a
k8s.io/klog/v2.(*loggingT).output(0x2877840, 0x3, {0x0, 0x0}, 0xc00020cd90, 0x1, {0x1f5eeea, 0x2878380}, 0xc00020e430, 0x0)
k8s.io/klog/v2@v2.10.0/klog.go:975 +0x63d
k8s.io/klog/v2.(*loggingT).printDepth(0x1, 0x1, {0x0, 0x0}, {0x0, 0x0}, 0x40c5fe, {0xc00020e430, 0x1, 0x1})
k8s.io/klog/v2@v2.10.0/klog.go:735 +0x1ba
k8s.io/klog/v2.(*loggingT).print(...)
k8s.io/klog/v2@v2.10.0/klog.go:717
k8s.io/klog/v2.Fatal(...)
k8s.io/klog/v2@v2.10.0/klog.go:1494
main.main()
k8s.io/ingress-nginx/cmd/nginx/main.go:67 +0x1d3
goroutine 4 [chan receive]:
k8s.io/klog/v2.(*loggingT).flushDaemon(0x0)
k8s.io/klog/v2@v2.10.0/klog.go:1169 +0x6a
created by k8s.io/klog/v2.init.0
k8s.io/klog/v2@v2.10.0/klog.go:420 +0xfb
What you expected to happen:
The second ingress controller starts normally.
How to reproduce it:
- on a Linux machine assigned 2 or more externally reachable IP addresses (e.g. 192.168.2.180 and 192.168.2.181)
- create/launch a k8s cluster
- install helm and add the ingress-nginx helm repository
- create 2 namespaces (e.g. menfin-system and menfin-services)
- deploy the first ingress-nginx in menfin-system using the first set of chart values. This should work fine
- deploy the second ingress-nginx in menfin-services using the second set of chart values. The pod will crash loop with the error message above.
Anything else we need to know:
I recently tried to upgrade from v3.35.0 of the ingress-nginx helm chart to v4.0.6 ahead of migrating to k8s 1.22. Until now, I was able to run multiple instances of ingress-nginx and have each of them listen on a dedicated IP address (192.168.2.180 and 192.168.2.181, and a few more).
Since upgrading to v4.0.6, the 2nd ingress controller fails to start with the error described above.
I looked into the code and I think this issue is related to #6990 and #7467 which were created because of #6988.
As @rikatz mentioned in #6990, the PR doesn’t look at bind-address when it calls `ln, err := _net.Listen("tcp", fmt.Sprintf(":%v", p))`. This effectively forces Listen() to attempt to bind on all available interfaces.
As a consequence, the call fails if the specified port is already listening on at least one of the interfaces, which causes the IsPortAvailable() check to fail.
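To illustrate the failure mode described above, here is a minimal standalone sketch, not the controller's actual code. `isPortAvailable` mirrors only the quoted `Listen(":port")` call, and `wildcardCheckWhileHeld` is a hypothetical helper that holds a port on one specific address before running the check. On Linux the check is expected to report the port as unavailable. Other operating systems may behave differently.

```go
package main

import (
	"fmt"
	"net"
)

// isPortAvailable mirrors the call quoted above: it listens on ":port",
// which means every local address, and reports whether that succeeded.
func isPortAvailable(port int) bool {
	ln, err := net.Listen("tcp", fmt.Sprintf(":%v", port))
	if err != nil {
		return false
	}
	ln.Close()
	return true
}

// wildcardCheckWhileHeld is a hypothetical helper for this sketch. It opens
// a listener on one specific address (the OS picks the port), then runs the
// wildcard check for that same port while the listener is still open.
func wildcardCheckWhileHeld(addr string) (bool, error) {
	held, err := net.Listen("tcp", net.JoinHostPort(addr, "0"))
	if err != nil {
		return false, err
	}
	defer held.Close()
	port := held.Addr().(*net.TCPAddr).Port
	return isPortAvailable(port), nil
}

func main() {
	ok, err := wildcardCheckWhileHeld("127.0.0.1")
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("wildcard check passed while 127.0.0.1 holds the port:", ok)
}
```

In this sketch the first listener plays the role of the primary controller bound to its own address, and the wildcard check plays the role of the second controller's startup test.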
I understand the conversation in #6990 and the challenge of a configMap vs CLI args but this is also a major regression / behavior change.
As a possible suggestion, would it be possible to (optionally) pass the IP address from bind-address when calling IsPortAvailable(), so that the check is narrower?
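A rough sketch of what such a narrower check could look like follows. The name `isPortAvailableOn`, its signature, and the `checkWhileHeld` helper are hypothetical and are not part of ingress-nginx. The sketch also assumes bind-address holds a single address. An empty address falls back to the current behaviour of listening on all interfaces.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// isPortAvailableOn is a hypothetical variant of the port check that only
// probes the given bind address. An empty bindAddress yields ":port",
// which keeps the existing all-interfaces behaviour.
func isPortAvailableOn(bindAddress string, port int) bool {
	ln, err := net.Listen("tcp", net.JoinHostPort(bindAddress, strconv.Itoa(port)))
	if err != nil {
		return false
	}
	ln.Close()
	return true
}

// checkWhileHeld is a hypothetical helper for this sketch. It holds an
// OS-chosen port on heldAddr and probes the same port on checkAddr.
func checkWhileHeld(heldAddr, checkAddr string) (bool, error) {
	held, err := net.Listen("tcp", net.JoinHostPort(heldAddr, "0"))
	if err != nil {
		return false, err
	}
	defer held.Close()
	port := held.Addr().(*net.TCPAddr).Port
	return isPortAvailableOn(checkAddr, port), nil
}

func main() {
	same, err := checkWhileHeld("127.0.0.1", "127.0.0.1")
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("same address reported available:", same)

	// 127.0.0.2 is assumed to be a usable loopback address, as on Linux.
	other, err := checkWhileHeld("127.0.0.1", "127.0.0.2")
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("other address reported available:", other)
}
```

With this shape, a second controller configured with its own bind-address would only be rejected when the port is taken on that address, or on the wildcard address, rather than on any interface of the host.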
/kind bug
About this issue
- State: open
- Created 3 years ago
- Reactions: 2
- Comments: 16 (6 by maintainers)
@rikatz Any update on this, please?
@aureq yeah, so you are our corner case! welcome! 😃
I agree we should improve this, in the sense that bindAddress SHOULD be honored (or this check ignored, which is bad IMO).
Do you have any suggestions, or do you want to work on a PR for this? Otherwise I can think of some approach here!
Thanks
Thank you for the update. Helps.