ingress-nginx: Chart Option `bind-address` is not honored at startup and causes nginx to fail

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.): v1.0.4

Kubernetes version (use kubectl version): v1.21.5+k3s2

Environment:

  • Cloud provider or hardware configuration: bare metal, AMD Ryzen Embedded V1605B
  • OS (e.g. from /etc/os-release): Debian GNU/Linux 10 (buster)
  • Kernel (e.g. uname -a): 4.19.194-3
  • Install tools: Rancher k3s automated installation
    • Please mention how/where the cluster was created, e.g. kubeadm/kops/minikube/kind etc.
  • Basic cluster related info:
    • kubectl version: v1.21.5+k3s2
    • kubectl get nodes -o wide
NAME      STATUS   ROLES                  AGE    VERSION        INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                       KERNEL-VERSION    CONTAINER-RUNTIME
jupiter   Ready    control-plane,master   480d   v1.21.5+k3s2   192.168.2.203   <none>        Debian GNU/Linux 10 (buster)   4.19.0-17-amd64   containerd://1.4.11-k3s1
  • How was the ingress-nginx-controller installed: Helm Chart
    • If helm was used then please show output of helm ls -A
    NAME                            NAMESPACE       REVISION        UPDATED                                         STATUS          CHART                   APP VERSION
    applications                    dev-services    9               2021-10-29 10:43:43.89215712 +1100 AEDT         deployed        gitlab-runner-0.34.0    14.4.0     
    authelia                        menfin-services 4               2021-10-28 17:42:59.726584947 +1100 AEDT        deployed        authelia-0.6.3          4.31.0     
    containers                      dev-services    13              2021-10-29 10:43:41.59080221 +1100 AEDT         deployed        gitlab-runner-0.34.0    14.4.0     
    default                         dev-services    14              2021-10-29 10:43:39.315032902 +1100 AEDT        deployed        gitlab-runner-0.34.0    14.4.0     
    harbor-menfin-system            menfin-system   9               2021-10-14 17:26:31.857105883 +1100 AEDT        deployed        harbor-1.7.3            2.3.3      
    influxdb                        menfin-services 5               2020-07-14 14:09:15.570129099 +1000 AEST        deployed        influxdb-4.7.1          1.8.0      
    ingress-nginx-menfin-services   menfin-services 2               2021-10-30 14:23:11.670595939 +1100 AEDT        deployed        ingress-nginx-4.0.6     1.0.4      
    ingress-nginx-menfin-system     menfin-system   1               2021-10-30 14:20:58.730357517 +1100 AEDT        deployed        ingress-nginx-4.0.6     1.0.4      
    minio                           menfin-system   17              2021-04-08 12:38:38.83367455 +1000 AEST         deployed        minio-8.0.10            master     
    openfaas                        openfaas        1               2021-06-19 15:15:41.565615584 +1000 AEST        deployed        openfaas-7.3.2                     
    registry                        menfin-system   13              2021-01-17 05:08:37.146361067 +1100 AEDT        deployed        docker-registry-1.10.0  2.7.1      
    sleep-companion                 dev-services    9               2021-10-29 10:43:46.12323137 +1100 AEDT         deployed        gitlab-runner-0.34.0    14.4.0     
    
    • If helm was used then please show output of helm get values --namespace menfin-system ingress-nginx-menfin-system (primary deployment - working fine)
    USER-SUPPLIED VALUES:
    controller:
      admissionWebhooks:
        port: "10843"
      config:
        bind-address: 192.168.2.180
        disable-ipv6: "true"
        use-forwarded-headers: "true"
      containerPort:
        http: "10080"
        https: "10443"
      dnsPolicy: ClusterFirstWithHostNet
      electionID: ingress-controller-leader-menfin-system
      extraArgs:
        default-server-port: "10081"
        healthz-port: "10254"
        profiler-port: "10245"
        status-port: "10246"
        stream-port: "10247"
      hostNetwork: true
      ingressClassResource:
        name: ingress-nginx-menfin-system
      kind: DaemonSet
      reportNodeInternalIp: true
      service:
        type: ClusterIP
    
    • If helm was used then please show output of helm get values --namespace menfin-services ingress-nginx-menfin-services (secondary deployment - not working)
    USER-SUPPLIED VALUES:
    controller:
      admissionWebhooks:
        port: "11843"
      config:
        bind-address: 192.168.2.181
        disable-ipv6: "true"
      containerPort:
        http: "11080"
        https: "11443"
      dnsPolicy: ClusterFirstWithHostNet
      electionID: ingress-controller-leader-menfin-services
      extraArgs:
        default-server-port: "11081"
        healthz-port: "11254"
        profiler-port: "11245"
        status-port: "11246"
        stream-port: "11247"
      hostNetwork: true
      ingressClassResource:
        name: ingress-nginx-menfin-services
      kind: DaemonSet
      reportNodeInternalIp: true
      service:
        type: ClusterIP
    

What happened:

Please read the "Anything else we need to know" section, as it contains valuable context and helpful information.

The second instance of the ingress-nginx controller fails to start with the following error message.

-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:       v1.0.4
  Build:         9b78b6c197b48116243922170875af4aa752ee59
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.19.9
-------------------------------------------------------------------------------
F1030 01:34:44.558408       8 main.go:67] port 80 is already in use. Please check the flag --http-port
goroutine 1 [running]:
k8s.io/klog/v2.stacks(0x1)
    k8s.io/klog/v2@v2.10.0/klog.go:1026 +0x8a
k8s.io/klog/v2.(*loggingT).output(0x2877840, 0x3, {0x0, 0x0}, 0xc00020cd90, 0x1, {0x1f5eeea, 0x2878380}, 0xc00020e430, 0x0)
    k8s.io/klog/v2@v2.10.0/klog.go:975 +0x63d
k8s.io/klog/v2.(*loggingT).printDepth(0x1, 0x1, {0x0, 0x0}, {0x0, 0x0}, 0x40c5fe, {0xc00020e430, 0x1, 0x1})
    k8s.io/klog/v2@v2.10.0/klog.go:735 +0x1ba
k8s.io/klog/v2.(*loggingT).print(...)
    k8s.io/klog/v2@v2.10.0/klog.go:717
k8s.io/klog/v2.Fatal(...)
    k8s.io/klog/v2@v2.10.0/klog.go:1494
main.main()
    k8s.io/ingress-nginx/cmd/nginx/main.go:67 +0x1d3
goroutine 4 [chan receive]:
k8s.io/klog/v2.(*loggingT).flushDaemon(0x0)
    k8s.io/klog/v2@v2.10.0/klog.go:1169 +0x6a
created by k8s.io/klog/v2.init.0
    k8s.io/klog/v2@v2.10.0/klog.go:420 +0xfb

What you expected to happen:

The second ingress controller starts normally.

How to reproduce it:

  1. on a Linux machine assigned 2 or more externally reachable IP addresses (e.g., 192.168.2.180 and 192.168.2.181)
  2. create/launch a k8s cluster
  3. install helm and add the ingress-nginx helm repository
  4. create 2 namespaces (e.g., menfin-system and menfin-services)
  5. deploy the first ingress-nginx in menfin-system using the first set of chart values. This should work fine
  6. deploy the second ingress-nginx in menfin-services using the second set of chart values. The pod will crash loop with the error message above.

Anything else we need to know:

I recently tried to upgrade from v3.35.0 of the ingress-nginx helm chart to v4.0.6 ahead of migrating to k8s 1.22. Until now, I was able to run multiple instances of ingress-nginx and have each of them listen on a dedicated IP address (192.168.2.180 and 192.168.2.181, and a few more).

Since upgrading to v4.0.6, the 2nd ingress controller fails to start with the error described above.

I looked into the code and I think this issue is related to #6990 and #7467, which were created because of #6988. As @rikatz mentioned in #6990, the PR doesn’t look at bind-address when calling ln, err := _net.Listen("tcp", fmt.Sprintf(":%v", p)). This effectively forces Listen() to attempt to bind on all available interfaces.

As a consequence, the call fails if the specified port is already listening on at least one of the interfaces, which causes the IsPortAvailable() check to fail.

I understand the conversation in #6990 and the challenge of a configMap vs CLI args but this is also a major regression / behavior change.

As a possible suggestion, would it be possible to (optionally) pass the IP address from bind-address when calling IsPortAvailable(), so that the check is narrower?

/kind bug

About this issue

  • State: open
  • Created 3 years ago
  • Reactions: 2
  • Comments: 16 (6 by maintainers)

Most upvoted comments

@rikatz Any update on this, please?

@aureq yeah, so you are our corner case! welcome! 😃

I agree we should improve this, in the sense that the bindAddress SHOULD be honored (or this check ignored, which is bad IMO)

Do you have any suggestions, or do you want to work on a PR for this? Otherwise I can think of some approach here!

Thanks

Thank you for the update. Helps.