kubernetes-ingress: Controller not detecting a service

We are having a problem with the HAProxy ingress controller not detecting services within the cluster when routing requests to cert-manager pods. For example, the HAProxy controller returns:

2022/08/16 13:46:12 ERROR   ingress/ingress.go:245 Ingress 'development/my-service': service 'development/cm-acme-http-solver-498h4' does not exist
2022/08/16 13:46:12 INFO    handler/https.go:123 removing client TLS authentication

And the rule from Ingress is like so:

  ingressClassName: external-haproxy
  rules:
  - host: example.com
    http:
      paths:
      - backend:
          service:
            name: cm-acme-http-solver-498h4
            port:
              number: 8089
        path: /.well-known/path/to/acme-challenge
        pathType: ImplementationSpecific

I’ve checked haproxy.cfg and I cannot find the internal IP of either the mentioned service or the pod.

We also run split-horizon DNS with two HAProxy instances, internal and external. I’ve also found the offending line: https://github.com/haproxytech/kubernetes-ingress/blob/v1.8.3/pkg/ingress/ingress.go#L245

Do you know why this happens? It seems that HAProxy cannot find the service, even though the service exists in the cluster.
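
When this error shows up repeatedly, it can help to extract the Ingress and Service names from the controller log so each one can be checked against the cluster. The following is a minimal sketch in Go. The regular expression is modelled only on the single error line quoted above, so other controller versions may word the message differently, and the names `missingService` and `parseMissingService` are hypothetical helpers, not part of the controller.

```go
package main

import (
	"fmt"
	"regexp"
)

// missingService holds the names extracted from one controller error line.
// Hypothetical helper type, not part of kubernetes-ingress.
type missingService struct {
	Ingress string // namespace/name of the Ingress that references the service
	Service string // namespace/name of the service the controller could not find
}

// Pattern based on the error line quoted in this issue (assumption: the
// message format is stable across the log lines being scanned).
var missingRe = regexp.MustCompile(`Ingress '([^']+)': service '([^']+)' does not exist`)

// parseMissingService returns the extracted names and true when the line
// matches the "does not exist" error, or a zero value and false otherwise.
func parseMissingService(line string) (missingService, bool) {
	m := missingRe.FindStringSubmatch(line)
	if m == nil {
		return missingService{}, false
	}
	return missingService{Ingress: m[1], Service: m[2]}, true
}

func main() {
	line := "2022/08/16 13:46:12 ERROR   ingress/ingress.go:245 Ingress 'development/my-service': service 'development/cm-acme-http-solver-498h4' does not exist"
	if ms, ok := parseMissingService(line); ok {
		fmt.Printf("ingress=%s service=%s\n", ms.Ingress, ms.Service)
	}
}
```

Each extracted service name can then be compared with what the cluster actually reports, to confirm that the controller's view and the cluster state disagree.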

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 51 (15 by maintainers)

Most upvoted comments

@ivanmatmati looks good. Our instances are running for 6 days now. No issues so far. 👍

@ivanmatmati thank you and the whole team!

Could you please explain why the commit is classified as minor, given that this issue was a critical bug in production?

Again thanks

Hi @evandam, it’s OK to use 1.9; it contains the precious commit. The nightly build was only relevant until this release.

Hi all, could you test the next nightly build (available at 2am CET)? It may solve the issue. The idea is that you run it for a decent amount of time to see whether the issue is still present. Thanks.

Hi, we’ve done a session with @LarsBingBong, but unfortunately the issue was not reproduced. We are in the process of creating a lab for long-running issues. This one will be addressed on that platform.

Any updates? This happens really often 😦

I think we can perfectly reproduce this issue again and again. If it would help the project for us to demo the reproduction while you collect metrics, dumps, logs and whatnot, we’ll happily set aside some time for this.

Ready for such a session @ivanmatmati

The thing that makes this one difficult is that the underlying syncing code has been there for ages. I had a look and, while I’m not a Go programmer, I can see code that depends on the k8s informer to get updates from the control plane.

I’m wondering whether this bug happens because the informer hits an error and reconnects, missing data in the meantime. It is weird that this seems to happen for all haproxy pods, though a momentary control plane reconnection issue could affect all of them the same way (tbh I didn’t check all pods, so one may have been fine).

How does the code handle errors with the informer and resync?
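
To make the hypothesis above concrete, here is a toy model in Go of a watch-driven cache that misses one event and is later repaired by a full relist. It is only a sketch of the suspected failure mode. It does not use client-go, it is not the controller's actual code, and the types `event` and `cache` are hypothetical. Whether the real informer can lose events this way is exactly the open question in this thread.

```go
package main

import "fmt"

// event is a simplified watch notification (hypothetical type).
type event struct {
	kind string // "add" or "delete"
	name string // namespace/name of the service
}

// cache is a toy local store standing in for what a watch-driven
// controller keeps in memory about the services in the cluster.
type cache struct {
	services map[string]bool
}

func newCache() *cache {
	return &cache{services: map[string]bool{}}
}

// apply updates the cache from a single watch event.
func (c *cache) apply(e event) {
	switch e.kind {
	case "add":
		c.services[e.name] = true
	case "delete":
		delete(c.services, e.name)
	}
}

// resync discards the cached state and rebuilds it from a full list
// of the services that actually exist.
func (c *cache) resync(cluster []string) {
	c.services = make(map[string]bool, len(cluster))
	for _, name := range cluster {
		c.services[name] = true
	}
}

// has reports whether the cache knows about the named service.
func (c *cache) has(name string) bool {
	return c.services[name]
}

func main() {
	solver := "development/cm-acme-http-solver-498h4"
	cluster := []string{"development/my-service", solver}

	c := newCache()
	c.apply(event{kind: "add", name: "development/my-service"})
	// Simulated gap: the "add" event for the solver service never arrives,
	// so the cache disagrees with the cluster.
	fmt.Println("before resync:", c.has(solver))

	c.resync(cluster)
	fmt.Println("after resync:", c.has(solver))
}
```

In this model the lookup fails until the full relist runs. A restarted controller also rebuilds its state from a fresh list, which would be consistent with the restart workaround reported in this thread, although the thread itself does not confirm the root cause.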

It is also happening for us. We are running the 1.8.3 Docker image, and even after upgrading to the 1.8.4 Docker image the behavior is the same. As @ocdi said, there seems to be a workaround when a new service is deployed: delete the haproxy pods, delete the service pods, and after that it works. We would love to see a hotfix for this asap.

@ivanmatmati It seems that restarting did the job! @ocdi @nopsenica @ivanmatmati thank you! But why did the controller fail to find the configuration?