kubernetes-ingress: Controller not detecting a service

We are having a problem with HAProxy not detecting services within the cluster when routing requests to cert-manager pods. For example, the HAProxy controller returns:

2022/08/16 13:46:12 ERROR   ingress/ingress.go:245 Ingress 'development/my-service': service 'development/cm-acme-http-solver-498h4' does not exist
2022/08/16 13:46:12 INFO    handler/https.go:123 removing client TLS authentication
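
When many ingresses are affected, it can help to pull the missing-service reports out of the controller log in bulk. The following is a minimal sketch, assuming the error wording matches the line quoted above; the function name `missing_services` is hypothetical, and other controller versions may phrase the message differently.

```python
import re

# Pattern modeled on the ERROR line quoted above (an assumption about the log format).
MISSING_SERVICE = re.compile(
    r"Ingress '(?P<ingress>[^']+)': service '(?P<service>[^']+)' does not exist"
)


def missing_services(log_text):
    """Return sorted, de-duplicated (ingress, service) pairs reported as missing."""
    pairs = set()
    for line in log_text.splitlines():
        match = MISSING_SERVICE.search(line)
        if match:
            pairs.add((match.group("ingress"), match.group("service")))
    return sorted(pairs)


sample = """\
2022/08/16 13:46:12 ERROR   ingress/ingress.go:245 Ingress 'development/my-service': service 'development/cm-acme-http-solver-498h4' does not exist
2022/08/16 13:46:12 INFO    handler/https.go:123 removing client TLS authentication
"""

for ingress, service in missing_services(sample):
    print(f"{ingress} -> {service}")
```

Each reported pair can then be compared against the services that actually exist in the namespace.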

And the rule from Ingress is like so:

  ingressClassName: external-haproxy
  rules:
  - host: example.com
    http:
      paths:
      - backend:
          service:
            name: cm-acme-http-solver-498h4
            port:
              number: 8089
        path: /.well-known/path/to/acme-challenge
        pathType: ImplementationSpecific

I’ve checked haproxy.cfg and I cannot find the internal IP of either the mentioned service or the pod.
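
The manual check of haproxy.cfg can be repeated with a small script. This is only a sketch: it assumes the config text has already been copied out of the controller pod, the helper `sections_mentioning` is hypothetical, and the backend name and IP in the sample are made up for illustration rather than taken from a real generated config.

```python
# Keywords that open a new section in haproxy.cfg (a deliberately short list).
SECTION_KEYWORDS = {"global", "defaults", "frontend", "backend", "listen"}


def sections_mentioning(cfg_text, needle):
    """Return the header lines of sections whose header or body contains `needle`."""
    hits = []
    header = None
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.split()[0] in SECTION_KEYWORDS:
            header = line  # a new section starts here
        if needle in line and header is not None and header not in hits:
            hits.append(header)
    return hits


# Made-up sample config; names and addresses are illustrative only.
sample_cfg = """\
frontend http
  bind :80
  use_backend development-my-service-8080 if { path_beg /app }

backend development-my-service-8080
  server SRV_1 10.0.0.12:8080 check
"""

print(sections_mentioning(sample_cfg, "cm-acme-http-solver-498h4"))
print(sections_mentioning(sample_cfg, "10.0.0.12"))
```

An empty result for both the service name and the pod IP matches what is described here: the controller never wrote the solver service into the config.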

We also run split-horizon DNS with two HAProxies, internal and external. I’ve also found the offending line: https://github.com/haproxytech/kubernetes-ingress/blob/v1.8.3/pkg/ingress/ingress.go#L245

Do you know why this happens? It seems that HAProxy cannot find the service, despite the service existing in the cluster.

About this issue

State: closed
Created 2 years ago
Reactions: 1
Comments: 51 (15 by maintainers)

Most upvoted comments

@ivanmatmati looks good. Our instances are running for 6 days now. No issues so far. 👍

dschuldt on Nov 1, 2022

@ivanmatmati thank you and the whole team!

Could you please explain why the commit is classified as minor, given that this issue was a critical bug in production?

Again thanks

dschuldt on Oct 26, 2022

Hi @evandam, it’s OK to use 1.9; it contains the precious commit. The nightly build was only relevant until this release.

ivanmatmati on Oct 24, 2022

Hi all, could you test the next nightly build (available at 2am CET)? It may solve the issue. Please run it for a reasonable amount of time to see whether the issue is still present. Thanks.

ivanmatmati on Oct 21, 2022

Hi, we’ve done a session with @LarsBingBong, but unfortunately the issue could not be reproduced. We are in the process of creating a lab for long-running issues. This one will be addressed on that platform.

ivanmatmati on Oct 14, 2022

Any updates? This happens really often 😦

ognjenVlad on Oct 13, 2022

I think we can reproduce this issue reliably, again and again. If it would help the project for us to demo the reproduction while you collect metrics, dumps, logs and whatnot, we’ll happily set aside some time for this.

Ready for such a session @ivanmatmati

LarsBingBong on Sep 20, 2022

The thing that makes this one difficult is that the underlying syncing code has been there for ages. I had a look and, while I’m not a Go programmer, I can see that the code depends on the k8s informer to get updates from the control plane.

I’m wondering whether this bug happens because the informer hits an error and reconnects, missing data in the meantime. It is weird that this seems to happen for all haproxy pods, though a momentary control plane reconnection issue could affect all of them at once (tbh I didn’t check all pods, so one may have been fine).

How does the code handle errors with the informer and resync?
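
To make the hypothesis concrete, here is a toy model of an event-fed cache that misses one update and is repaired by a full re-list. It is not the controller's code and does not describe how the k8s informer actually behaves; the class `ServiceCache` and its methods are hypothetical and exist only to illustrate the failure mode being asked about.

```python
class ServiceCache:
    """Toy local cache fed by watch events (illustrative, not a real informer)."""

    def __init__(self):
        self.services = set()

    def on_event(self, kind, name):
        # Apply a single watch event to the local view.
        if kind == "ADDED":
            self.services.add(name)
        elif kind == "DELETED":
            self.services.discard(name)

    def relist(self, authoritative):
        # Replace the local view with a full listing from the API server.
        self.services = set(authoritative)


api_server = {"development/my-service"}
cache = ServiceCache()
cache.relist(api_server)

# Hypothetical failure: a service is created while the watch is disconnected,
# so on_event is never called for it.
api_server.add("development/cm-acme-http-solver-498h4")
stale = "development/cm-acme-http-solver-498h4" not in cache.services
print("cache is stale:", stale)

# A full re-list (or a restart, which begins with a fresh list) repairs the view.
cache.relist(api_server)
repaired = "development/cm-acme-http-solver-498h4" in cache.services
print("cache repaired:", repaired)
```

If something like this were the cause, it would be consistent with the service being reported as missing until the cached view is rebuilt.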

ocdi on Sep 7, 2022

It is also happening for us. We are running the 1.8.3 docker image, and upgrading to the 1.8.4 image did not change the behavior. As @ocdi said, there seems to be a workaround when a new service is deployed: delete the haproxy pods, delete the service pods, and after that it works. We would love to see a hotfix for this asap.

idonca on Sep 6, 2022

@ivanmatmati It seems that restarting did the job! @ocdi @nopsenica @ivanmatmati thank you! But why did the controller fail to find the configuration?

petar-nikolovski-cif on Aug 19, 2022