ingress-nginx: Health check error

Hello,

My ingress controller suddenly stopped working, and the message below is what I get. I originally deployed it by following the instructions at https://kubernetes.github.io/ingress-nginx/deploy/ exactly, and everything was working. After I restarted Kubernetes and Docker, it doesn’t work anymore. I tried to redeploy it, but the error persists. I am running on CentOS 7.

healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 8
  • Comments: 37 (9 by maintainers)

Most upvoted comments

I’ve been facing this issue too, and it went away after I fixed a couple of problems in my ingress resources. In most cases, there were ingress resources deployed in my cluster whose backends had no endpoints available, or whose backends were not deployed at all. After I deleted those useless and problematic ingress resources, nginx started up normally again.
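For anyone who wants to look for such ingress resources, here is a minimal sketch in Python. It assumes `kubectl` is on the PATH and pointed at the affected cluster; the function names are made up for this example and are not part of ingress-nginx. It accepts both the older `serviceName` backend field and the newer `service.name` one, and it would also flag services that never have Endpoints (for example, ExternalName services), so treat the output as a list of candidates to inspect, not a verdict.

```python
import json
import subprocess


def backend_services(ingress):
    """Yield (namespace, service_name) for every backend an Ingress references."""
    namespace = ingress["metadata"].get("namespace", "default")
    spec = ingress.get("spec", {})
    backends = []
    # Default backend: `defaultBackend` in networking.k8s.io/v1, `backend` in older APIs.
    for key in ("defaultBackend", "backend"):
        if spec.get(key):
            backends.append(spec[key])
    for rule in spec.get("rules") or []:
        for path in (rule.get("http") or {}).get("paths") or []:
            backends.append(path.get("backend") or {})
    for backend in backends:
        # Newer schema: backend.service.name; older schema: backend.serviceName.
        name = (backend.get("service") or {}).get("name") or backend.get("serviceName")
        if name:
            yield namespace, name


def ready_services(endpoints_list):
    """Return the set of (namespace, name) whose Endpoints have at least one address."""
    ready = set()
    for endpoints in endpoints_list["items"]:
        meta = endpoints["metadata"]
        if any(subset.get("addresses") for subset in endpoints.get("subsets") or []):
            ready.add((meta.get("namespace", "default"), meta["name"]))
    return ready


def ingresses_without_endpoints(ingress_list, endpoints_list):
    """Map 'namespace/ingress' to the backend services that have no ready endpoints."""
    ready = ready_services(endpoints_list)
    report = {}
    for ingress in ingress_list["items"]:
        dead = sorted({name for ns, name in backend_services(ingress) if (ns, name) not in ready})
        if dead:
            meta = ingress["metadata"]
            report["{}/{}".format(meta.get("namespace", "default"), meta["name"])] = dead
    return report


def kubectl_json(resource):
    """Run `kubectl get <resource> --all-namespaces -o json` and parse the result."""
    result = subprocess.run(
        ["kubectl", "get", resource, "--all-namespaces", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Against a live cluster, `ingresses_without_endpoints(kubectl_json("ingress"), kubectl_json("endpoints"))` returns a dictionary of ingress names and the services worth checking.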

@eljefedelrodeodeljefe Regarding your comment about how one can use this in production, I have to say that we have been running this component in production for 3 years and so far it hasn’t caused any outage.

I am also getting this error:

healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: i/o timeout

The issue is intermittent: after some time it starts working again, without my making any change.

Any solution to overcome this?
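The two messages in this thread point at different states: `no such file or directory` means the socket file is not there at all, while `i/o timeout` means something is not answering in time. A rough Python sketch for telling them apart follows. The socket path, the `/healthz` route and the `nginx-status` host name are taken from the error message; the function name is hypothetical, and it is an assumption that the socket answers a plain HTTP request. It would have to run somewhere that can see the socket (for example inside the controller container, which may not ship Python), so take it as an illustration of the check and not as a ready-made tool.

```python
import socket


def probe_status_socket(path="/tmp/nginx-status-server.sock", timeout=2.0):
    """Classify a Unix socket as 'ok', 'unhealthy', 'missing', 'refused', 'timeout' or 'error: ...'."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        # Assumption: the status server speaks plain HTTP on this socket.
        sock.sendall(b"GET /healthz HTTP/1.0\r\nHost: nginx-status\r\n\r\n")
        reply = sock.recv(64)
    except FileNotFoundError:
        # Matches "connect: no such file or directory".
        return "missing"
    except ConnectionRefusedError:
        # The file exists but nothing is listening on it.
        return "refused"
    except socket.timeout:
        # Matches "i/o timeout": no answer within the deadline.
        return "timeout"
    except OSError as exc:
        return "error: {}".format(exc)
    finally:
        sock.close()
    # The status line looks like b"HTTP/1.1 200 OK"; the second token is the status code.
    parts = reply.split()
    return "ok" if len(parts) > 1 and parts[1] == b"200" else "unhealthy"
```

A result of `missing` suggests nginx never came up (or was restarted), whereas `timeout` suggests it is running but too busy or stuck to answer.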

@OGKevin yes, you can check this using the image quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev

I think this issue is not being treated with enough tenacity. Honestly, I wonder how one can use this in production, as even our staging tests have kept failing for months on issues like this.

Restarting the controller is not a fix; it’s sysadmin patch work.

Has anybody solved it? I am still suffering from this issue.

I have the same behavior with AKS 1.14.8 and nginx-controller 0.27.1 + HPA. @kwladyka

I still have this issue:

kubectl --context=etingroup-production get node
NAME                                  STATUS   ROLES    AGE   VERSION
gke-production-pool-1-ce587bf0-rxwq   Ready    <none>   31m   v1.14.7-gke.10

ingress version
`tag: "0.26.1"`

Is it possible that it fails because a third-party pod that is exposed through an nginx-ingress service fails? Will nginx-ingress fail because a third-party app fails?

In my case I will wait. I don’t see any other choice.