ingress-nginx: Service ... does not have any active Endpoint [when it actually does]

NGINX Ingress controller version:

NGINX Ingress controller
  Release:       v0.34.1
  Build:         v20200715-ingress-nginx-2.11.0-8-gda5fa45e2
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.19.1

Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.6-beta.0", GitCommit:"e7f962ba86f4ce7033828210ca3556393c377bcc", GitTreeState:"clean", BuildDate:"2020-01-15T08:26:26Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"windows/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.9", GitCommit:"e3808385c7b3a3b86db714d67bdd266dc2b6ab62", GitTreeState:"clean", BuildDate:"2020-07-15T20:50:36Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: AKS
  • OS (e.g. from /etc/os-release): Windows
  • Install tools: Helm

What happened: Got a Bad Gateway response. Checked the logs. These are the relevant lines:

"GET some/end/point HTTP/1.1" 502 157 "-" "PostmanRuntime/7.26.3" 1570 1.248 [default-my-service-80] [] 10.244.2.36:80, 10.244.2.36:80, 10.244.2.36:80 0, 0, 0 1.244, 0.004, 0.000 502, 502, 502 cc30e5d47f23e3a117181ff94479bc6f
W0907 17:35:19.222358       7 controller.go:916] Service "default/my-service" does not have any active Endpoint.
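For anyone triaging the same symptom, the controller's own logs are usually the quickest confirmation. A sketch (the namespace and label selector below assume a stock install; adjust them to match your deployment):

```shell
# Filter the ingress-nginx controller logs for the endpoint warning.
kubectl -n ingress-nginx logs -l app.kubernetes.io/name=ingress-nginx --tail=200 \
  | grep "does not have any active Endpoint"

# Cross-check what the cluster itself thinks the endpoints are.
kubectl -n default get endpoints my-service -o wide
```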

Checked the svc

Name:              my-service
Namespace:         default
Annotations:       meta.helm.sh/release-name: data-plane
                   meta.helm.sh/release-namespace: default
Selector:          app=my-service
Type:              ClusterIP
IP:                10.0.222.22
Port:              http  80/TCP
TargetPort:        80/TCP
Endpoints:         10.244.2.36:80
Session Affinity:  None
Events:            <none>

Checked the Endpoints object as well; it listed the same endpoint details, not <none>.

I also hit the service from another pod using curl, and that also worked as expected.
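The checks described above can be scripted. A sketch, using the service name and namespace from this report (the throwaway curl pod is just one way to test from inside the cluster):

```shell
# 1. Does the Service's selector actually match any running Pods?
kubectl -n default get pods -l app=my-service -o wide

# 2. Does the Endpoints object agree with what the Service shows?
kubectl -n default describe endpoints my-service

# 3. Hit the Service from inside the cluster, bypassing the ingress.
#    (Spins up a temporary curl pod; the image choice is arbitrary.)
kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl -- \
  curl -s -o /dev/null -w "%{http_code}\n" http://my-service.default.svc/some/end/point
```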

What you expected to happen: The requests to be routed to the correct endpoint that’s registered.

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 31 (3 by maintainers)

Most upvoted comments

@aledbf could you please point out what the problem was? Is it really necessary to upgrade to 0.41.2?

Issue

I also observed the same issue using Release: 0.33.0.


W0408 16:38:40.185591       7 controller.go:909] Service "ci/jenkinsci" does not have any active Endpoint.

Resources

kc get endpoints,svc,ingress -n ci
NAME                        ENDPOINTS            AGE
endpoints/jenkinsci         10.244.0.251:8080    20m
endpoints/jenkinsci-agent   10.244.0.251:50000   20m

NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
service/jenkinsci         ClusterIP   10.100.11.19     <none>        8080/TCP    20m
service/jenkinsci-agent   ClusterIP   10.104.183.224   <none>        50000/TCP   20m

NAME                           CLASS    HOSTS                             ADDRESS   PORTS   AGE
ingress.extensions/jenkinsci   <none>   jenkinsci.95.217.159.244.nip.io             80      20m
[snowdrop@h01-118 jenkins]$ kc describe -n ci service/jenkinsci
Name:              jenkinsci
Namespace:         ci
Labels:            app.kubernetes.io/component=jenkins-controller
                   app.kubernetes.io/instance=jenkinsci
                   app.kubernetes.io/managed-by=Helm
                   app.kubernetes.io/name=jenkins
                   helm.sh/chart=jenkins-3.3.3
Annotations:       meta.helm.sh/release-name: jenkinsci
                   meta.helm.sh/release-namespace: ci
Selector:          app.kubernetes.io/component=jenkins-controller,app.kubernetes.io/instance=jenkinsci
Type:              ClusterIP
IP:                10.100.11.19
Port:              http  8080/TCP
TargetPort:        8080/TCP
Endpoints:         10.244.0.251:8080
Session Affinity:  None
Events:            <none>
[snowdrop@h01-118 jenkins]$ kc describe -n ci endpoints/jenkinsci
Name:         jenkinsci
Namespace:    ci
Labels:       app.kubernetes.io/component=jenkins-controller
              app.kubernetes.io/instance=jenkinsci
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=jenkins
              helm.sh/chart=jenkins-3.3.3
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2021-04-08T16:38:40Z
Subsets:
  Addresses:          10.244.0.251
  NotReadyAddresses:  <none>
  Ports:
    Name  Port  Protocol
    ----  ----  --------
    http  8080  TCP

Events:  <none>
[snowdrop@h01-118 jenkins]$ kc describe -n ci ingress.extensions/jenkinsci
Name:             jenkinsci
Namespace:        ci
Address:
Default backend:  default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
Rules:
  Host                             Path  Backends
  ----                             ----  --------
  jenkinsci.95.217.159.244.nip.io
                                      jenkinsci:8080 (10.244.0.251:8080)
Annotations:                       kubernetes.io/ingress.class: nginx
                                   meta.helm.sh/release-name: jenkinsci
                                   meta.helm.sh/release-namespace: ci
Events:
  Type    Reason  Age    From                      Message
  ----    ------  ----   ----                      -------
  Normal  CREATE  20m    nginx-ingress-controller  Ingress ci/jenkinsci
  Normal  UPDATE  13m    nginx-ingress-controller  Ingress ci/jenkinsci
  Normal  CREATE  5m25s  nginx-ingress-controller  Ingress ci/jenkinsci

I also faced the same issue, but it turned out to be my own mistake.

Instead of the bare-metal deployment we were using, I had deployed the cloud variant.

incorrect: kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.43.0/deploy/static/provider/cloud/deploy.yaml
correct: kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.43.0/deploy/static/provider/baremetal/deploy.yaml

After applying the correct one, my issue was resolved.
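If you are unsure which variant you deployed, the controller's Service type gives it away (this assumes the default names from the official manifests; the cloud manifest creates a LoadBalancer Service, the bare-metal one a NodePort):

```shell
# Prints "LoadBalancer" for the cloud manifest, "NodePort" for baremetal.
kubectl -n ingress-nginx get svc ingress-nginx-controller \
  -o jsonpath='{.spec.type}{"\n"}'
```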

@aledbf

Why are people still facing the same problem while this issue is closed?

We have just experienced this problem as well. (AKS v1.21.2 with ingress-nginx 1.0.4) Has anyone found a solution yet?

Same problem here. It happens after my cluster scales down and then back up, and I have not been able to find a solution. Chart version 4.5.2.

Hi – I faced the same issue with a Bitnami chart. You may or may not have the same root cause I did. Based on other Google results suggesting it might be selector-related, I looked at the Service rendered by the chart, which had:

apiVersion: v1
kind: Service
metadata:
  annotations:
    meta.helm.sh/release-name: rabbitmq-dev
    meta.helm.sh/release-namespace: dev
    service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: environment=nca,owner=devops,service=rabbitmq-nca,orchestration=helm
  labels:
    app.kubernetes.io/instance: rabbitmq-dev
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: rabbitmq
    helm.sh/chart: rabbitmq-8.30.2
  name: rabbitmq-dev
  namespace: dev
spec:
  clusterIP: 172.20.96.146
  clusterIPs:
  - 172.20.96.146
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - OMITTED
  selector:
    app.kubernetes.io/name: rabbitmq
    app.kubernetes.io/instance: rabbitmq-dev

Note the last two lines: we have two selector entries trying to match two labels, and nginx did not seem to like that. I changed it to:

  selector:
    app.kubernetes.io/name: rabbitmq

And things magically started to work. NGINX hooked the rabbitmq pod up to its associated ELB, the external-dns module took over and gave me a Route 53 DNS entry, and login through the browser now worked. What you want to see is something like this:

# kubectl get ingress
NAME           CLASS       HOSTS                               ADDRESS                                                                  PORTS   AGE
rabbitmq-dev   dev-nginx   rabbitmq-dev.mydns.ca   ad4bab415625c4583a636a8642eec9fd-911617965.us-east-2.elb.amazonaws.com   80      9m18s
# kubectl get svc
NAME                                                   TYPE           CLUSTER-IP      EXTERNAL-IP                                                              PORT(S)                                 AGE
nginx-ingress-dev-ingress-nginx-controller             LoadBalancer   172.20.41.6     deadbeefc4583a636a8642eec9fd-911617965.us-east-2.elb.amazonaws.com   80:31900/TCP,443:32530/TCP              5h52m
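For context on why the selector matters: Kubernetes selects a Pod for a Service only when every key/value pair in the selector is present among the Pod's labels (extra Pod labels are ignored), so a two-key selector fails only when the Pod is missing one of the two labels or carries a different value. A plain-bash sketch of that matching rule, illustrative only and not real API code, with label values borrowed from the comment above:

```shell
#!/usr/bin/env bash
# Sketch of Kubernetes label-selector semantics: a match requires EVERY
# selector key=value pair to be present on the Pod. Pairs are passed as
# space-separated key=value strings.

matches() {
  local selector="$1" labels="$2" pair
  for pair in $selector; do
    case " $labels " in
      *" $pair "*) ;;        # this key=value is present on the Pod
      *) return 1 ;;         # missing key or different value: no match
    esac
  done
  return 0
}

pod_labels="app.kubernetes.io/name=rabbitmq app.kubernetes.io/instance=rabbitmq-dev"

# Two-key selector: matches, because the Pod carries BOTH labels.
matches "app.kubernetes.io/name=rabbitmq app.kubernetes.io/instance=rabbitmq-dev" "$pod_labels" \
  && echo "two-key selector: match"

# One-key selector: also matches; extra Pod labels are ignored.
matches "app.kubernetes.io/name=rabbitmq" "$pod_labels" \
  && echo "one-key selector: match"

# Selector value the Pod does not have: no match, hence no Endpoints.
matches "app.kubernetes.io/instance=rabbitmq-prod" "$pod_labels" \
  || echo "wrong instance value: no match"
```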

Since I am using Helm and the Bitnami common chart, the selector is injected through the template in svc.yaml:

selector: {{ include "common.labels.matchLabels" . | nindent 4 }}

You then have to find charts/common/templates/_labels.tpl and remove the entry there:

{{/*
Labels to use on deploy.spec.selector.matchLabels and svc.spec.selector
*/}}
{{- define "common.labels.matchLabels" -}}
app.kubernetes.io/name: {{ include "common.names.name" . }}
#app.kubernetes.io/instance: {{ .Release.Name }}
{{- end -}}

You can see I commented one entry out. Re-running helm reinstalls the chart with the corrected Service selector.
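Before reinstalling, you can verify which selector the chart will actually render without touching the cluster (release name and paths assumed from the commands above; requires yq v4):

```shell
# Render the chart locally and print the Service's selector block.
helm template rabbitmq-dev ./ --values ./values.yaml \
  | yq eval 'select(.kind == "Service") | .spec.selector' -
```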

I have a script (I always recommend automating – you get something repeatable and a record!) that I can share, showing how I install the rabbitmq chart; be sure the ingressClassName matches the one you see when you describe the nginx pod.

function deploy_rabbitmq() {
  cd $DIRECTORY
  rabbitmq_tags="environment\=$APPLICATION_NAMESPACE\,owner\=devops\,service\=rabbitmq-$APPLICATION_NAMESPACE\,orchestration\=helm"
  local env=$1
  create_rabbit_secret $env

  cd $DIRECTORY
  local domain=$(yq eval '.domain' ../environment.yaml)
  cd ../../../devops-rabbitmq/helm3/bitnami
  helm3 upgrade \
    rabbitmq-$env \
    ./ \
    --install \
    --values ./values.yaml  \
    --set controller.scope.enabled=true \
    --set controller.scope.namespace=$env \
    --set serviceAccount.name=$env-rabbitmq-ha \
    --set nodeSelector.eks_namespace=$env \
    --set ingress.ingressClassName=$env-nginx \
    --set ingress.hostname="rabbitmq-$env-$RANDOM_SUFFIX.$domain" \
    --set service.annotations.'service\.beta\.kubernetes\.io/aws-load-balancer-additional-resource-tags'=$rabbitmq_tags \
    --set existingSecret="" \
    --namespace $env \
    --debug
}

We are also facing the same issue with ingress-nginx 1.1.1 on a bare-metal Kubernetes cluster. Is there any reply/support from the devs here? I can see quite a few folks are facing these issues.

We’re on Release: 0.25.1 and just started seeing this issue on one of our clusters. Other clusters seem unaffected. Any idea what the easiest fix would be?