minikube: Pod unable to reach itself through a service (unless --cni=true is set)

Minikube version (use minikube version): v0.16.0 with Kubernetes v1.6.4 (but I tried v0.17.1 and v0.19.1 too)

Environment:

  • OS (e.g. from /etc/os-release):
  • VM Driver : Virtualbox
  • ISO version : minikube-v1.0.6.iso
  • Install tools: curl -Lo minikube https://storage.googleapis.com/minikube/releases/v0.16.0/minikube-darwin-amd64 && chmod +x minikube && sudo mv minikube /usr/local/bin/
  • Others:

What happened: If a pod has a service that points to it, the pod cannot reach itself through the service IP. Other pods can reach the service, and the pod itself can reach other services. This means that all components (especially clustered and distributed systems) which expect to talk to themselves for leader election fail to start up properly.
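
To illustrate the failure mode, here is a minimal, self-contained sketch using hypothetical names (hairpin-test, port 8080); it is not taken from the original report:

# Run a pod with a trivial HTTP responder and expose it through a service of the same name
kubectl run hairpin-test --image=busybox --restart=Never --port=8080 -- \
  sh -c 'while true; do echo -e "HTTP/1.1 200 OK\r\n\r\nok" | nc -l -p 8080; done'
kubectl expose pod hairpin-test --port=8080
# From inside the same pod, going through the service hangs on affected minikube setups
kubectl exec hairpin-test -- wget -qO- -T 5 http://hairpin-test:8080
# From any other pod, the same request succeeds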

What you expected to happen: I expect the pod to be able to reach itself.

How to reproduce it (as minimally and precisely as possible): It happens with all our services and pods but I can reproduce it with kube-system pods too.

Get the service IP: kubectl describe svc kube-dns --namespace kube-system | grep IP: (I get 10.0.0.10).
Get the endpoint IP: kubectl describe svc kube-dns --namespace kube-system | grep Endpoints (I get 172.17.0.3).

Exec into the pod: kubectl --namespace kube-system exec -it kube-dns-v20-54536 sh

Run the following. Using the service IP hangs:

/ # nslookup kubernetes-dashboard.kube-system.svc.cluster.local 10.0.0.10
Server:    10.0.0.10
^C

Using the endpoint IP works:

/ # nslookup kubernetes-dashboard.kube-system.svc.cluster.local 172.17.0.3
Server:    172.17.0.3
Address 1: 172.17.0.3 kube-dns-v20-54536

Name:      kubernetes-dashboard.kube-system.svc.cluster.local
Address 1: 10.0.0.212 kubernetes-dashboard.kube-system.svc.cluster.local

Accessing a different service IP works. Using the kubernetes-dashboard IP from the last command:

/ # telnet 10.0.0.212 80
get
HTTP/1.1 400 Bad Request
Content-Type: text/plain
Connection: close

400 Bad Request
Connection closed by foreign host

Anything else we need to know: minikube v0.17.1 works with Kubernetes 1.5.3. I tried the following and it worked, so I suspect the problem has something to do with upgrading minikube to v0.17.1 and Kubernetes to v1.6.4:

curl -Lo minikube https://storage.googleapis.com/minikube/releases/v0.17.1/minikube-darwin-amd64 && chmod +x minikube && sudo mv minikube /usr/local/bin/
rm -rf ~/.minikube
minikube start --kubernetes-version 1.5.3 --cpus 4 --memory 6096 --v=8 --logtostderr

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Reactions: 10
  • Comments: 58 (17 by maintainers)

Most upvoted comments

For me, this helped to fix it: https://github.com/kubernetes/kubernetes/issues/20475#issuecomment-190995739

So you can do:

minikube ssh
sudo ip link set docker0 promisc on
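
If it helps, one way to confirm the setting took effect on the bridge (a sketch, not taken from the thread):

minikube ssh "ip link show docker0"
# the flags line should now include PROMISC, e.g. <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP>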

Maybe this fix can be merged directly into minikube so people won't need to do custom things?

This is not solved, right? Saw it on minikube version: v0.28.2

Encountered this bug on

minikube version: v1.3.1
commit: ca60a424ce69a4d79f502650199ca2b52f29e631

Still seeing this in minikube v0.23.0

@sgandon, yes, it is a workaround but not a fix. As a workaround I used hostAliases in the deployment with the same name as the service (I use minikube in the development environment, so for my case that can be enough). This workaround also works, but it is not a fix for the problem.
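
For illustration, one way to apply that kind of hostAliases workaround; the deployment name my-app and service name my-service are hypothetical, not from this thread:

kubectl patch deployment my-app --type merge -p \
  '{"spec":{"template":{"spec":{"hostAliases":[{"ip":"127.0.0.1","hostnames":["my-service"]}]}}}}'
# the pod now resolves my-service to 127.0.0.1 and talks to itself over localhost,
# bypassing the service VIP that triggers the hairpin problem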

By the way, the proposed workaround works:

minikube ssh
sudo ip link set docker0 promisc on

This issue is still there in version 1.0.0. To me this is a major issue. Shall I create another issue, or can someone reopen this one?

… and v0.24.1

It seems that this was being tracked by a second subsequent bug (#2460) which I’ve de-duped into this one.

I agree that the behavior isn’t what users would expect, and I would be more than happy to review any PRs which address this. Help wanted!

@arrawatia Can you reopen this? Thanks.

can this be reopened please?

The commands below are NOT working for me. I’m still getting ‘0’ from cat /sys/devices/virtual/net/docker0/brif/veth*/hairpin_mode

minikube ssh
sudo ip link set docker0 promisc on

Is there any other solution, or a reason why it works for others but not for me? Thanks.

minikube: 0.24.1, Kubernetes: 1.8, Windows 10 Pro x64, VirtualBox 5.1.30

… and v0.24.0…

Fixing this by default appears to incur a ~30% performance penalty for startup, which makes me quite wary of imposing it on the users who do not care about CNI.

At a minimum, though, we should document that it is now possible to pass --cni=true to make this work.
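
For reference, the invocation that comment refers to would look like this (exact flag semantics depend on the minikube version):

minikube start --cni=true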

@tstromberg thanks for resurrecting this.

The problem is still there. It is surely a major issue.

Same. Auto-stale bots are a cancer, IMO

@nyetwurk: You can’t reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen /remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

/reopen /remove-lifecycle rotten

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten /remove-lifecycle stale

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

Just updating the status for minikube 0.22.0 - the issue is still present

I’m going to open this up again, since we ended up reverting the kubenet change.

@arrawatia No problem, but maybe you should leave this issue open so we’ll have a longer-term fix merged into minikube. This is a bug, and my fix is just a workaround.