minikube: Pod unable to reach itself through a service (unless --cni=true is set)

Minikube version (use minikube version): v0.16.0 with Kubernetes v1.6.4 (but I tried v0.17.1 and v0.19.1 too)

Environment:

  • OS (e.g. from /etc/os-release):
  • VM Driver : Virtualbox
  • ISO version : minikube-v1.0.6.iso
  • Install tools: curl -Lo minikube https://storage.googleapis.com/minikube/releases/v0.16.0/minikube-darwin-amd64 && chmod +x minikube && sudo mv minikube /usr/local/bin/
  • Others:

What happened: If a pod has a service that points to it, the pod cannot reach itself through the service IP. Other pods can reach the service, and the pod itself can reach other services. This means that all components (especially clustered and distributed systems) which expect to talk to themselves for leader election fail to start up properly.
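
To illustrate the failure mode, here is a minimal, self-contained sketch using hypothetical names (hairpin-test, port 8080); it is not taken from the original report:

# Run a pod with a trivial HTTP responder and expose it through a service of the same name
kubectl run hairpin-test --image=busybox --restart=Never --port=8080 -- \
  sh -c 'while true; do echo -e "HTTP/1.1 200 OK\r\n\r\nok" | nc -l -p 8080; done'
kubectl expose pod hairpin-test --port=8080
# From inside the same pod, going through the service hangs on affected minikube setups
kubectl exec hairpin-test -- wget -qO- -T 5 http://hairpin-test:8080
# From any other pod, the same request succeeds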

What you expected to happen: I expect the pod to be able to reach itself.

How to reproduce it (as minimally and precisely as possible): It happens with all our services and pods but I can reproduce it with kube-system pods too.

Get the service IP: kubectl describe svc kube-dns --namespace kube-system | grep IP: (I get 10.0.0.10).
Get the endpoint IP: kubectl describe svc kube-dns --namespace kube-system | grep Endpoints (I get 172.17.0.3).

Exec into the pod: kubectl --namespace kube-system exec -it kube-dns-v20-54536 sh

Run the following. Using the service IP hangs:

/ # nslookup kubernetes-dashboard.kube-system.svc.cluster.local 10.0.0.10
Server:    10.0.0.10
^C

Using the endpoint IP works:

/ # nslookup kubernetes-dashboard.kube-system.svc.cluster.local 172.17.0.3
Server:    172.17.0.3
Address 1: 172.17.0.3 kube-dns-v20-54536

Name:      kubernetes-dashboard.kube-system.svc.cluster.local
Address 1: 10.0.0.212 kubernetes-dashboard.kube-system.svc.cluster.local

Accessing a different service IP works. Using the kubernetes-dashboard IP from the last command:

/ # telnet 10.0.0.212 80
get
HTTP/1.1 400 Bad Request
Content-Type: text/plain
Connection: close

400 Bad Request
Connection closed by foreign host

Anything else we need to know: minikube v0.17.1 works with Kubernetes 1.5.3. I tried the following and it worked, so I suspect the problem has something to do with upgrading minikube to v0.17.1 and Kubernetes to v1.6.4:

curl -Lo minikube https://storage.googleapis.com/minikube/releases/v0.17.1/minikube-darwin-amd64 && chmod +x minikube && sudo mv minikube /usr/local/bin/
rm -rf ~/.minikube
minikube start --kubernetes-version 1.5.3 --cpus 4 --memory 6096 --v=8 --logtostderr

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Reactions: 10
  • Comments: 58 (17 by maintainers)

Most upvoted comments

For me, this helped to fix it: https://github.com/kubernetes/kubernetes/issues/20475#issuecomment-190995739

So you can do:

minikube ssh
sudo ip link set docker0 promisc on
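
If it helps, one way to confirm the setting took effect on the bridge (a sketch, not taken from the thread):

minikube ssh "ip link show docker0"
# the flags line should now include PROMISC, e.g. <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP>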

Maybe this fix can be merged directly into minikube so people won't need to do custom things?

This is not solved, right? Saw it on minikube version: v0.28.2

Encountered this bug on

minikube version: v1.3.1
commit: ca60a424ce69a4d79f502650199ca2b52f29e631

Still seeing this in minikube v0.23.0

@sgandon, yes, it is a workaround but not a fix. As a workaround I used hostAliases in the deployment with the same name as the service (I use minikube in the development environment, so for my case that can be enough). This workaround also works, but it is not a fix for the problem.
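
For illustration, one way to apply that kind of hostAliases workaround; the deployment name my-app and service name my-service are hypothetical, not from this thread:

kubectl patch deployment my-app --type merge -p \
  '{"spec":{"template":{"spec":{"hostAliases":[{"ip":"127.0.0.1","hostnames":["my-service"]}]}}}}'
# the pod now resolves my-service to 127.0.0.1 and talks to itself over localhost,
# bypassing the service VIP that triggers the hairpin problem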

By the way, the proposed workaround works:

minikube ssh
sudo ip link set docker0 promisc on

This issue is still there in version 1.0.0. To me this is a major issue. Shall I create another issue, or can someone reopen this one?

… and v0.24.1

It seems that this was being tracked by a second subsequent bug (#2460) which I’ve de-duped into this one.

I agree that the behavior isn’t what users would expect, and I would be more than happy to review any PRs which address this. Help wanted!

@arrawatia Can you reopen this? Thanks.

can this be reopened please?

The commands below are NOT working for me. I’m still getting ‘0’ from cat /sys/devices/virtual/net/docker0/brif/veth*/hairpin_mode

minikube ssh
sudo ip link set docker0 promisc on

Is there any other solution, or a reason why it works for others but not for me? Thanks.

minikube: 0.24.1, Kubernetes: 1.8, Windows 10 Pro x64, VirtualBox 5.1.30

… and v0.24.0…

Fixing this by default appears to incur a ~30% performance penalty for startup, which makes me quite wary of imposing it on the users who do not care about CNI.

At a minimum, though, we should document that it is now possible to pass --cni=true to make this work.
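
For reference, the invocation that comment refers to would look like this (exact flag semantics depend on the minikube version):

minikube start --cni=true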

@tstromberg thanks for resurrecting this.

The problem is still there. It is surely a major issue.

Same. Auto-stale bots are a cancer, IMO

@nyetwurk: You can’t reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen /remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

/reopen /remove-lifecycle rotten

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten /remove-lifecycle stale

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

Just updating the status for minikube 0.22.0 - the issue is still present

I’m going to open this up again, since we ended up reverting the kubenet change.

@arrawatia No problem, but maybe you should leave this issue open so we’ll have a longer-term fix merged into minikube. This is a bug, and my fix is just a workaround.