kubernetes: SkyDNS does not work when using it with Kubernetes Docker multinode setup
I’m running Kubernetes Docker containers on CentOS Linux release 7.2.1511, kernel 3.10.0-327.10.1.el7.x86_64. I’m using the following script to run the kube-master (link) and the following script to run SkyDNS (link). I’m updating the places with {{ pillar }} to the actual values, as suggested by the tutorial: http://kubernetes.io/docs/getting-started-guides/docker-multinode/deployDNS/
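For reference, a sketch of that substitution step. The pillar key names (`dns_domain`, `dns_server`) and the template file names (`skydns-rc.yaml.in`, `skydns-svc.yaml.in`) are assumptions based on the skydns add-on templates of that era; `cluster.local` and `10.0.0.10` are the values that appear later in this issue.

```sh
# Assumed template names and pillar keys -- adjust to match the actual files.
export DNS_DOMAIN=cluster.local     # value used by this cluster (see svc output below)
export DNS_SERVER_IP=10.0.0.10      # clusterIP of the kube-dns service
sed "s/{{ pillar\['dns_domain'\] }}/${DNS_DOMAIN}/g" skydns-rc.yaml.in > skydns-rc.yaml
sed "s/{{ pillar\['dns_server'\] }}/${DNS_SERVER_IP}/g" skydns-svc.yaml.in > skydns-svc.yaml
kubectl create -f skydns-rc.yaml -f skydns-svc.yaml
```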
I’m using Kubernetes 1.2.0-alpha.7:

```
Client Version: version.Info{Major:"1", Minor:"2+", GitVersion:"v1.2.0-beta.0", GitCommit:"50f7568d7f9b001c90ed75e79d41478afcd64a34", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"2+", GitVersion:"v1.2.0-alpha.7", GitCommit:"c0fd002fbb25d6a6cd8427d28b8ec78379c354a0", GitTreeState:"clean"}
```
```
[local@kube-master-1458129646 ~]$ kubectl cluster-info
Kubernetes master is running at http://10.57.50.181:8080
KubeDNS is running at http://10.57.50.181:8080/api/v1/proxy/namespaces/kube-system/services/kube-dns
```
When accessing http://10.57.50.181:8080/api/v1/proxy/namespaces/kube-system/services/kube-dns I get the following:
{ "kind": "Status", "apiVersion": "v1", "metadata": {}, "status": "Failure", "message": "no endpoints available for service \"kube-dns\"", "reason": "ServiceUnavailable", "code": 503 }
```
[local@kube-master-1458129646 ~]$ kubectl describe svc
Name:              kubernetes
Namespace:         default
Labels:            component=apiserver,provider=kubernetes
Selector:          <none>
Type:              ClusterIP
IP:                10.0.0.1
Port:              https  443/TCP
Endpoints:         10.57.50.181:6443
Session Affinity:  None
No events.
```
```
[local@kube-master-1458129646 ~]$ kubectl describe svc --namespace=kube-system
Name:              kube-dns
Namespace:         kube-system
Labels:            k8s-app=kube-dns,kubernetes.io/cluster-service=true,kubernetes.io/name=KubeDNS
Selector:          k8s-app=kube-dns
Type:              ClusterIP
IP:                10.0.0.10
Port:              dns  53/UDP
Endpoints:
Port:              dns-tcp  53/TCP
Endpoints:
Session Affinity:  None
No events.
```
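The empty `Endpoints` fields are the heart of the problem: no ready kube-dns pod is backing the service, which is exactly why the proxy URL above returns a 503. A quick way to confirm, using standard kubectl commands (the label selector comes from the service description above):

```sh
# Does the endpoints object list any addresses?
kubectl get endpoints kube-dns --namespace=kube-system
# Are the DNS pods running, and are they passing readiness?
kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
kubectl describe pods --namespace=kube-system -l k8s-app=kube-dns
```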
Here are the logs:
kube2sky logs
```
I0317 10:58:02.475909       1 kube2sky.go:462] Etcd server found: http://127.0.0.1:4001
I0317 10:58:03.478586       1 kube2sky.go:529] Using https://10.0.0.1:443 for kubernetes master
I0317 10:58:03.478612       1 kube2sky.go:530] Using kubernetes API <nil>
I0317 10:58:03.479278       1 kube2sky.go:598] Waiting for service: default/kubernetes
I0317 10:58:04.480549       1 kube2sky.go:604] Ignoring error while waiting for service default/kubernetes: Get https://10.0.0.1:443/api/v1/namespaces/default/services/kubernetes: dial tcp 10.0.0.1:443: getsockopt: no route to host. Sleeping 1s before retrying.
I0317 10:58:06.484404       1 kube2sky.go:604] Ignoring error while waiting for service default/kubernetes: Get https://10.0.0.1:443/api/v1/namespaces/default/services/kubernetes: dial tcp 10.0.0.1:443: getsockopt: no route to host. Sleeping 1s before retrying.
I0317 10:58:08.488550       1 kube2sky.go:604] Ignoring error while waiting for service default/kubernetes: Get https://10.0.0.1:443/api/v1/namespaces/default/services/kubernetes: dial tcp 10.0.0.1:443: getsockopt: no route to host. Sleeping 1s before retrying.
```
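kube2sky cannot reach the kubernetes service VIP (10.0.0.1:443) at all. That VIP is virtual: it only works once kube-proxy has programmed forwarding rules for it, so "no route to host" usually points at missing proxy rules or a host firewall rejecting the traffic. A few checks from the master host (assuming root access; the grep pattern is just illustrative):

```sh
# Is the service VIP reachable from the host itself?
curl -k https://10.0.0.1:443/version
# Did kube-proxy install rules for the service network?
sudo iptables-save | grep 10.0.0.1
# firewalld on CentOS 7 frequently causes exactly this symptom.
sudo systemctl status firewalld
```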
skydns logs
```
2016/03/17 09:39:55 skydns: falling back to default configuration, could not read from etcd: 100: Key not found (/skydns) [3]
2016/03/17 09:39:55 skydns: ready for queries on cluster.local. for tcp://0.0.0.0:53 [rcache 0]
2016/03/17 09:39:55 skydns: ready for queries on cluster.local. for udp://0.0.0.0:53 [rcache 0]
2016/03/17 09:40:17 skydns: failure to forward request "read udp 10.56.190.1:53: no route to host"
2016/03/17 09:40:20 skydns: failure to forward request "read udp 10.56.190.1:53: i/o timeout"
2016/03/17 09:40:27 skydns: failure to forward request "read udp 10.56.190.1:53: i/o timeout"
2016/03/17 09:40:31 skydns: failure to forward request "read udp 10.56.190.1:53: i/o timeout"
2016/03/17 09:40:38 skydns: failure to forward request "read udp 10.56.190.1:53: i/o timeout"
2016/03/17 09:40:42 skydns: failure to forward request "read udp 10.56.190.1:53: i/o timeout"
2016/03/17 09:40:49 skydns: failure to forward request "read udp 10.56.190.1:53: i/o timeout"
```
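SkyDNS itself starts fine, but it cannot reach its upstream resolver: 10.56.190.1 is presumably the nameserver inherited from the node’s /etc/resolv.conf, and every forwarded query times out or is rejected. A quick sanity check from the node itself (dig is in the bind-utils package on CentOS 7):

```sh
# Can the node reach the upstream resolver that SkyDNS forwards to?
dig @10.56.190.1 kubernetes.io +time=2 +tries=1
```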
etcd logs
```
2016-03-17 09:39:52.074261 I | etcdmain: etcd Version: 2.2.1
2016-03-17 09:39:52.074324 I | etcdmain: Git SHA: 75f8282
2016-03-17 09:39:52.074335 I | etcdmain: Go Version: go1.5.1
2016-03-17 09:39:52.074343 I | etcdmain: Go OS/Arch: linux/amd64
2016-03-17 09:39:52.074384 I | etcdmain: setting maximum number of CPUs to 8, total number of available CPUs is 8
2016-03-17 09:39:52.074974 I | etcdmain: listening for peers on http://localhost:2380
2016-03-17 09:39:52.075239 I | etcdmain: listening for peers on http://localhost:7001
2016-03-17 09:39:52.075304 I | etcdmain: listening for client requests on http://127.0.0.1:2379
2016-03-17 09:39:52.075406 I | etcdmain: listening for client requests on http://127.0.0.1:4001
2016-03-17 09:39:52.075813 I | etcdserver: name = default
2016-03-17 09:39:52.075829 I | etcdserver: data dir = /var/etcd/data
2016-03-17 09:39:52.075837 I | etcdserver: member dir = /var/etcd/data/member
2016-03-17 09:39:52.075844 I | etcdserver: heartbeat = 100ms
2016-03-17 09:39:52.075851 I | etcdserver: election = 1000ms
2016-03-17 09:39:52.075857 I | etcdserver: snapshot count = 10000
2016-03-17 09:39:52.075874 I | etcdserver: advertise client URLs = http://127.0.0.1:2379,http://127.0.0.1:4001
2016-03-17 09:39:52.075887 I | etcdserver: initial advertise peer URLs = http://localhost:2380,http://localhost:7001
2016-03-17 09:39:52.075906 I | etcdserver: initial cluster = default=http://localhost:2380,default=http://localhost:7001
2016-03-17 09:39:52.077919 I | etcdserver: starting member 6a5871dbdd12c17c in cluster f68652439e3f8f2a
2016-03-17 09:39:52.077997 I | raft: 6a5871dbdd12c17c became follower at term 0
2016-03-17 09:39:52.078027 I | raft: newRaft 6a5871dbdd12c17c [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
2016-03-17 09:39:52.078036 I | raft: 6a5871dbdd12c17c became follower at term 1
2016-03-17 09:39:52.078279 I | etcdserver: starting server... [version: 2.2.1, cluster version: to_be_decided]
2016-03-17 09:39:52.079411 N | etcdserver: added local member 6a5871dbdd12c17c [http://localhost:2380 http://localhost:7001] to cluster f68652439e3f8f2a
2016-03-17 09:39:52.878477 I | raft: 6a5871dbdd12c17c is starting a new election at term 1
2016-03-17 09:39:52.878553 I | raft: 6a5871dbdd12c17c became candidate at term 2
2016-03-17 09:39:52.878574 I | raft: 6a5871dbdd12c17c received vote from 6a5871dbdd12c17c at term 2
2016-03-17 09:39:52.878600 I | raft: 6a5871dbdd12c17c became leader at term 2
2016-03-17 09:39:52.878666 I | raft: raft.node: 6a5871dbdd12c17c elected leader 6a5871dbdd12c17c at term 2
2016-03-17 09:39:52.879294 I | etcdserver: setting up the initial cluster version to 2.2
2016-03-17 09:39:52.879432 I | etcdserver: published {Name:default ClientURLs:[http://127.0.0.1:2379 http://127.0.0.1:4001]} to cluster f68652439e3f8f2a
2016-03-17 09:39:52.880946 N | etcdserver: set the initial cluster version to 2.2
2016-03-17 11:03:10.579853 I | etcdserver: start to snapshot (applied: 10001, lastsnap: 0)
2016-03-17 11:03:10.582194 I | etcdserver: saved snapshot at index 10001
2016-03-17 11:03:10.582470 I | etcdserver: compacted raft log at 5001
```
I have followed the tutorial (docker multinode setup) exactly, but I cannot get SkyDNS to work.
What am I missing? Or is it a bug in the scripts?
Just to be clear about things I know:
Let me dive into my theory.
When people are using the moral equivalent of the turnup and turndown scripts, they create the containers and then destroy them, but don’t clear /var/lib/kubelet, where all emptyDir and secret mounts live. The setup-files.sh script writes to an emptyDir that is shared by the apiserver (the thing that verifies tokens) and the controller-manager (the thing that mints and signs tokens). When the kubelet starts a pod from an on-disk file, it gets the same pod ID every time, so any emptyDirs that were mounted will still contain the files from before (because /var/lib/kubelet was never wiped).
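A sketch of that failure mode. The script names follow the "turnup and turndown" wording above and are hypothetical; /var/lib/kubelet is the default kubelet state directory the paragraph mentions:

```sh
# Hypothetical repro of the stale-credential theory:
./turndown.sh                   # destroys the containers...
ls /var/lib/kubelet             # ...but the emptyDir/secret volumes survive
./turnup.sh                     # new apiserver/controller-manager reuse stale files

# Wiping the state dir between runs avoids the mismatch
# (drastic -- only sensible on a throwaway test cluster):
./turndown.sh
sudo rm -rf /var/lib/kubelet/*
./turnup.sh
```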
My guess is:
An easy way to test this theory would be to have setup-files.sh exit early if ca.crt and the keys have already been generated. I won’t have time over the next few days to work on this, so if someone wants to try it out, I’d gladly help out in any way I can.
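A minimal sketch of that early-exit guard, assuming setup-files.sh writes its output to a directory like /data and uses file names like ca.crt and server.key (the real script’s paths and names may differ):

```sh
# Hypothetical guard at the top of setup-files.sh: if a CA cert and key
# already exist from a previous run, keep them instead of minting a new,
# mismatched set.
CERT_DIR=${CERT_DIR:-/data}
if [ -f "${CERT_DIR}/ca.crt" ] && [ -f "${CERT_DIR}/server.key" ]; then
  echo "Certs already exist in ${CERT_DIR}; skipping regeneration."
  exit 0
fi
```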
BTW: I’m tracking what I believe is the underlying issue at #23197.