kubeadm: weave-net CrashLoopBackOff for the second node

From @avkonst on October 5, 2016 14:12

Is this a request for help?

I think it is an issue with either the software or the documentation, but I am not quite sure. I started with a question on Stack Overflow: http://stackoverflow.com/questions/39872332/how-to-fix-weave-net-crashloopbackoff-for-the-second-node

Is this a BUG REPORT or FEATURE REQUEST? (choose one):

I think it is a bug, or a request to improve the documentation.

Kubernetes version (use kubectl version): 1.4.0

Environment:

  • Cloud provider or hardware configuration: Vagrant
  • OS (e.g. from /etc/os-release): Ubuntu 16.04
  • Kernel (e.g. uname -a):
  • Install tools: kubeadm init/join
  • Others:

What happened:

I have got 2 VMs nodes. Both see each other either by hostname (through /etc/hosts) or by ip address. One has been provisioned with kubeadm as a master. Another as a worker node. Following the instructions (http://kubernetes.io/docs/getting-started-guides/kubeadm/) I have added weave-net. The list of pods looks like the following:

vagrant@vm-master:~$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                    READY     STATUS             RESTARTS   AGE
kube-system   etcd-vm-master                          1/1       Running            0          3m
kube-system   kube-apiserver-vm-master                1/1       Running            0          5m
kube-system   kube-controller-manager-vm-master       1/1       Running            0          4m
kube-system   kube-discovery-982812725-x2j8y          1/1       Running            0          4m
kube-system   kube-dns-2247936740-5pu0l               3/3       Running            0          4m
kube-system   kube-proxy-amd64-ail86                  1/1       Running            0          4m
kube-system   kube-proxy-amd64-oxxnc                  1/1       Running            0          2m
kube-system   kube-scheduler-vm-master                1/1       Running            0          4m
kube-system   kubernetes-dashboard-1655269645-0swts   1/1       Running            0          4m
kube-system   weave-net-7euqt                         2/2       Running            0          4m
kube-system   weave-net-baao6                         1/2       CrashLoopBackOff   2          2m

CrashLoopBackOff appears for each worker node connected. I have spent several hours playing with network interfaces, but the network seems fine. I found a similar question on Stack Overflow, where the answer advised looking into the logs, but there was no follow-up. So, here are the logs:

vagrant@vm-master:~$ kubectl logs weave-net-baao6 -c weave --namespace=kube-system
2016-10-05 10:48:01.350290 I | error contacting APIServer: Get https://100.64.0.1:443/api/v1/nodes: dial tcp 100.64.0.1:443: getsockopt: connection refused; trying with blank env vars
2016-10-05 10:48:01.351122 I | error contacting APIServer: Get http://localhost:8080/api: dial tcp [::1]:8080: getsockopt: connection refused
Failed to get peers
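
Both endpoints the weave container tries refuse the connection: the in-cluster service VIP (100.64.0.1:443) and the localhost fallback on 8080. A quick way to check whether the worker can actually reach the API server through that VIP, as a sketch (100.64.0.1 is taken from the log above, and vm-worker stands in for the worker's hostname):

vagrant@vm-master:~$ kubectl get svc kubernetes              # confirm the service VIP weave is dialing
vagrant@vm-worker:~$ sudo iptables-save | grep 100.64.0.1    # kube-proxy should have NAT rules for that VIP
vagrant@vm-worker:~$ curl -k https://100.64.0.1/version      # any HTTP reply (even 401) means the VIP is reachable; "connection refused" means it is not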

What you expected to happen:

I would expect the weave-net pod to be in the Running state.

How to reproduce it (as minimally and precisely as possible):

I have not done anything special, just followed the Getting Started documentation. If it is essential, I can share the Vagrant project I used to provision everything. Please let me know if you need it.
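
In case it helps, the provisioning boils down to roughly these steps from that guide (sketched from memory; the weave-net manifest URL is the one the guide gave at the time, and <token>, <master-ip> and the worker hostname are placeholders):

vagrant@vm-master:~$ sudo kubeadm init                             # prints the token used by kubeadm join
vagrant@vm-master:~$ kubectl apply -f https://git.io/weave-kube    # pod network add-on, per the guide
vagrant@vm-node:~$ sudo kubeadm join --token <token> <master-ip>   # run on each worker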

Copied from original issue: kubernetes/kubernetes#34101

About this issue

  • State: closed
  • Created 8 years ago
  • Comments: 42 (39 by maintainers)

Most upvoted comments

Is that run from a node? Then you need to use the /etc/kubernetes/kubelet.conf kubeconfig file.
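
For example, as a sketch (this is the kubeconfig kubeadm writes for the kubelet on each node, so it needs root to read):

$ sudo kubectl --kubeconfig=/etc/kubernetes/kubelet.conf get nodes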

From @petergardfjall on October 19, 2016 5:43

@miry I did restart the pod, with no luck (by the way, when you say restart a pod, I assume you mean deleting the pod and having the ReplicaSet replace it, right?).
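
(Concretely, something along these lines, where <pod-name> stands for whichever kube-dns pod is currently listed:)

$ kubectl -n kube-system delete pod <pod-name>    # the ReplicaSet behind kube-dns spins up a replacement
$ kubectl -n kube-system get pods -o wide         # watch where the new pod lands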

It is stuck in ContainerCreating, although it appears to have gotten slightly further, being able to at least create containers. However:

# kubectl logs -n kube-system kube-dns-2247936740-f8nxa kube-dns
Error from server: Get https://kube-slave2:10250/containerLogs/kube-system/kube-dns-2247936740-f8nxa/kube-dns: dial tcp: lookup kube-slave2 on 10.0.2.3:53: no such host
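
That lookup failure means the API server cannot resolve the node name kube-slave2 when it proxies the log request to that node's kubelet (10.0.2.3 is the resolver it asked, per the error). A quick check from the master, as a sketch:

# getent hosts kube-slave2    # empty output means the name does not resolve here
# ping -c1 kube-slave2        # basic reachability once the name resolves

If the name does not resolve, an /etc/hosts entry for the node on the master is the usual quick fix, though with the API server running in a container that may not be enough on its own.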

But I guess I’ll follow @errordeveloper’s advice and take it to the Slack channel.