weave: weave-net pod fails to get peers after Kubernetes v1.12.0 upgrade

What you expected to happen?

Weave automatically fills the KUBE_PEERS env var and the connection between peers is established.

What happened?

After upgrading a working v1.11.3 Kubernetes cluster to v1.12.0, the weave container in one of the two weave pods fails to obtain the peer list and enters the CrashLoopBackOff state, logging only "Failed to get peers". The weave-npc container fails to contact 10.96.0.1 (the kube-apiserver service IP) and every list request it issues fails with a timeout. Manually entering the peers in the DaemonSet allows the connection between the weave pods to be established successfully.
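
For reference, a rough sketch of the workaround (the node IPs are my two peers; KUBE_PEERS is the env var mentioned above, set on the weave container of the DaemonSet):

$ kubectl -n kube-system set env daemonset/weave-net -c weave KUBE_PEERS="192.168.20.6 192.168.20.7"
# once the pods are recreated, weave uses the supplied peer list instead of querying the API server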

How to reproduce it?

Upgrade a kubeadm-managed Kubernetes cluster from v1.11.3 to v1.12.0.
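
Roughly the standard kubeadm flow was used (a sketch; kubelet and CNI binaries are handled separately on Container Linux):

$ kubeadm upgrade plan
$ kubeadm upgrade apply v1.12.0
# then update and restart the kubelet on each node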

Anything else we need to know?

The cluster is running on bare metal with one control-plane node and one worker node. I also upgraded the CNI plugins from 0.6.0 to 0.7.1. I have specified the cluster CIDR in the kube-proxy DaemonSet ( - --cluster-cidr=10.32.0.0/12 ).
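
For reference, this is how the flag shows up in the kube-proxy DaemonSet spec (output approximate):

$ kubectl -n kube-system get ds kube-proxy -o yaml | grep cluster-cidr
        - --cluster-cidr=10.32.0.0/12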

Versions:

$ weave version
weave script 2.4.1
weave 2.4.1

$ docker version
Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:16:31 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:16:31 2018
  OS/Arch:          linux/amd64
  Experimental:     false

$ uname -a
Linux server-01 4.14.67-coreos #1 SMP Mon Sep 10 23:14:26 UTC 2018 x86_64 ...

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.0", GitCommit:"0ed33881dc4355495f623c6f22e7dd0b7632b7c0", GitTreeState:"clean", BuildDate:"2018-09-27T17:05:32Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.0", GitCommit:"0ed33881dc4355495f623c6f22e7dd0b7632b7c0", GitTreeState:"clean", BuildDate:"2018-09-27T16:55:41Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}

Logs:

# On the failing pod

$ kubectl -n kube-system logs -f weave-net-c7nfb weave
Failed to get peers

# On the running pod

$ kubectl -n kube-system logs -f weave-net-47gdt weave
DEBU: 2018/09/28 10:01:22.164268 [kube-peers] Checking peer "b2:de:01:96:13:b4" against list &{[{b2:de:01:96:13:b4 server-01} {2e:53:d1:b4:0c:dd server-02}]}
INFO: 2018/09/28 10:01:22.358106 Command line options: map[no-dns:true db-prefix:/weavedb/weave-net host-root:/host mtu:8900 nickname:server-01 no-masq-local:true conn-limit:100 docker-api: ipalloc-init:consensus=2 metrics-addr:0.0.0.0:6782 datapath:datapath expect-npc:true ipalloc-range:10.32.0.0/12 name:b2:de:01:96:13:b4 http-addr:127.0.0.1:6784 log-level:debug port:6783]
INFO: 2018/09/28 10:01:22.358229 weave  2.4.1
INFO: 2018/09/28 10:01:22.742314 Re-exposing 10.40.0.0/12 on bridge "weave"
INFO: 2018/09/28 10:01:22.810321 Bridge type is bridged_fastdp
INFO: 2018/09/28 10:01:22.810369 Communication between peers is unencrypted.
INFO: 2018/09/28 10:01:22.965628 Our name is b2:de:01:96:13:b4(server-01)
INFO: 2018/09/28 10:01:22.965673 Launch detected - using supplied peer list: [192.168.20.7 192.168.20.6]
INFO: 2018/09/28 10:01:22.965708 Using "no-masq-local" LocalRangeTracker
INFO: 2018/09/28 10:01:22.965716 Checking for pre-existing addresses on weave bridge
INFO: 2018/09/28 10:01:22.965918 weave bridge has address 10.40.0.0/12
INFO: 2018/09/28 10:01:22.998075 Found address 10.40.0.9/12 for ID _
INFO: 2018/09/28 10:01:22.998406 Found address 10.40.0.9/12 for ID _
INFO: 2018/09/28 10:01:23.000696 Found address 10.40.0.11/12 for ID _
INFO: 2018/09/28 10:01:23.001317 Found address 10.40.0.11/12 for ID _
INFO: 2018/09/28 10:01:23.004646 Found address 10.40.0.9/12 for ID _
INFO: 2018/09/28 10:01:23.020759 Found address 10.40.0.6/12 for ID _
INFO: 2018/09/28 10:01:23.020895 Found address 10.40.0.7/12 for ID _
INFO: 2018/09/28 10:01:23.021175 Found address 10.40.0.6/12 for ID _
INFO: 2018/09/28 10:01:23.021322 Found address 10.40.0.5/12 for ID _
INFO: 2018/09/28 10:01:23.021439 Found address 10.40.0.7/12 for ID _
INFO: 2018/09/28 10:01:23.021576 Found address 10.40.0.5/12 for ID _
INFO: 2018/09/28 10:01:23.021720 Found address 10.40.0.8/12 for ID _
INFO: 2018/09/28 10:01:23.021852 Found address 10.40.0.10/12 for ID _
INFO: 2018/09/28 10:01:23.021997 Found address 10.40.0.11/12 for ID _
INFO: 2018/09/28 10:01:23.022107 Found address 10.40.0.8/12 for ID _
INFO: 2018/09/28 10:01:23.022220 Found address 10.40.0.10/12 for ID _
INFO: 2018/09/28 10:01:23.022718 Found address 10.40.0.8/12 for ID _
INFO: 2018/09/28 10:01:23.022817 Found address 10.40.0.8/12 for ID _
INFO: 2018/09/28 10:01:23.023192 Found address 10.40.0.8/12 for ID _
INFO: 2018/09/28 10:01:23.023292 Found address 10.40.0.8/12 for ID _
INFO: 2018/09/28 10:01:23.023387 Found address 10.40.0.8/12 for ID _
INFO: 2018/09/28 10:01:23.023481 Found address 10.40.0.8/12 for ID _
INFO: 2018/09/28 10:01:23.023574 Found address 10.40.0.8/12 for ID _
INFO: 2018/09/28 10:01:23.023661 Found address 10.40.0.8/12 for ID _
INFO: 2018/09/28 10:01:23.023750 Found address 10.40.0.8/12 for ID _
INFO: 2018/09/28 10:01:23.023849 Found address 10.40.0.8/12 for ID _
INFO: 2018/09/28 10:01:23.023951 Found address 10.40.0.8/12 for ID _
INFO: 2018/09/28 10:01:23.024045 Found address 10.40.0.8/12 for ID _
INFO: 2018/09/28 10:01:23.024137 Found address 10.40.0.8/12 for ID _
INFO: 2018/09/28 10:01:23.024234 Found address 10.40.0.8/12 for ID _
INFO: 2018/09/28 10:01:23.024349 Found address 10.40.0.10/12 for ID _
INFO: 2018/09/28 10:01:23.024438 Found address 10.40.0.10/12 for ID _
INFO: 2018/09/28 10:01:23.024536 Found address 10.40.0.10/12 for ID _
INFO: 2018/09/28 10:01:23.024629 Found address 10.40.0.10/12 for ID _
INFO: 2018/09/28 10:01:23.024718 Found address 10.40.0.10/12 for ID _
INFO: 2018/09/28 10:01:23.026770 adding entry 10.40.0.0/13 to weaver-no-masq-local of 
INFO: 2018/09/28 10:01:23.026787 added entry 10.40.0.0/13 to weaver-no-masq-local of 
INFO: 2018/09/28 10:01:23.027991 [allocator b2:de:01:96:13:b4] Initialising with persisted data
DEBU: 2018/09/28 10:01:23.028082 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.0/12 for weave:expose
DEBU: 2018/09/28 10:01:23.028130 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.9/12 for ID _ having existing ID as 5704052a8c81ca8e844d6480d3175623e26af7c346a1bbfa561b311db60d0ca1
DEBU: 2018/09/28 10:01:23.028161 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.9/12 for ID _ having existing ID as 5704052a8c81ca8e844d6480d3175623e26af7c346a1bbfa561b311db60d0ca1
DEBU: 2018/09/28 10:01:23.028184 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.11/12 for ID _ having existing ID as 21c3fe888aa6fdd8ee571f727662fe88f7f65b46b0808cd880ed696c8bcb817e
DEBU: 2018/09/28 10:01:23.028204 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.11/12 for ID _ having existing ID as 21c3fe888aa6fdd8ee571f727662fe88f7f65b46b0808cd880ed696c8bcb817e
DEBU: 2018/09/28 10:01:23.028223 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.9/12 for ID _ having existing ID as 5704052a8c81ca8e844d6480d3175623e26af7c346a1bbfa561b311db60d0ca1
DEBU: 2018/09/28 10:01:23.028242 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.6/12 for ID _ having existing ID as 5fd80d9d62e2cdecd8070846bb45725f4476c36aed63ba61fbb34169bfecf733
DEBU: 2018/09/28 10:01:23.028261 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.7/12 for ID _ having existing ID as bbe64a5def512cdb2a8f22740d80c3680067efa471f4a8fd526cc89d3c0112f2
DEBU: 2018/09/28 10:01:23.028279 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.6/12 for ID _ having existing ID as 5fd80d9d62e2cdecd8070846bb45725f4476c36aed63ba61fbb34169bfecf733
DEBU: 2018/09/28 10:01:23.028299 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.5/12 for ID _ having existing ID as dfe8a081908b7317e57113ca4f4bd3764bd2cc0c45cd3f1dbb971be1bec460df
DEBU: 2018/09/28 10:01:23.028319 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.7/12 for ID _ having existing ID as bbe64a5def512cdb2a8f22740d80c3680067efa471f4a8fd526cc89d3c0112f2
DEBU: 2018/09/28 10:01:23.028337 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.5/12 for ID _ having existing ID as dfe8a081908b7317e57113ca4f4bd3764bd2cc0c45cd3f1dbb971be1bec460df
DEBU: 2018/09/28 10:01:23.028356 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.8/12 for ID _ having existing ID as 589631e93ea890e3dc0ed144e95427a98ba4f6d7a78de4ca23bd7621bf114cd2
DEBU: 2018/09/28 10:01:23.028384 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.10/12 for ID _ having existing ID as b92578e6ab2d5aeecc8f37a8ca386448ce5e237abecef56bc5144b77b179c44d
DEBU: 2018/09/28 10:01:23.028404 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.11/12 for ID _ having existing ID as 21c3fe888aa6fdd8ee571f727662fe88f7f65b46b0808cd880ed696c8bcb817e
DEBU: 2018/09/28 10:01:23.028422 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.8/12 for ID _ having existing ID as 589631e93ea890e3dc0ed144e95427a98ba4f6d7a78de4ca23bd7621bf114cd2
DEBU: 2018/09/28 10:01:23.028442 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.10/12 for ID _ having existing ID as b92578e6ab2d5aeecc8f37a8ca386448ce5e237abecef56bc5144b77b179c44d
DEBU: 2018/09/28 10:01:23.028462 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.8/12 for ID _ having existing ID as 589631e93ea890e3dc0ed144e95427a98ba4f6d7a78de4ca23bd7621bf114cd2
DEBU: 2018/09/28 10:01:23.028480 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.8/12 for ID _ having existing ID as 589631e93ea890e3dc0ed144e95427a98ba4f6d7a78de4ca23bd7621bf114cd2
DEBU: 2018/09/28 10:01:23.028498 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.8/12 for ID _ having existing ID as 589631e93ea890e3dc0ed144e95427a98ba4f6d7a78de4ca23bd7621bf114cd2
DEBU: 2018/09/28 10:01:23.028516 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.8/12 for ID _ having existing ID as 589631e93ea890e3dc0ed144e95427a98ba4f6d7a78de4ca23bd7621bf114cd2
DEBU: 2018/09/28 10:01:23.028535 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.8/12 for ID _ having existing ID as 589631e93ea890e3dc0ed144e95427a98ba4f6d7a78de4ca23bd7621bf114cd2
DEBU: 2018/09/28 10:01:23.028558 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.8/12 for ID _ having existing ID as 589631e93ea890e3dc0ed144e95427a98ba4f6d7a78de4ca23bd7621bf114cd2
DEBU: 2018/09/28 10:01:23.028584 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.8/12 for ID _ having existing ID as 589631e93ea890e3dc0ed144e95427a98ba4f6d7a78de4ca23bd7621bf114cd2
DEBU: 2018/09/28 10:01:23.028603 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.8/12 for ID _ having existing ID as 589631e93ea890e3dc0ed144e95427a98ba4f6d7a78de4ca23bd7621bf114cd2
DEBU: 2018/09/28 10:01:23.028627 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.8/12 for ID _ having existing ID as 589631e93ea890e3dc0ed144e95427a98ba4f6d7a78de4ca23bd7621bf114cd2
DEBU: 2018/09/28 10:01:23.028658 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.8/12 for ID _ having existing ID as 589631e93ea890e3dc0ed144e95427a98ba4f6d7a78de4ca23bd7621bf114cd2
DEBU: 2018/09/28 10:01:23.028678 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.8/12 for ID _ having existing ID as 589631e93ea890e3dc0ed144e95427a98ba4f6d7a78de4ca23bd7621bf114cd2
DEBU: 2018/09/28 10:01:23.028701 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.8/12 for ID _ having existing ID as 589631e93ea890e3dc0ed144e95427a98ba4f6d7a78de4ca23bd7621bf114cd2
DEBU: 2018/09/28 10:01:23.028722 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.8/12 for ID _ having existing ID as 589631e93ea890e3dc0ed144e95427a98ba4f6d7a78de4ca23bd7621bf114cd2
DEBU: 2018/09/28 10:01:23.028741 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.8/12 for ID _ having existing ID as 589631e93ea890e3dc0ed144e95427a98ba4f6d7a78de4ca23bd7621bf114cd2
DEBU: 2018/09/28 10:01:23.028760 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.10/12 for ID _ having existing ID as b92578e6ab2d5aeecc8f37a8ca386448ce5e237abecef56bc5144b77b179c44d
DEBU: 2018/09/28 10:01:23.028779 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.10/12 for ID _ having existing ID as b92578e6ab2d5aeecc8f37a8ca386448ce5e237abecef56bc5144b77b179c44d
DEBU: 2018/09/28 10:01:23.028798 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.10/12 for ID _ having existing ID as b92578e6ab2d5aeecc8f37a8ca386448ce5e237abecef56bc5144b77b179c44d
DEBU: 2018/09/28 10:01:23.028822 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.10/12 for ID _ having existing ID as b92578e6ab2d5aeecc8f37a8ca386448ce5e237abecef56bc5144b77b179c44d
DEBU: 2018/09/28 10:01:23.028841 [allocator b2:de:01:96:13:b4]: Re-Claimed 10.40.0.10/12 for ID _ having existing ID as b92578e6ab2d5aeecc8f37a8ca386448ce5e237abecef56bc5144b77b179c44d
INFO: 2018/09/28 10:01:23.028929 Sniffing traffic on datapath (via ODP)
INFO: 2018/09/28 10:01:23.029369 ->[192.168.20.6:6783] attempting connection
INFO: 2018/09/28 10:01:23.029515 ->[192.168.20.7:6783] attempting connection
INFO: 2018/09/28 10:01:23.029686 ->[192.168.20.6:51265] connection accepted
INFO: 2018/09/28 10:01:23.030007 ->[192.168.20.7:6783] error during connection attempt: dial tcp4 :0->192.168.20.7:6783: connect: connection refused
INFO: 2018/09/28 10:01:23.030492 ->[192.168.20.6:6783|b2:de:01:96:13:b4(server-01)]: connection shutting down due to error: cannot connect to ourself
INFO: 2018/09/28 10:01:23.030666 ->[192.168.20.6:51265|b2:de:01:96:13:b4(server-01)]: connection shutting down due to error: cannot connect to ourself
INFO: 2018/09/28 10:01:23.033900 Listening for HTTP control messages on 127.0.0.1:6784
INFO: 2018/09/28 10:01:23.035258 Listening for metrics requests on 0.0.0.0:6782
DEBU: 2018/09/28 10:01:23.114862 fastdp: broadcast{1 false} {b2:de:01:96:13:b4 ff:ff:ff:ff:ff:ff}
DEBU: 2018/09/28 10:01:23.114946 Discovered local MAC b2:de:01:96:13:b4
DEBU: 2018/09/28 10:01:23.115104 Creating ODP flow FlowSpec{keys: [InPortFlowKey{vport: 1} EthernetFlowKey{src: b2:de:01:96:13:b4, dst: ff:ff:ff:ff:ff:ff}], actions: [OutputAction{vport: 0}]}
DEBU: 2018/09/28 10:01:23.214435 [http] GET /status
DEBU: 2018/09/28 10:01:24.406060 ODP miss: map[22:UnknownFlowKey{type: 22, key: 00000000, mask: exact} 24:UnknownFlowKey{type: 24, key: 00000000, mask: exact} 4:EthernetFlowKey{src: 2e:53:d1:b4:0c:dd, dst: ff:ff:ff:ff:ff:ff} 20:BlobFlowKey{type: 20, key: 00000000, mask: ffffffff} 13:BlobFlowKey{type: 13, key: 0a2000010a28000500012e53d1b40cdd0000000000000000, mask: ffffffffffffffffffffffffffffffffffffffffffffffff} 2:BlobFlowKey{type: 2, key: 00000000, mask: ffffffff} 23:UnknownFlowKey{type: 23, key: 0000, mask: exact} 19:BlobFlowKey{type: 19, key: 00000000, mask: ffffffff} 25:UnknownFlowKey{type: 25, key: 00000000000000000000000000000000, mask: exact} 6:BlobFlowKey{type: 6, key: 0806, mask: ffff} 3:InPortFlowKey{vport: 2} 15:BlobFlowKey{type: 15, key: 00000000, mask: ffffffff} 16:TunnelFlowKey{id: 0000000000f4f4d8, ipv4src: 192.168.20.7, ipv4dst: 192.168.20.6, ttl: 64, tpsrc: 48884, tpdst: 6784}] on port 2
INFO: 2018/09/28 10:01:25.416266 ->[192.168.20.7:6783] attempting connection
INFO: 2018/09/28 10:01:25.416933 ->[192.168.20.7:6783] error during connection attempt: dial tcp4 :0->192.168.20.7:6783: connect: connection refused
DEBU: 2018/09/28 10:01:25.465323 ODP miss: map[16:TunnelFlowKey{id: 0000000000f4f4d8, ipv4src: 192.168.20.7, ipv4dst: 192.168.20.6, ttl: 64, tpsrc: 48884, tpdst: 6784} 25:UnknownFlowKey{type: 25, key: 00000000000000000000000000000000, mask: exact} 6:BlobFlowKey{type: 6, key: 0806, mask: ffff} 22:UnknownFlowKey{type: 22, key: 00000000, mask: exact} 15:BlobFlowKey{type: 15, key: 00000000, mask: ffffffff} 4:EthernetFlowKey{src: 2e:53:d1:b4:0c:dd, dst: ff:ff:ff:ff:ff:ff} 20:BlobFlowKey{type: 20, key: 00000000, mask: ffffffff} 19:BlobFlowKey{type: 19, key: 00000000, mask: ffffffff} 23:UnknownFlowKey{type: 23, key: 0000, mask: exact} 24:UnknownFlowKey{type: 24, key: 00000000, mask: exact} 3:InPortFlowKey{vport: 2} 13:BlobFlowKey{type: 13, key: 0a2000010a28000500012e53d1b40cdd0000000000000000, mask: ffffffffffffffffffffffffffffffffffffffffffffffff} 2:BlobFlowKey{type: 2, key: 00000000, mask: ffffffff}] on port 2
DEBU: 2018/09/28 10:01:26.489271 ODP miss: map[24:UnknownFlowKey{type: 24, key: 00000000, mask: exact} 4:EthernetFlowKey{src: 2e:53:d1:b4:0c:dd, dst: ff:ff:ff:ff:ff:ff} 22:UnknownFlowKey{type: 22, key: 00000000, mask: exact} 25:UnknownFlowKey{type: 25, key: 00000000000000000000000000000000, mask: exact} 20:BlobFlowKey{type: 20, key: 00000000, mask: ffffffff} 19:BlobFlowKey{type: 19, key: 00000000, mask: ffffffff} 16:TunnelFlowKey{id: 0000000000f4f4d8, ipv4src: 192.168.20.7, ipv4dst: 192.168.20.6, ttl: 64, tpsrc: 48884, tpdst: 6784} 23:UnknownFlowKey{type: 23, key: 0000, mask: exact} 2:BlobFlowKey{type: 2, key: 00000000, mask: ffffffff} 3:InPortFlowKey{vport: 2} 15:BlobFlowKey{type: 15, key: 00000000, mask: ffffffff} 6:BlobFlowKey{type: 6, key: 0806, mask: ffff} 13:BlobFlowKey{type: 13, key: 0a2000010a28000500012e53d1b40cdd0000000000000000, mask: ffffffffffffffffffffffffffffffffffffffffffffffff}] on port 2
DEBU: 2018/09/28 10:01:29.406274 ODP miss: map[25:UnknownFlowKey{type: 25, key: 00000000000000000000000000000000, mask: exact} 13:BlobFlowKey{type: 13, key: 0a2000010a28000500012e53d1b40cdd0000000000000000, mask: ffffffffffffffffffffffffffffffffffffffffffffffff} 2:BlobFlowKey{type: 2, key: 00000000, mask: ffffffff} 22:UnknownFlowKey{type: 22, key: 00000000, mask: exact} 24:UnknownFlowKey{type: 24, key: 00000000, mask: exact} 23:UnknownFlowKey{type: 23, key: 0000, mask: exact} 6:BlobFlowKey{type: 6, key: 0806, mask: ffff} 20:BlobFlowKey{type: 20, key: 00000000, mask: ffffffff} 4:EthernetFlowKey{src: 2e:53:d1:b4:0c:dd, dst: ff:ff:ff:ff:ff:ff} 19:BlobFlowKey{type: 19, key: 00000000, mask: ffffffff} 16:TunnelFlowKey{id: 0000000000f4f4d8, ipv4src: 192.168.20.7, ipv4dst: 192.168.20.6, ttl: 64, tpsrc: 48884, tpdst: 6784} 3:InPortFlowKey{vport: 2} 15:BlobFlowKey{type: 15, key: 00000000, mask: ffffffff}] on port 2
INFO: 2018/09/28 10:01:29.559178 ->[192.168.20.7:6783] attempting connection
INFO: 2018/09/28 10:01:29.559825 ->[192.168.20.7:6783] error during connection attempt: dial tcp4 :0->192.168.20.7:6783: connect: connection refused

Network:

$ ip route # node with failed weave pod
default via <redacted>.123.1 dev vlan1 proto static 
10.32.0.0/12 dev weave proto kernel scope link src 10.32.0.1 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 
192.168.20.0/24 dev vlan20 proto kernel scope link src 192.168.20.7 
192.168.34.0/24 dev vlan1 proto kernel scope link src 192.168.34.254 
<redacted>.123.0/24 dev vlan1 proto kernel scope link src <redacted>.123.160

$ ip route # control plane with running weave pod
default via <redacted>.123.1 dev vlan1 proto static 
10.32.0.0/12 dev weave proto kernel scope link src 10.40.0.0 
169.254.95.0/24 dev enp0s20f0u1u6 proto kernel scope link src 169.254.95.120 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 
192.168.20.0/24 dev vlan20 proto kernel scope link src 192.168.20.6 
192.168.34.0/24 dev vlan1 proto kernel scope link src 192.168.34.253 
<redacted>.123.0/24 dev vlan1 proto kernel scope link src <redacted>.123.161 

$ ip -4 -o addr # node with failed weave pod
1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
7: vlan1    inet <redacted>.123.160/24 brd <redacted>.123.255 scope global vlan1\       valid_lft forever preferred_lft forever
7: vlan1    inet 192.168.34.254/24 brd 192.168.34.255 scope global vlan1\       valid_lft forever preferred_lft forever
8: vlan20    inet 192.168.20.7/24 brd 192.168.20.255 scope global vlan20\       valid_lft forever preferred_lft forever
9: docker0    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0\       valid_lft forever preferred_lft forever
12: weave    inet 10.32.0.1/12 brd 10.47.255.255 scope global weave\       valid_lft forever preferred_lft forever

$ ip -4 -o addr # control plane with running weave pod
1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
7: enp0s20f0u1u6    inet 169.254.95.120/24 brd 169.254.95.255 scope global dynamic enp0s20f0u1u6\       valid_lft 499sec preferred_lft 499sec
8: vlan20    inet 192.168.20.6/24 brd 192.168.20.255 scope global vlan20\       valid_lft forever preferred_lft forever
9: vlan1    inet <redacted>.123.161/24 brd <redacted>.123.255 scope global vlan1\       valid_lft forever preferred_lft forever
9: vlan1    inet 192.168.34.253/24 brd 192.168.34.255 scope global vlan1\       valid_lft forever preferred_lft forever
12: docker0    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0\       valid_lft forever preferred_lft forever
62: weave    inet 10.40.0.0/12 brd 10.47.255.255 scope global weave\       valid_lft forever preferred_lft forever

$ sudo iptables-save
# Would be too long to paste here, tell me if this is really needed.
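
If only part of it is useful, I can post a filtered dump of the nat table instead, e.g.:

$ sudo iptables-save -t nat | grep -E 'KUBE-SERVICES|10.96.0.1'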

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 24 (13 by maintainers)

Most upvoted comments

After investigating, I found what was causing the problem in v1.12.1. As stated in https://github.com/kubernetes/kubeadm/issues/102#issuecomment-370753273, the following iptables rule was missing in the new version:

iptables -t nat -I KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
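
To check whether the rule is present on a node, and add it if missing (same rule as above; note that kube-proxy may rewrite these chains on its next sync):

$ sudo iptables -t nat -C KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ \
  || sudo iptables -t nat -I KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ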

Not sure how / when / if that rule is necessary, but after adding it, things started working. I'd also love to know where Kubernetes networking uses that MARK to forward the packet to the correct interface; I haven't found any routing table that actually uses that mark.
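
My current (possibly wrong) understanding: the mark is never consulted by routing at all; kube-proxy matches it again inside iptables to decide which packets get SNATed. On a default kube-proxy setup the relevant chains look roughly like this (output approximate, mark value may differ):

$ sudo iptables -t nat -S KUBE-MARK-MASQ
-N KUBE-MARK-MASQ
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
$ sudo iptables -t nat -S KUBE-POSTROUTING
-N KUBE-POSTROUTING
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE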