cri-o: cri-o with Kubernetes v1.14 connects pods to the wrong subnets when using the Weave, kube-router, or Flannel CNI plugin
Description: With k8s v1.13 and v1.14, pods are reproducibly connected to the wrong subnets after k8s installation:
Steps to reproduce the issue:
*** provision Kubernetes via kubeadm
*** use cri-o as the runtime together with the Weave or kube-router CNI plugin
Describe the results you received:
*** On VirtualBox based VMs, according to "kubectl get po -w" the static pods are connected to the NATted VirtualBox default NIC eth0, although a dedicated "host-only" NIC eth1 is configured for each node VM and its IP address is specified to k8s via kubeadm (see below). In our experience this does not impair the cluster's functionality (indicating this is in fact a display/reporting error, i.e. the pods are only shown on the wrong subnet?).
*** Much worse: in both environments, non-static pods (e.g. CoreDNS, which kubeadm deploys by default) are connected to the IP network configured as "crio-bridge" in /etc/cni/net.d/100-crio-bridge instead of the pod subnet specified to kubeadm. This of course makes the cluster services (which are generated on the pod or service subnets) fail, because the endpoint IPs do not match the pod IPs.
Describe the results you expected: In the case of the VirtualBox based nodes, static pods managed by the kubelets should be connected to the dedicated host-only NIC. In both cases, pods managed by k8s should be connected to the pod or service subnets specified to kubeadm.
Additional information you deem important (e.g. issue happens only occasionally): *** Oddly, during startup of a cluster node some of the static pods (e.g. kube-proxy) are shown with the correct eth1 IP addresses for a short time – but once k8s has booted completely, all of the static pods have the wrong IPs.
*** We have found a workaround for both problems; directly after cluster installation:
- restart first cri-o and then kubelet via systemd on each cluster node (this fixes the wrong IPs of the static pods, and has to be repeated after each reboot of a node)
- then, delete the CNI plugin pods (i.e. Weave or kube-router) on each cluster node
- repeat the restart of the k8s related host services as described above
- finally, delete the non-static pods that are connected to the CNI bridge subnet (i.e. CoreDNS). The deleted pods are restarted by their controllers, the non-static pods are then on the correct pod subnet, and the cluster works fine (the pod deletion does not need to be repeated after cluster node reboots). A rough sketch of these steps is shown below.
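A rough, hedged sketch of these steps (the label selectors are assumptions based on the standard Weave and kubeadm CoreDNS manifests; kube-router uses different labels):

# 1) on every cluster node: restart cri-o first, then the kubelet
sudo systemctl restart crio
sudo systemctl restart kubelet
# 2) delete the CNI plugin pods so that their DaemonSet recreates them
kubectl -n kube-system delete pods -l name=weave-net
# 3) restart crio and the kubelet again on every node, as in step 1
# 4) delete the non-static pods stuck on the crio-bridge subnet, e.g. CoreDNS
kubectl -n kube-system delete pods -l k8s-app=kube-dns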
*** Our questions are:
**** Is cri-o expected to work in the scenario described above? We had a problem report opened against kubeadm (see https://github.com/kubernetes/kubeadm/issues/1363), but the comments there seem to indicate that kubeadm + cri-o should theoretically work but is a bit of a "fringe combination" from the perspective of the kubeadm community.
**** If cri-o should work, which version should be used with k8s v1.14? The CentOS yum repositories offer RPMs with configuration and systemd support, but these seem to be quite outdated (v1.11).
**** If we follow the release-1.14 branch of cri-o, is it enough to generate /etc/crio/crio.conf with the cri-o version that is installed (see below), or does the CNI plugin configuration under /etc/cni also have to be updated?
**** Which of the k8s CNI plugins work out of the box with cri-o? E.g. the weave utility script fails to reset the CNI configuration; it seems to be tightly tied to a running Docker daemon.
Output of crio --version:
We tried several versions of cri-o, and currently use:
crio version 1.14.2-dev
commit: "238229ea3b49a1add6e14e577e6e9e208ec0cf57"
Additional environment details (AWS, VirtualBox, physical, etc.): Our cluster nodes are VMs running CentOS 7 under VMware and VirtualBox hypervisors. The problems described above arise with both the Weave 2.5.2 and kube-router 0.3.1 CNI plugins.
kubeadm init configuration (for the VirtualBox setup, with the "host-only" adapter attached to the 192.168.99 subnet):
---
apiVersion: kubeadm.k8s.io/v1beta1
clusterName: k8m.qstage.dpt.suborg.int.com
controlPlaneEndpoint: 192.168.99.10
kind: ClusterConfiguration
kubernetesVersion: v1.14.2
networking:
  dnsDomain: qstage.dpt.suborg.int.com
  podSubnet: 172.18.0.0/16
  serviceSubnet: 172.19.0.0/16
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clusterCIDR: 172.18.0.0/16
kind: KubeProxyConfiguration
---
apiVersion: kubelet.config.k8s.io/v1beta1
clusterDomain: qstage.dpt.suborg.int.com
cgroupDriver: systemd
clusterDNS:
- 172.19.0.10
evictionHard:
  imagefs.available: 5%
  nodefs.available: 5%
evictionSoft:
  imagefs.available: 5%
  nodefs.available: 5%
evictionSoftGracePeriod:
  imagefs.available: 10h
  nodefs.available: 10h
evictionPressureTransitionPeriod: 10h
hairpinMode: hairpin-veth
kind: KubeletConfiguration
podCIDR: 172.18.0.0/16
runtimeCgroups: /systemd/system.slice
kubeletCgroups: /systemd/system.slice
---
apiVersion: kubeadm.k8s.io/v1beta1
nodeRegistration:
  criSocket: unix:///var/run/crio/crio.sock
  name: k8m
  kubeletExtraArgs:
    runtime-cgroups: /systemd/system.slice
  taints: []
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.99.10
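For completeness, a minimal sketch of how this configuration is consumed (the file name kubeadm-config.yaml is an illustrative assumption):

# on the designated control-plane node, with cri-o already running:
sudo kubeadm init --config kubeadm-config.yaml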
cri-o bridge configuration in /etc/cni/net.d/100-crio-bridge (installed by the CentOS cri-o RPM):
{
  "cniVersion": "0.3.0",
  "name": "crio-bridge",
  "type": "bridge",
  "bridge": "cni0",
  "isGateway": true,
  "ipMasq": true,
  "ipam": {
    "type": "host-local",
    "subnet": "10.88.0.0/16",
    "routes": [
      { "dst": "0.0.0.0/0" }
    ]
  }
}
cri-o configuration in /etc/crio/crio.conf is generated by:
crio --config /etc/crio/crio.conf \
     --cgroup-manager systemd --cni-plugin-dir /opt/cni/bin --conmon /usr/libexec/crio/conmon \
     --log-journald --selinux=false config
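To check what is actually in effect for CNI after (re)generating the file, a hedged sketch (the [crio.network] section of crio.conf holds network_dir and plugin_dirs):

grep -A6 '\[crio.network\]' /etc/crio/crio.conf   # network_dir / plugin_dirs in effect
ls /etc/cni/net.d/                                # CNI configs cri-o can load
sudo systemctl restart crio                       # crio.conf is only re-read on restart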
Weave setup is standard, i.e. copied from
curl --header "Accept: application/json" "https://cloud.weave.works/k8s/v1.10/net?k8s-version=$(kubectl version | base64 | tr -d '\n')" | jq .
except that the pod subnet is specified in the deployment YAML by setting the IPALLOC_RANGE environment variable (see the example below).
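For reference, a hedged sketch of two ways to hand the pod CIDR to Weave (the env.IPALLOC_RANGE URL parameter is documented by Weave; 172.18.0.0/16 is the podSubnet from our kubeadm config; the DaemonSet and container names assume Weave's standard manifest):

# variant A: bake the range into the generated manifest
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')&env.IPALLOC_RANGE=172.18.0.0/16"
# variant B: set the variable on an already deployed weave-net DaemonSet
kubectl -n kube-system set env daemonset/weave-net -c weave IPALLOC_RANGE=172.18.0.0/16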
Log messages:
For the master node, k8s logs the event:
CRI error: /sys is read-only: cannot modify conntrack limits, problems may arise later (If running Docker, see docker issue #24000)
cri-o logs
May 22 10:03:05 k8m crio[2709]: weave-cni: error removing interface "eth0": Link not found
May 22 10:03:05 k8m crio[2709]: weave-cni: unable to release IP address: Delete http://127.0.0.1:6784/ip/8bf721ff3fd6fdaa712339c8f2f53b377f0dfe47e7ea8387ccaac27f47b377c6: dial tcp 127.0.0.1:6784: connect: connection refused
May 22 10:03:05 k8m crio[2709]: time="2019-05-22 10:03:05.122774911+02:00" level=error msg="Error deleting network: Delete http://127.0.0.1:6784/ip/8bf721ff3fd6fdaa712339c8f2f53b377f0dfe47e7ea8387ccaac27f47b377c6: dial tcp 127.0.0.1:6784: connect: connection refused"
May 22 10:03:05 k8m crio[2709]: time="2019-05-22 10:03:05.122791969+02:00" level=error msg="Error while removing pod from CNI network "weave": Delete http://127.0.0.1:6784/ip/8bf721ff3fd6fdaa712339c8f2f53b377f0dfe47e7ea8387ccaac27f47b377c6: dial tcp 127.0.0.1:6784: connect: connection refused"
May 22 10:03:07 k8m crio[2709]: time="2019-05-22 10:03:07.020894336+02:00" level=error msg="Error adding network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/5f543bfb4e21e5d1b467b2475f9b2af937da500c31d04b6baad1c52ec5fb66c6: dial tcp 127.0.0.1:6784: connect: connection refused"
May 22 10:03:07 k8m crio[2709]: time="2019-05-22 10:03:07.020915201+02:00" level=error msg="Error while adding pod to CNI network "weave": unable to allocate IP address: Post http://127.0.0.1:6784/ip/5f543bfb4e21e5d1b467b2475f9b2af937da500c31d04b6baad1c52ec5fb66c6: dial tcp 127.0.0.1:6784: connect: connection refused"
These messages show up several times in the journal of the master node.
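A hedged way to check, on an affected node, whether the Weave API targeted by these requests is reachable (the label selector assumes Weave's standard DaemonSet):

curl -s http://127.0.0.1:6784/status             # answers once the local weave pod is up
kubectl -n kube-system get pods -l name=weave-net -o wide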
In the nodes' systemd journal, the kubelet logs many messages like:
node9.qstage.dpt.suborg.int.com kubelet[61333]: E0528 14:29:44.672017 61333 manager.go:1181] Failed to create existing container:
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod89439d2e_7d5e_11e9_bb2f_0050569516ef.slice/crio-6e60080685c8718d644d16a5fd36e410c229d1d51852b6c2ea3647bcbd22f999.scope:
invalid character 'c' looking for beginning of value
About this issue
- State: closed
- Created 5 years ago
- Reactions: 3
- Comments: 32 (11 by maintainers)
I solved this by changing the plugin_dirs field in /etc/crio/crio.conf:
before change
after change
then reboot.
Now the node is connected to the correct pod network, which is 10.32.0.0/12 by default in weave-net.
Notes:
I tried restarting the crio and kubelet services to avoid a reboot, but that did not seem to work, so I just rebooted to get everything clean.
I suggest deleting 87-podman-bridge.conflist and anything else other than 10-weave.conflist.
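What that change amounts to, as a hedged sketch (the directory shown is only illustrative; it has to match wherever the CNI plugin binaries were actually installed):

# plugin_dirs lives in the [crio.network] section of /etc/crio/crio.conf, e.g.:
#   plugin_dirs = [
#       "/opt/cni/bin/",
#   ]
# make sure it points at the directory that really contains the CNI plugin
# binaries used by your network add-on, then reboot the node:
ls /opt/cni/bin/
sudo reboot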
So I was having a similar issue with the Calico CNI. I think I resolved it by simply removing /etc/cni/net.d/100-crio-bridge.conf and /etc/cni/net.d/200-loopback.conf right after installing the cri-o RPM (sketched below).
Then, when the Calico CNI gets installed as a DaemonSet, it writes the correct CNI config to /etc/cni/net.d and things seem to work. Can someone comment on why these files are there in the first place?
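A hedged sketch of that workaround (the file names are the ones the cri-o RPM installed here; the crio restart is an addition of ours):

sudo rm /etc/cni/net.d/100-crio-bridge.conf /etc/cni/net.d/200-loopback.conf
sudo systemctl restart crio
# then deploy the network add-on (Calico, Weave, kube-router, ...); its DaemonSet
# writes its own config into /etc/cni/net.d, which cri-o uses from then on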
OK, so we started a cluster without a cri-o CNI bridge configured – this resulted in kubelet error messages like:
kubelet[2544]: E0724 15:06:29.459685 2544 pod_workers.go:190] Error syncing pod e9647c08-9ac4-4fdd-8056-4852b57f7658 (“coredns-5c98db65d4-bpw6m_kube-system(e9647c08-9ac4-4fdd-8056-4852b57f7658)”), skipping: failed to “CreatePodSandbox” for “coredns-5c98db65d4-bpw6m_kube-system(e9647c08-9ac4-4fdd-8056-4852b57f7658)” with CreatePodSandboxError: "CreatePodSandbox for pod "coredns-5c98db65d4-bpw6m_kube-system(e9647c08-9ac4-4fdd-8056-4852b57f7658)" failed: rpc error: code = Unknown desc = failed to get network status for pod sandbox k8s_coredns-5c98db65d4-bpw6m_kube-system_e9647c08-9ac4-4fdd-8056-4852b57f7658_0(d3df2229acee6e98c1b9e11a4ebcdca3ba8b530ee6cf6b8b745503a00440f10f): Unexpected command output Device "eth0" does not exist.\n with error: exit status 1
and all k8s non-static pods (CoreDNS, Helm tiller) hanging in status "ContainerCreating" (with "kubectl describe pod …" reporting the same error as the kubelet); the checks are sketched below.
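A hedged sketch of how the stuck pods can be inspected (the pod name is taken from the log line above and will differ per cluster):

kubectl -n kube-system get pods -o wide                        # non-static pods stuck in ContainerCreating
kubectl -n kube-system describe pod coredns-5c98db65d4-bpw6m   # reports the same CreatePodSandbox error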
Thanks! But no, sorry – we had already set IPALLOC_RANGE in the Weave k8s deployment YAML to the pod CIDR specified to kubeadm (please see the "Additional environment details" section in the problem description above).
Regarding the other parameters in the command line you use: the k8s deployment starts the Weave containers with
so privilege elevation seems to be covered.
And I guess if we had one of the other parameters wrong, k8s would not come up at all.
As the problem also “survives” a complete teardown of the VirtualBox VMs, I don’t think that some configuration remnants under /var/lib/weave are the root cause.