kubernetes: Flannel (NetworkPlugin cni) error: /run/flannel/subnet.env: no such file or directory

/kind bug

@kubernetes/sig-contributor-experience-bugs

What happened: Installed a single-node Kubernetes cluster on CentOS 7 (a VM running on VirtualBox); my application pod (created via a Kubernetes Deployment) won't go into the Ready state

Pod Event: Warning FailedCreatePodSandBox . . . Kubelet . . . Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox . . . network for pod "companyemployees-deployment-766c7c7767-t7mc5": NetworkPlugin cni failed to set up pod "companyemployees-deployment-766c7c7767-t7mc5_default" network: open /run/flannel/subnet.env: no such file or directory

In addition, it looks like the Kubernetes coredns Docker container keeps exiting, e.g. docker ps -a | grep -i coredns: 6341ce0be652 k8s.gcr.io/pause:3.1 "/pause" . . . Exited (0) 1 second ago k8s_POD_coredns-576cbf47c7-9bxxg_kube-system_e84afb7a-d7b7-11e8-bafa-08002745c4bc_581

What you expected to happen: Flannel not to report the error and the pod to go into the Ready state

How to reproduce it (as minimally and precisely as possible): Build the Docker image, push it to the private Docker registry, and create a simple deployment with kubectl create -f companyemployees-deployment.yaml (a sketch of the build-and-push commands follows the manifest below). Deployment YAML:

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: companyemployees-deployment
  labels:
    app: companyemployees
spec:
  replicas: 1
  selector:
    matchLabels:
      app: companyemployees
  template:
    metadata:
      labels:
        app: companyemployees
    spec:
      containers:
      - name: companyemployees
        image: localhost:5000/companyemployees:1.0
        ports:
        - containerPort: 9092
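The image referenced above was built and pushed to the local private registry along these lines (a sketch; it assumes a Dockerfile for the application in the current directory):

# Build the image and tag it for the private registry running on localhost:5000
docker build -t companyemployees:1.0 .
docker tag companyemployees:1.0 localhost:5000/companyemployees:1.0
docker push localhost:5000/companyemployees:1.0

# Create the deployment from the manifest above
kubectl create -f companyemployees-deployment.yaml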

Anything else we need to know?: Output of ip link:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
    link/ether 08:00:27:45:c4:bc brd ff:ff:ff:ff:ff:ff
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
    link/ether 08:00:27:21:0f:92 brd ff:ff:ff:ff:ff:ff
4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT
    link/ether 02:42:1b:04:1f:7c brd ff:ff:ff:ff:ff:ff
6: veth3f5bcb4@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT
    link/ether b2:1f:d4:fb:84:2e brd ff:ff:ff:ff:ff:ff link-netnsid 0
7: flannel.1: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT
    link/ether e6:44:ed:15:dd:97 brd ff:ff:ff:ff:ff:ff
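Note that flannel.1 is in state DOWN and /run/flannel/subnet.env is missing, which suggests the flannel daemonset pod is not coming up. A few commands that help confirm this (a sketch; substitute the actual pod and node names):

# Check whether the flannel pod on this node is running
kubectl -n kube-system get pods -o wide | grep -i flannel

# Inspect its logs
kubectl -n kube-system logs <flannel-pod-name>

# Check whether the node has a pod CIDR assigned (relevant to one of the comments below)
kubectl describe node <node-name> | grep -i podcidr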

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.1", GitCommit:"4ed3216f3ec431b140b1d899130a69fc671678f4", GitTreeState:"clean", BuildDate:"2018-10-05T16:46:06Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.1", GitCommit:"4ed3216f3ec431b140b1d899130a69fc671678f4", GitTreeState:"clean", BuildDate:"2018-10-05T16:36:14Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}

  • Cloud provider or hardware configuration: Single-node Kubernetes cluster on a CentOS 7 VM running on VirtualBox (VirtualBox is running on Windows 7 Pro)

  • OS (e.g. from /etc/os-release): cat /etc/os-release:
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

rpm -q centos-release: centos-release-7-4.1708.el7.centos.x86_64

  • Kernel (e.g. uname -a): Linux ibm-ms 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 20 20:32:50 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

  • Install tools:

My team's CentOS image already had Docker, Kubernetes, Flannel, and a private Docker registry on it; it was working, but I recently had issues with it that led me to uninstall Kubernetes, Docker, and Flannel and reinstall them.

Install steps:

Switch to root: su - root

install docker

  1. yum install -y yum-utils device-mapper-persistent-data lvm2
  2. yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
  3. yum install docker-ce
  4. systemctl daemon-reload
  5. systemctl enable docker
  6. systemctl start docker
  7. docker run hello-world

install private docker registry

  1. docker pull registry
  2. docker run -d -p 5000:5000 --restart=always --name registry registry
  3. Note: firewalld is not running (a quick sanity check of the registry is shown below)
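To sanity-check that the registry is reachable, something like the following can be used (the catalog will be empty until an image is pushed):

# Confirm the registry container is up and answering on port 5000
docker ps --filter name=registry
curl http://localhost:5000/v2/_catalog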

install k8s:

  1. setenforce 0
  2. sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux
  3. swapoff -a
  4. Edit /etc/fstab and comment-out /dev/mapper/centos-swap swap
  5. Add kubernetes repo for yum - edit /etc/yum.repos.d/kubernetes.repo and add
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
       https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
  6. yum install -y kubelet kubeadm kubectl
  7. systemctl enable kubelet
  8. systemctl start kubelet
  9. kubeadm init --pod-network-cidr=10.244.0.0/16 (this CIDR must match the flannel manifest; see the note after the flannel install steps below)
  10. Kubernetes config for the root user: export KUBECONFIG=/etc/kubernetes/admin.conf

install flannel:

  1. sysctl net.bridge.bridge-nf-call-iptables=1
  2. kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml
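The --pod-network-cidr passed to kubeadm init has to match the Network value in kube-flannel.yml; the manifest's net-conf.json looks roughly like this (excerpt from its ConfigMap, for reference):

  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }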

Remove master node taint (to allow scheduling pods on master): kubectl taint nodes --all node-role.kubernetes.io/master-
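After these steps, node readiness and the kube-system pods (including coredns and the flannel daemonset) can be checked with:

kubectl get nodes
kubectl -n kube-system get pods -o wide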

  • Others: Prior to installing, I uninstalled everything using the following steps:

Switch to root: su - root

Uninstall k8s (although this is the master node, I did this a few times and included draining the node the last time):

  1. kubectl drain mynodename --delete-local-data --force --ignore-daemonsets
  2. kubectl delete node mynodename
  3. kubeadm reset
  4. systemctl stop kubelet
  5. yum remove kubeadm kubectl kubelet kubernetes-cni kube*
  6. yum autoremove
  7. rm -rf ~/.kube
  8. rm -rf /var/lib/kubelet/*

Uninstall docker:

  1. docker rm $(docker ps -a -q)
  2. docker stop (as needed)
  3. docker rmi -f $(docker images -q)
  4. Check that all containers and images were deleted: docker ps -a; docker images
  5. systemctl stop docker
  6. yum remove yum-utils device-mapper-persistent-data lvm2
  7. yum remove docker docker-client docker-client-latest docker-common docker-latest docker-latest-logrotate docker-logrotate docker-selinux docker-engine-selinux docker-engine
  8. yum remove docker-ce
  9. rm -rf /var/lib/docker
  10. rm -rf /etc/docker

Uninstall flannel

  1. rm -rf /var/lib/cni/
  2. rm -rf /run/flannel
  3. rm -rf /etc/cni/
  4. Remove interfaces related to docker and flannel: list them with ip link, then for each docker or flannel interface run ifconfig <name of interface from ip link> down followed by ip link delete <name of interface from ip link> (a loop version is sketched below).
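That last step can be expressed as a small loop (a sketch assuming the usual interface names; adjust to whatever ip link actually shows):

# Bring down and delete the leftover container-networking interfaces
# (ip link set ... down is the iproute2 equivalent of ifconfig ... down)
for iface in docker0 flannel.1 cni0; do
    ip link set "$iface" down
    ip link delete "$iface"
done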

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 2
  • Comments: 24 (3 by maintainers)

Most upvoted comments

Just got the same problem - fixed it by manually adding the file:

/run/flannel/subnet.env

FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true

I know this is old, but I wanted to comment here as I too had this issue; in my case it was a symptom of a different problem. There was no subnet.env file, but it was not getting created because my flannel daemonset was failing. The error from the pod (kubectl --namespace=kube-system logs <POD_NAME>) showed "Error registering network: failed to acquire lease: node "<NODE_NAME>" pod cidr not assigned". The node was missing a spec for podCIDR, so I ran kubectl patch node <NODE_NAME> -p '{"spec":{"podCIDR":"10.244.0.0/16"}}' for each node and the issue went away.
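For reference, whether a node already has a podCIDR assigned can be checked with something like:

kubectl get node <NODE_NAME> -o jsonpath='{.spec.podCIDR}'

(empty output means no pod CIDR is assigned, which matches the lease error above)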

In my case, using CentOS on DO, the file /run/flannel/subnet.env existed, but I had the same issue: /run/flannel/subnet.env: no such file or directory.

At first I had tried a different subnet when running kubeadm init --pod-network-cidr=192.168.255.0/24.

I tried @discostur's solution of changing the file manually, but subnet.env was restored to its original state when I restarted the master.

This was only solved by running kubeadm reset and using flannel's default network CIDR: kubeadm init --pod-network-cidr=10.244.0.0/16
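A sketch of that reset-and-reinitialise sequence on a single-node cluster like the one in this issue:

# Tear down the existing control plane, then re-init with flannel's default CIDR
kubeadm reset
kubeadm init --pod-network-cidr=10.244.0.0/16
export KUBECONFIG=/etc/kubernetes/admin.conf

# Re-apply the flannel manifest (same URL as in the install steps above)
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml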

Creating /run/flannel/subnet.env fixes coredns not starting, but it's only temporary. My solution for the master/control-plane:

  1. kubeadm init --control-plane-endpoint=whatever --node-name whatever --pod-network-cidr=10.244.0.0/16
  2. kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
  3. restart all
systemctl stop kubelet
systemctl stop docker
iptables --flush
iptables -t nat --flush
systemctl start kubelet
systemctl start docker

The subnet.env file is written out by the flannel daemonset pods and probably shouldn’t be modified by hand.

If that file isn’t getting written, it suggests another problem preventing the flannel pod from starting up. Are there other logs in the flannel pod? You can check with something like kubectl logs -n kube-system <flannel-pod-name>

Happy to continue discussing, but I'm going to close this since it appears to be a flannel issue rather than a Kubernetes one. It might also be worth raising as a support issue against the flannel repo: https://github.com/coreos/flannel

/remove-triage unresolved /remove-kind bug /close

Thanks, I just needed a quick solution for a test system running some old k8s. I scripted the workaround which recreates the missing /run/flannel/subnet.env:

#! /bin/bash

set -x


# See https://github.com/kubernetes/kubernetes/issues/70202
# Run as root (e.g. with sudo)

mkdir -p /run/flannel
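
# The values written below assume the default flannel pod network (10.244.0.0/16) used in this issue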

cat << EOF > /run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
EOF

I also encountered exactly the same problem while creating the rook-ceph-operator pod; setting SELinux to permissive (setenforce 0) on the worker nodes resolved the issue.

Just got the same problem - fixed it by manually adding the file: /run/flannel/subnet.env

FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true

This solution worked for me, but I have one doubt: what do these values mean, and how does flannel use them?

This will get it started, but it won’t survive a reboot…still struggling with this myself
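Since /run is a tmpfs, a hand-written subnet.env disappears at reboot. One stopgap (an assumption, not a proper fix; the real fix is getting the flannel pod healthy so it writes the file itself) is to re-run the workaround script above at boot, e.g. from root's crontab:

# The script path here is hypothetical; point it at wherever the workaround script is saved
( crontab -l 2>/dev/null; echo "@reboot /usr/local/bin/flannel-subnet-env.sh" ) | crontab -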

Just got the same problem - fixed it by manually adding the file:

/run/flannel/subnet.env

FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true

Thanks, this worked for us.