calico: stat /var/lib/calico/nodename: no such file or directory problem, please help.
Hi, I have a problem in my Kubernetes cluster. On node wx3 I want to create a static pod named jenkins, but kubelet emits error logs over and over:
E0322 15:59:06.016063 1239 kuberuntime_gc.go:152] Failed to stop sandbox "420698bd9963f65496a5fd0c127f2b23497d678ddcf58362aa35615d8739d372" before removing: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "jenkins-wx3_default" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
W0322 15:59:14.922384 1239 helpers.go:847] eviction manager: no observation found for eviction signal allocatableNodeFs.available
I0322 15:59:17.649057 1239 kuberuntime_manager.go:389] No ready sandbox for pod "jenkins-wx3_default(1d947eff714cafbfcc78ef0291db3291)" can be found. Need to start a new one
W0322 15:59:17.651466 1239 cni.go:265] CNI failed to retrieve network namespace path: Cannot find network namespace for the terminated container "aaf3954dc74a610b5da9cfbbcf67d413b64ee49f00d5df0835fb7f340449181b"
E0322 15:59:17.756783 1239 cni.go:319] Error deleting network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
E0322 15:59:17.757482 1239 remote_runtime.go:115] StopPodSandbox "aaf3954dc74a610b5da9cfbbcf67d413b64ee49f00d5df0835fb7f340449181b" from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "jenkins-wx3_default" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
E0322 15:59:17.757520 1239 kuberuntime_manager.go:781] Failed to stop sandbox {"docker" "aaf3954dc74a610b5da9cfbbcf67d413b64ee49f00d5df0835fb7f340449181b"}
E0322 15:59:17.757568 1239 kuberuntime_manager.go:581] killPodWithSyncResult failed: failed to "KillPodSandbox" for "1d947eff714cafbfcc78ef0291db3291" with KillPodSandboxError: "rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod \"jenkins-wx3_default\" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/"
E0322 15:59:17.757597 1239 pod_workers.go:182] Error syncing pod 1d947eff714cafbfcc78ef0291db3291 ("jenkins-wx3_default(1d947eff714cafbfcc78ef0291db3291)"), skipping: failed to "KillPodSandbox" for "1d947eff714cafbfcc78ef0291db3291" with KillPodSandboxError: "rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod \"jenkins-wx3_default\" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/"
When I put the same jenkins.yml on wx1, everything is OK. How can I fix it?
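The error message itself points at the first thing to verify. A minimal diagnostic sketch on the affected node, assuming the default Calico paths:

```sh
# On wx3: does the file the CNI plugin is trying to stat exist at all?
ls -l /var/lib/calico/nodename
# If it exists, it normally contains this node's Calico node name (e.g. wx3).
cat /var/lib/calico/nodename
```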
Your Environment
~ # calicoctl version
Client Version: v2.0.1
Build date: 2018-02-23T23:37:37+0000
Git commit: 5fa93655
Cluster Version: v3.0.1-218-gb3b47737
Cluster Type: k8s,bgp
~ # calicoctl get node -o wide
NAME ASN IPV4 IPV6
wx (unknown) 192.168.21.55/24
wx1 (unknown) 192.168.21.56/24
wx3 (unknown) 192.168.21.11/24
~ # calicoctl get workloadEndpoint -o wide
NAME WORKLOAD NODE NETWORKS INTERFACE PROFILES NATS
wx-k8s-dnsmasq--dep--844fb9f48d--wr4qp-eth0 dnsmasq-dep-844fb9f48d-wr4qp wx 172.50.56.6/32 cali3aeaee8bcfc kns.default
wx-k8s-nfsd--555cf7c46b--9q9q9-eth0 nfsd-555cf7c46b-9q9q9 wx 172.50.56.61/32 calie9a5b3f1744 kns.default
wx-k8s-nginx--deployment--77c45bd648--xb2r5-eth0 nginx-deployment-77c45bd648-xb2r5 wx 172.50.56.60/32 cali44402d20873 kns.default
wx-k8s-spark--master-eth0 spark-master wx 172.50.56.63/32 cali54d44e2d0ac kns.default
wx-k8s-spark--slave1-eth0 spark-slave1 wx 172.50.56.2/32 cali9a2eec147dd kns.default
wx-k8s-spark--slave2-eth0 spark-slave2 wx 172.50.56.1/32 cali80f72bad764 kns.default
wx-k8s-spark--slave3-eth0 spark-slave3 wx 172.50.56.5/32 caliac3052224a9 kns.default
wx-k8s-tomcat7--dep--74bf5b7d88--smq2n-eth0 tomcat7-dep-74bf5b7d88-smq2n wx 172.50.56.62/32 cali6c038e3b06b kns.default
wx-k8s-zk3--wx-eth0 zk3-wx wx 172.50.56.7/32 cali8f4bab72ef5 kns.default
wx1-k8s-busybox-eth0 busybox wx1 172.50.255.150/32 cali12d4a061371 kns.default
wx1-k8s-dnsmasq--dep--77bb7f589f--vzbb5-eth0 dnsmasq-dep-77bb7f589f-vzbb5 wx1 172.50.255.169/32 cali1c838e89bdd kns.default
wx1-k8s-hadoop--client-eth0 hadoop-client wx1 172.50.255.152/32 calid54dec8afc4 kns.default
wx1-k8s-hadoop--httpfs--8f757b8cc--qh8zm-eth0 hadoop-httpfs-8f757b8cc-qh8zm wx1 172.50.255.167/32 cali6994c0f1574 kns.default
wx1-k8s-hadoop--httpfs--8f757b8cc--rdt6c-eth0 hadoop-httpfs-8f757b8cc-rdt6c wx1 172.50.255.146/32 cali95554e22362 kns.default
wx1-k8s-nginx--deployment--77c45bd648--n598x-eth0 nginx-deployment-77c45bd648-n598x wx1 172.50.255.153/32 cali16e6132bd14 kns.default
wx1-k8s-nginx--deployment--77c45bd648--zv786-eth0 nginx-deployment-77c45bd648-zv786 wx1 172.50.255.159/32 calid24d442f2ea kns.default
wx1-k8s-tomcat7--dep--74bf5b7d88--4hpfr-eth0 tomcat7-dep-74bf5b7d88-4hpfr wx1 172.50.255.163/32 calib89ca8a389d kns.default
wx1-k8s-tomcat7--dep--74bf5b7d88--8sbjb-eth0 tomcat7-dep-74bf5b7d88-8sbjb wx1 172.50.255.149/32 cali98af15efd2b kns.default
wx1-k8s-tomcat7--dep--74bf5b7d88--9htnx-eth0 tomcat7-dep-74bf5b7d88-9htnx wx1 172.50.255.151/32 cali893197594b5 kns.default
wx1-k8s-tomcat7--dep--74bf5b7d88--qcn9f-eth0 tomcat7-dep-74bf5b7d88-qcn9f wx1 172.50.255.162/32 cali93dfdd66d35 kns.default
wx1-k8s-zk2--wx1-eth0 zk2-wx1 wx1 172.50.255.157/32 cali36493d30616 kns.default
ubuntu@ubuntu1:~$ sudo kubectl describe po jenkins-wx3
Name: jenkins-wx3
Namespace: default
Node: wx3/192.168.21.11
Start Time: Thu, 22 Mar 2018 15:42:03 +0800
Labels: app=jenkins
Annotations: kubernetes.io/config.hash=1d947eff714cafbfcc78ef0291db3291
kubernetes.io/config.mirror=1d947eff714cafbfcc78ef0291db3291
kubernetes.io/config.seen=2018-03-22T15:42:03.107778114+08:00
kubernetes.io/config.source=file
Status: Pending
IP:
Containers:
jenkins:
Container ID:
Image: jenkins:alpine
Image ID:
Ports: 8080/TCP, 50000/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment: <none>
Mounts: <none>
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes: <none>
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: :NoExecute
Events: <none>
About this issue
- State: closed
- Created 6 years ago
- Reactions: 1
- Comments: 16 (9 by maintainers)
Commits related to this issue
- mount calico dirs in kubelet allow sharing of dirs with cni: https://github.com/projectcalico/calico/issues/1795#issuecomment-390171235 — committed to utilitywarehouse/tf_kube_ignition by george-angel 6 years ago
- mount calico dirs in kubelet allow sharing of dirs with cni: https://github.com/projectcalico/calico/issues/1795#issuecomment-390171235 Requirement for switching to calico only networking. Should no... — committed to utilitywarehouse/tf_kube_ignition by george-angel 6 years ago
- Mount calico dirs in canal and kubelet Fixes pod init issue similar to https://github.com/projectcalico/calico/issues/1795#issuecomment-390171235: ``` Failed create pod sandbox: rpc error: code = Unk... — committed to cknowles/kube-aws by c-knowles 6 years ago
- Self hosted calico v3.1.3 (#1397) - Bump calico images for self hosted Calico - Include latest CRDs and RBAC for canal from https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installatio... — committed to kubernetes-retired/kube-aws by c-knowles 6 years ago
@r7vme
Facing the same issue.
My calico.yml file is https://docs.projectcalico.org/v3.5/getting-started/kubernetes/installation/hosted/calico.yaml
Error:
Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "b577ddbdd5fbd6cbe79e5b1bf20648e981590ecd0df545a0158ce909d9179096" network for pod "frontend-784f75ddb7-nbz7t": NetworkPlugin cni failed to set up pod "frontend-784f75ddb7-nbz7t_default" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
kubectl get pods --all-namespaces
Kubernetes version: v1.13
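A quick way to verify what the error message suggests is to check that the calico-node pod is actually Running on the node reporting the failure. A minimal sketch, assuming the standard hosted manifest, which deploys calico-node into the kube-system namespace:

```sh
# Show the calico-node pods and the nodes they run on; a pod that is not
# Running (or is missing for a node) is the place to start debugging.
kubectl -n kube-system get pods -o wide | grep calico-node
```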
We run kubelet in a Docker container, so I need to provide access to the /var/lib/calico host path. It isn't hard from the config-change perspective, but it is from the perspective of releasing two dependent changes: I need to make sure all our customers have updated to the release with the mount before I can apply the new Calico. All doable, but `nodename_file_optional` makes it possible to release the new Calico in a single step. We already discussed the changes, and it's a completely safe procedure, because the nodename will be fetched by calling `hostname` only while the master is already upgraded (the new Calico manifest applied) but a worker is not yet. When a worker is rolled out with the kubelet change (the /var/lib/calico mount), CNI will immediately start using the /var/lib/calico/nodename file. In total it's about 1 hour, in our experience. Bam! 😃

@ggaurav10 do you see the /var/lib/calico/nodename file on the host filesystem? Also, are you running a containerized kubelet by chance? If so, you'll also need to mount that directory into the kubelet container so that the CNI plugin can see it.
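For a containerized kubelet, that means adding a bind mount of /var/lib/calico into the kubelet container. A minimal sketch of the idea for a Docker-wrapped kubelet; the image name, the other mounts, and the kubelet flags are stand-ins for whatever your setup already uses, and only the /var/lib/calico line is the point:

```sh
# Hypothetical Docker-wrapped kubelet invocation; everything except the
# /var/lib/calico mount is a placeholder for your existing configuration.
docker run -d --name kubelet \
  --privileged --net=host --pid=host \
  -v /etc/cni/net.d:/etc/cni/net.d \
  -v /opt/cni/bin:/opt/cni/bin \
  -v /var/lib/kubelet:/var/lib/kubelet:shared \
  -v /var/lib/calico:/var/lib/calico \
  your-kubelet-image kubelet --network-plugin=cni
```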
Just to clarify this - `master` is the latest build of the code from the master branch, and isn't guaranteed to be stable. `latest` points to the latest stable release. I'd still recommend pinning to a specific release to avoid pulling in unexpected changes.
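Concretely, pinning means setting an explicit version tag on the image in your manifest rather than a floating one. A sketch, with v3.1.3 used only as an example version:

```yaml
# In the calico-node DaemonSet: pin to a released tag, not :latest or :master.
containers:
  - name: calico-node
    image: quay.io/calico/node:v3.1.3
```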
For whoever will be struggling with the same error: it's not always quick to upgrade the kubelet config (add the /var/lib/calico mount) on all clusters. There is a compatibility mode for when the Calico nodename == hostname: add `nodename_file_optional` to the ConfigMap, so the final `cni_network_config` looks like the sketch below. In this case, for nodes without /var/lib/calico mounted into the kubelet, the CNI plugin will use the hostname; for nodes with the mount, it will use the /var/lib/calico/nodename file.
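A reconstructed sketch of that ConfigMap entry: only `nodename_file_optional` is the setting being described, and the surrounding fields are assumptions based on a typical hosted calico.yaml of that era, not a copy of the commenter's config:

```yaml
# Reconstructed sketch; only "nodename_file_optional" is the change described.
cni_network_config: |-
  {
    "name": "k8s-pod-network",
    "cniVersion": "0.3.0",
    "plugins": [
      {
        "type": "calico",
        "log_level": "info",
        "nodename_file_optional": true,
        "ipam": { "type": "calico-ipam" },
        "policy": { "type": "k8s" },
        "kubernetes": { "kubeconfig": "__KUBECONFIG_FILEPATH__" }
      },
      {
        "type": "portmap",
        "snat": true,
        "capabilities": { "portMappings": true }
      }
    ]
  }
```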
Thanks for the response. Yes, I can see the file on the host, and yes, the kubelet is running in a container. Mounting the directory into the kubelet container solved the issue. 😃
Thanks again.