kubeadm: coredns CrashLoopBackOff due to dnsmasq

What keywords did you search in kubeadm issues before filing this one?

Ubuntu 16.04 coredns crashloopbackoff

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version):

root@k8s-master:~# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.3", GitCommit:"435f92c719f279a3a67808c80521ea17d5715c66", GitTreeState:"clean", BuildDate:"2018-11-26T12:54:02Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version (use kubectl version):
root@k8s-master:~# kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.3", GitCommit:"435f92c719f279a3a67808c80521ea17d5715c66", GitTreeState:"clean", BuildDate:"2018-11-26T12:57:14Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.3", GitCommit:"435f92c719f279a3a67808c80521ea17d5715c66", GitTreeState:"clean", BuildDate:"2018-11-26T12:46:57Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: Local Virtual Machine, 2 CPU, 4 GB RAM

  • OS (e.g. from /etc/os-release):

root@k8s-master:~# lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.5 LTS
Release:	16.04
Codename:	xenial

  • Kernel (e.g. uname -a):
root@k8s-master:~# uname -a
Linux k8s-master 4.15.0-29-generic #31~16.04.1-Ubuntu SMP Wed Jul 18 08:54:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Others:
root@k8s-master:~# docker --version
Docker version 18.06.0-ce, build 0ffa825

root@k8s-master:~# sestatus
The program 'sestatus' is currently not installed. You can install it by typing:
apt install policycoreutils
root@k8s-master:~# kubectl -n kube-system get deployment coredns -o yaml | \
>   sed 's/allowPrivilegeEscalation: false/allowPrivilegeEscalation: true/g' | \
>   kubectl apply -f -
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
deployment.extensions/coredns configured
root@k8s-master:~# grep nameserver /etc/resolv.conf 
nameserver 127.0.1.1
root@k8s-master:~# cat /run/systemd/resolve/resolv.conf
cat: /run/systemd/resolve/resolv.conf: No such file or directory
root@k8s-master:~# cat /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS=--cgroup-driver=systemd --network-plugin=cni
root@k8s-master:~# systemctl list-unit-files | grep enabled | grep systemd-resolved
root@k8s-master:~# ps auxww | grep kubelet
root       501  3.3  2.6 496440 106152 ?       Ssl  07:09   0:41 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=systemd --network-plugin=cni
root@k8s-master:~# ufw disable
Firewall stopped and disabled on system startup

What happened?

root@k8s-master:~# kubectl get pods --all-namespaces
NAMESPACE     NAME                                 READY   STATUS             RESTARTS   AGE
kube-system   coredns-576cbf47c7-822v6             0/1     CrashLoopBackOff   11         24m
kube-system   coredns-576cbf47c7-n9tw9             0/1     CrashLoopBackOff   11         24m
kube-system   etcd-k8s-master                      1/1     Running            1          23m
kube-system   kube-apiserver-k8s-master            1/1     Running            1          23m
kube-system   kube-controller-manager-k8s-master   1/1     Running            1          23m
kube-system   kube-flannel-ds-amd64-qbff2          1/1     Running            1          20m
kube-system   kube-proxy-4bbbk                     1/1     Running            1          24m
kube-system   kube-scheduler-k8s-master            1/1     Running            1          23m
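
For context, the crash reason shows up in the pod logs: the kubelet hands pods the node's /etc/resolv.conf, which here points at the local dnsmasq stub 127.0.1.1; inside the CoreDNS pod that loopback address is the pod itself, so CoreDNS forwards queries back to itself and the loop plugin makes it exit. A quick way to confirm, using one of the crashing pod names from the output above:

kubectl -n kube-system logs coredns-576cbf47c7-822v6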

What you expected to happen?

I expected the coredns pods to start properly.

How to reproduce it (as minimally and precisely as possible)?

  1. Install the CRI (Docker): https://kubernetes.io/docs/setup/cri/#docker
  2. Install kubeadm: https://kubernetes.io/docs/setup/independent/install-kubeadm/#installing-kubeadm-kubelet-and-kubectl
  3. Initialize the cluster: kubeadm init --pod-network-cidr=10.244.0.0/16
  4. Install Flannel: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#pod-network (scroll down to the section “Installing a pod network add-on” and select the “Flannel” tab; a sketch of the apply command follows this list)
  5. Check that the pods are in the Running state: root@k8s-master:~# kubectl get pods --all-namespaces
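
For reference, step 4 typically boiled down to applying the Flannel manifest with something like the command below; this is the URL the docs pointed to at the time and it may have moved since:

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml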

Anything else we need to know?

The “hack” solution mentioned in https://stackoverflow.com/a/53414041/5731350 works, but I am not comfortable disabling something (the loop plugin) that is supposed to be working as intended.


Most upvoted comments

see: coredns/coredns#2087 (comment)

Yes, this works, as I noted in the “Anything else we need to know?” section, but it looks like a hacky solution.

root@k8s-master:~# kubectl -n kube-system edit configmap coredns
<comment out the line containing 'loop' here, and save the file>
configmap/coredns edited
root@k8s-master:~#
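
For anyone following along, the Corefile inside that configmap looks roughly like this on a default kubeadm v1.12 install (details vary by version); the edit simply turns the loop line into a comment:

.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       upstream
       fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    proxy . /etc/resolv.conf
    cache 30
    #loop
    reload
    loadbalance
}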

root@k8s-master:~# kubectl -n kube-system delete pod -l k8s-app=kube-dns
pod "coredns-576cbf47c7-2zjm4" deleted
pod "coredns-576cbf47c7-854dn" deleted
root@k8s-master:~# kubectl get pods --all-namespaces
NAMESPACE     NAME                                 READY   STATUS    RESTARTS   AGE
kube-system   coredns-576cbf47c7-7ls7n             1/1     Running   0          14s
kube-system   coredns-576cbf47c7-lvbnq             1/1     Running   0          14s
kube-system   etcd-k8s-master                      1/1     Running   0          20m
kube-system   kube-apiserver-k8s-master            1/1     Running   0          20m
kube-system   kube-controller-manager-k8s-master   1/1     Running   0          20m
kube-system   kube-flannel-ds-amd64-96724          1/1     Running   0          20m
kube-system   kube-proxy-4gq5w                     1/1     Running   0          21m
kube-system   kube-scheduler-k8s-master            1/1     Running   0          20m
root@k8s-master:~#

Hi @alejandrox1, @neolit123 - I don't know whether it's the combination of the new Kubernetes v1.13 (when I raised this ticket I used v1.12, because that was the version available at the time) or me following the instructions (properly this time), but on Ubuntu 16.04.5, installing the latest Kubernetes v1.13, following https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#tabs-pod-install-4 (i.e. Flannel), then setting net.bridge.bridge-nf-call-iptables to 1 with the command sysctl net.bridge.bridge-nf-call-iptables=1, then rebooting the machine/VM to allow the sysctl setting to ‘take hold’ (see the persistence note after this comment), and then installing Flannel…

No more CrashLoopBackOff errors for coredns!!!

😃 😃

Thank you so much
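
A side note on the sysctl step in the comment above: sysctl net.bridge.bridge-nf-call-iptables=1 alone does not survive a reboot. A minimal sketch of making it persistent via the standard sysctl.d mechanism (the k8s.conf file name is just a convention):

# the bridge sysctls only exist once the br_netfilter module is loaded
sudo modprobe br_netfilter
echo br_netfilter | sudo tee /etc/modules-load.d/br_netfilter.conf
# persist the setting and apply it now
echo "net.bridge.bridge-nf-call-iptables = 1" | sudo tee /etc/sysctl.d/k8s.conf
sudo sysctl --system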

OK, then it seems to be unrelated to this issue - i.e. off topic.

@aravind-murthy that’s interesting. The service is enabled in the network manager, but not running. Please comment out the ‘dns=dnsmasq’ line in the config and restart the network manager: sudo systemctl restart network-manager

Then restore the original name servers in /etc/resolv.conf (they’re probably commented out there). That should help the coredns pods when they’re restarted by the kubelet.

Disabling dnsmasq for the network manager and commenting out the dnsmasq nameservers did the trick for me!
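
For anyone looking for the concrete steps, a minimal sketch of what “disabling dnsmasq for the network manager” means, assuming the stock Ubuntu config path /etc/NetworkManager/NetworkManager.conf:

# comment out the dns=dnsmasq line so NetworkManager stops spawning dnsmasq
sudo sed -i 's/^dns=dnsmasq/#dns=dnsmasq/' /etc/NetworkManager/NetworkManager.conf
sudo systemctl restart network-manager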

I’ve proposed a PR that generalizes the loop troubleshooting docs in the coredns loop plugin readme, so it more clearly applies to any kind of local DNS caching server, not just systemd-resolved. coredns/coredns#2363

The “crashloopbackoff-or-error-state” solution does not work, BTW.

Sadly, there can be more than one reason for a pod’s CrashLoopBackOff state.
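
A couple of standard kubectl commands help narrow down which reason applies (the pod name here is taken from the output earlier in this thread and will differ on your cluster):

# the Events section at the bottom explains why the pod keeps restarting
kubectl -n kube-system describe pod coredns-576cbf47c7-822v6
# logs of the previous, crashed container instance
kubectl -n kube-system logs coredns-576cbf47c7-822v6 --previous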

Then restore the original name servers in /etc/resolv.conf (they’re probably commented out there)

I haven’t commented anything out in this file (/etc/resolv.conf). It currently has the following entries:

# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 10.211.55.1
nameserver 127.0.1.1
search localdomain


Once the container enters CrashLoopBackOff, you can also call docker ps to see the running containers and then docker logs [coredns-container-id] to see the logs from the container itself.
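
A minimal sketch of that, assuming Docker is the CRI as in this thread (the -a flag makes exited containers show up too; substitute the real container ID):

docker ps -a | grep coredns
docker logs <coredns-container-id>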