kubeadm: coredns CrashLoopBackOff due to dnsmasq

What keywords did you search in kubeadm issues before filing this one?

Ubuntu 16.04 coredns crashloopbackoff

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version):

root@k8s-master:~# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.3", GitCommit:"435f92c719f279a3a67808c80521ea17d5715c66", GitTreeState:"clean", BuildDate:"2018-11-26T12:54:02Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version (use kubectl version):
root@k8s-master:~# kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.3", GitCommit:"435f92c719f279a3a67808c80521ea17d5715c66", GitTreeState:"clean", BuildDate:"2018-11-26T12:57:14Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.3", GitCommit:"435f92c719f279a3a67808c80521ea17d5715c66", GitTreeState:"clean", BuildDate:"2018-11-26T12:46:57Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: Local Virtual Machine, 2 CPU, 4 GB RAM

  • OS (e.g. from /etc/os-release):

root@k8s-master:~# lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.5 LTS
Release:	16.04
Codename:	xenial

  • Kernel (e.g. uname -a):
root@k8s-master:~# uname -a
Linux k8s-master 4.15.0-29-generic #31~16.04.1-Ubuntu SMP Wed Jul 18 08:54:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Others:
root@k8s-master:~# docker --version
Docker version 18.06.0-ce, build 0ffa825

root@k8s-master:~# sestatus
The program 'sestatus' is currently not installed. You can install it by typing:
apt install policycoreutils
root@k8s-master:~# kubectl -n kube-system get deployment coredns -o yaml | \
>   sed 's/allowPrivilegeEscalation: false/allowPrivilegeEscalation: true/g' | \
>   kubectl apply -f -
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
deployment.extensions/coredns configured
root@k8s-master:~# grep nameserver /etc/resolv.conf 
nameserver 127.0.1.1
root@k8s-master:~# cat /run/systemd/resolve/resolv.conf
cat: /run/systemd/resolve/resolv.conf: No such file or directory
root@k8s-master:~# cat /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS=--cgroup-driver=systemd --network-plugin=cni
root@k8s-master:~# systemctl list-unit-files | grep enabled | grep systemd-resolved
root@k8s-master:~# ps auxww | grep kubelet
root       501  3.3  2.6 496440 106152 ?       Ssl  07:09   0:41 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=systemd --network-plugin=cni
root@k8s-master:~# ufw disable
Firewall stopped and disabled on system startup

What happened?

root@k8s-master:~# kubectl get pods --all-namespaces
NAMESPACE     NAME                                 READY   STATUS             RESTARTS   AGE
kube-system   coredns-576cbf47c7-822v6             0/1     CrashLoopBackOff   11         24m
kube-system   coredns-576cbf47c7-n9tw9             0/1     CrashLoopBackOff   11         24m
kube-system   etcd-k8s-master                      1/1     Running            1          23m
kube-system   kube-apiserver-k8s-master            1/1     Running            1          23m
kube-system   kube-controller-manager-k8s-master   1/1     Running            1          23m
kube-system   kube-flannel-ds-amd64-qbff2          1/1     Running            1          20m
kube-system   kube-proxy-4bbbk                     1/1     Running            1          24m
kube-system   kube-scheduler-k8s-master            1/1     Running            1          23m
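
For context, the crash reason shows up in the pod logs: the kubelet hands pods the node's /etc/resolv.conf, which here points at the local dnsmasq stub 127.0.1.1; inside the CoreDNS pod that loopback address is the pod itself, so CoreDNS forwards queries back to itself and the loop plugin makes it exit. A quick way to confirm, using one of the crashing pod names from the output above:

kubectl -n kube-system logs coredns-576cbf47c7-822v6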

What you expected to happen?

I expected the coredns pods to start properly.

How to reproduce it (as minimally and precisely as possible)?

  1. Install the CRI (Docker): https://kubernetes.io/docs/setup/cri/#docker
  2. Install kubeadm: https://kubernetes.io/docs/setup/independent/install-kubeadm/#installing-kubeadm-kubelet-and-kubectl
  3. Initialize the cluster: kubeadm init --pod-network-cidr=10.244.0.0/16
  4. Install Flannel: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#pod-network (scroll down to the section “Installing a pod network add-on” and select the “Flannel” tab; a sketch of the apply command follows this list)
  5. Check that the pods are in the Running state: root@k8s-master:~# kubectl get pods --all-namespaces
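
For reference, step 4 typically boiled down to applying the Flannel manifest with something like the command below; this is the URL the docs pointed to at the time and it may have moved since:

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml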

Anything else we need to know?

The “hack” solution mentioned in https://stackoverflow.com/a/53414041/5731350 works, but I am not comfortable disabling something (the loop plugin) that is supposed to be working as intended.


Most upvoted comments

see: coredns/coredns#2087 (comment)

Yes, this works, as I noted in the “Anything else we need to know?” section, but it looks like a hacky solution.

root@k8s-master:~# kubectl -n kube-system edit configmap coredns
<comment out the line containing 'loop' here, and save the file>
configmap/coredns edited
root@k8s-master:~#
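
For anyone following along, the Corefile inside that configmap looks roughly like this on a default kubeadm v1.12 install (details vary by version); the edit simply turns the loop line into a comment:

.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       upstream
       fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    proxy . /etc/resolv.conf
    cache 30
    #loop
    reload
    loadbalance
}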

root@k8s-master:~# kubectl -n kube-system delete pod -l k8s-app=kube-dns
pod "coredns-576cbf47c7-2zjm4" deleted
pod "coredns-576cbf47c7-854dn" deleted
root@k8s-master:~# kubectl get pods --all-namespaces
NAMESPACE     NAME                                 READY   STATUS    RESTARTS   AGE
kube-system   coredns-576cbf47c7-7ls7n             1/1     Running   0          14s
kube-system   coredns-576cbf47c7-lvbnq             1/1     Running   0          14s
kube-system   etcd-k8s-master                      1/1     Running   0          20m
kube-system   kube-apiserver-k8s-master            1/1     Running   0          20m
kube-system   kube-controller-manager-k8s-master   1/1     Running   0          20m
kube-system   kube-flannel-ds-amd64-96724          1/1     Running   0          20m
kube-system   kube-proxy-4gq5w                     1/1     Running   0          21m
kube-system   kube-scheduler-k8s-master            1/1     Running   0          20m
root@k8s-master:~#

Hi @alejandrox1, @neolit123 - I don't know whether it's the combination of the new Kubernetes v1.13 (when I raised this ticket I used v1.12, because that was the version available at the time) or me following the instructions (properly this time), but on Ubuntu 16.04.5, installing the latest Kubernetes v1.13, following https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#tabs-pod-install-4 (i.e. Flannel), then setting net.bridge.bridge-nf-call-iptables to 1 with the command sysctl net.bridge.bridge-nf-call-iptables=1, then rebooting the machine/VM to allow the sysctl setting to ‘take hold’ (see the persistence note after this comment), and then installing Flannel…

No more CrashLoopBackOff errors for coredns!!!

😃 😃

Thank you so much
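
A side note on the sysctl step in the comment above: sysctl net.bridge.bridge-nf-call-iptables=1 alone does not survive a reboot. A minimal sketch of making it persistent via the standard sysctl.d mechanism (the k8s.conf file name is just a convention):

# the bridge sysctls only exist once the br_netfilter module is loaded
sudo modprobe br_netfilter
echo br_netfilter | sudo tee /etc/modules-load.d/br_netfilter.conf
# persist the setting and apply it now
echo "net.bridge.bridge-nf-call-iptables = 1" | sudo tee /etc/sysctl.d/k8s.conf
sudo sysctl --system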

OK, then it seems to be unrelated to this issue - i.e. off topic.

@aravind-murthy that’s interesting. The service is enabled in the network manager, but not running. Please comment out the ‘dns=dnsmasq’ line in the config and restart the network manager: sudo systemctl restart network-manager

Then restore the original name servers in /etc/resolv.conf (they’re probably commented out there). That should help the coredns pods when they’re restarted by the kubelet.

Disabling dnsmasq for the network manager and commenting out the dnsmasq nameservers did the trick for me!
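
For anyone looking for the concrete steps, a minimal sketch of what “disabling dnsmasq for the network manager” means, assuming the stock Ubuntu config path /etc/NetworkManager/NetworkManager.conf:

# comment out the dns=dnsmasq line so NetworkManager stops spawning dnsmasq
sudo sed -i 's/^dns=dnsmasq/#dns=dnsmasq/' /etc/NetworkManager/NetworkManager.conf
sudo systemctl restart network-manager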

I’ve proposed a PR that generalizes the loop troubleshooting docs in the coredns loop plugin readme, so it more clearly applies to any kind of local DNS caching server, not just systemd-resolved. coredns/coredns#2363

The “crashloopbackoff-or-error-state” solution does not work, BTW.

Sadly, there can be more than one reason for a pod’s CrashLoopBackOff state.
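
A couple of standard kubectl commands help narrow down which reason applies (the pod name here is taken from the output earlier in this thread and will differ on your cluster):

# the Events section at the bottom explains why the pod keeps restarting
kubectl -n kube-system describe pod coredns-576cbf47c7-822v6
# logs of the previous, crashed container instance
kubectl -n kube-system logs coredns-576cbf47c7-822v6 --previous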

Then restore the original name servers in /etc/resolv.conf (they’re probably commented out there)

I haven’t commented anything out in this file (/etc/resolv.conf). It currently has the following entries:

# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 10.211.55.1
nameserver 127.0.1.1
search localdomain


Once the container enters CrashLoopBackOff, you can also call docker ps to see the running containers and then docker logs [coredns-container-id] to see the logs from the container itself.
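
A minimal sketch of that, assuming Docker is the CRI as in this thread (the -a flag makes exited containers show up too; substitute the real container ID):

docker ps -a | grep coredns
docker logs <coredns-container-id>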