kind: coredns CrashLoopBackOff on Ubuntu 20.04
What happened:
❯ k get pods
NAME                                                        READY   STATUS             RESTARTS   AGE
coredns-66bff467f8-j5cf4                                    0/1     CrashLoopBackOff   1          54s
coredns-66bff467f8-l6gtz                                    0/1     CrashLoopBackOff   1          54s
etcd-retrogames-k8s-dev-control-plane                       1/1     Running            0          66s
kindnet-wgxw8                                               1/1     Running            0          54s
kube-apiserver-retrogames-k8s-dev-control-plane             1/1     Running            0          66s
kube-controller-manager-retrogames-k8s-dev-control-plane    1/1     Running            0          66s
kube-proxy-nnkwz                                            1/1     Running            0          54s
kube-scheduler-retrogames-k8s-dev-control-plane             1/1     Running            0          66s
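A quick way to see why the CoreDNS pods are crash-looping (not part of the original report; the pod name is taken from the listing above) is to pull the logs of the previous, crashed container instance:
❯ kubectl -n kube-system logs --previous coredns-66bff467f8-j5cf4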
What you expected to happen:
CoreDNS should start and stay Running.
How to reproduce it (as minimally and precisely as possible):
❯ cat /etc/issue
Ubuntu 20.04.1 LTS \n \l
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: kindest/node:v1.19.1@sha256:98cf5288864662e37115e362b23e4369c8c4a408f99cbc06e58ac30ddc721600
  kubeadmConfigPatches:
  - |
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        node-labels: "ingress-ready=true"
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
  - containerPort: 443
    hostPort: 443
    protocol: TCP
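For reference, a cluster can be created from this config along these lines (the config filename is illustrative; the cluster name matches the node names in the pod listing above):
❯ kind create cluster --name retrogames-k8s-dev --config kind-config.yaml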
❯ docker exec -it retrogames-k8s-dev-control-plane cat /etc/resolv.conf
search homenet.telecomitalia.it
nameserver 127.0.0.1
options ndots:0
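A loopback nameserver inside the node container means CoreDNS ends up forwarding queries to itself; its loop plugin detects this and the process exits, which produces the CrashLoopBackOff above. On a stock Ubuntu 20.04 host, where systemd-resolved is in use by default, the real upstream resolvers (as opposed to the local stub) can be checked with:
❯ cat /run/systemd/resolve/resolv.conf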
Environment:
- kind version:
kind v0.9.0 go1.15.2 linux/amd64
- Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.4", GitCommit:"d360454c9bcd1634cf4cc52d1867af5491dc9c5f", GitTreeState:"clean", BuildDate:"2020-11-11T13:17:17Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.1", GitCommit:"206bcadf021e76c27513500ca24182692aabd17e", GitTreeState:"clean", BuildDate:"2020-09-14T07:30:52Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}
- Docker version:
❯ docker info
Client:
 Debug Mode: false

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 2
 Server Version: 19.03.8
 Storage Driver: overlay2
  Backing Filesystem: <unknown>
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version:
 runc version:
 init version:
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.4.0-56-generic
 Operating System: Ubuntu 20.04.1 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.33GiB
 Name: spark-carbon-cto
 ID: 2VUG:W4M7:ONOJ:GABB:KWWA:KILS:KLAA:RJLE:MOCY:YGB2:L6H6:VYP3
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support
- OS:
❯ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
About this issue
- State: closed
- Created 4 years ago
- Comments: 30 (21 by maintainers)
Good news: I just tried building the image and creating the cluster with it and it works, no more CoreDNS crashes 🎉
That is the base image used to build the node image; you need to build a new node image. If you look at the command help, you can specify this base image, or if you use kind from master it will use it by default.
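A sketch of that build step, assuming a Kubernetes source checkout is available where kind expects it; the base image tag here is purely illustrative:
❯ kind build node-image --base-image kindest/base:latest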
The bot closes it automatically 😃. I can't reproduce the issue, and I could only test that if the IP is a loopback it uses the default network, but it would be nice if you can confirm that there are no more hidden issues because of this behavior. We can always reopen, so I will reopen and wait for your confirmation.
Whoah, interesting. Thanks. That is not expected; on Linux we naively expect this not to resolve, because it's something the Docker Desktop app sets up. We will clearly need to rethink that bit.
I have the same issue when creating a new cluster on Ubuntu 20.10. Attached here you can find the logs. While trying to understand why 127.0.0.1 is written inside the resolv.conf file in the container, I noticed that when running the /usr/local/bin/entrypoint script, the command getent ahostsv4 'host.docker.internal' | head -n1 | cut -d' ' -f1 returns 127.0.0.1, and that's what gets written in the file.
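That lookup can be reproduced directly on the host; the command below is exactly the one the entrypoint runs:
❯ getent ahostsv4 'host.docker.internal' | head -n1 | cut -d' ' -f1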
@BenTheElder Hi, thanks for looking into it.
I have created a new cluster from scratch on my system and the issue is still the same. The CoreDNS configuration MD5 is the same.
The kind export can be downloaded from here: https://cloud.juusujanar.eu/index.php/s/CDGadQRcgm73t2a
EDIT: I tested the same installation method on a VM running Debian 10 (kernel 4.19.0-14-amd64) with Docker 20.10.3, and CoreDNS started okay there.
I did not modify the images, just a clean install right now.
EDIT2: Disabled systemd-resolved and resorted to NetworkManager-handled DNS (host /etc/resolv.conf config below), and then CoreDNS started just fine. Found this resource, which says loops happen when the host runs a local DNS cache: https://github.com/coredns/coredns/blob/master/plugin/loop/README.md
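For anyone wanting to try the same workaround, a minimal sketch, assuming a stock Ubuntu setup where /etc/resolv.conf is a symlink to systemd-resolved's stub file (adapt to your own network management):
❯ sudo systemctl disable --now systemd-resolved
❯ sudo rm /etc/resolv.conf              # usually a symlink to the stub-resolver config
❯ sudo systemctl restart NetworkManager # NetworkManager then writes a plain resolv.conf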
CoreDNS has the following resolv.conf file:
Hi @BenTheElder, sure. I guess it's something related to the network I'm attached to, because on another network it's working fine now. I have to double-check it and I'll come back here.