kind: coredns CrashLoopBackOff on Ubuntu 20.04
What happened:
❯ k get pods
NAME                                                        READY   STATUS             RESTARTS   AGE
coredns-66bff467f8-j5cf4                                    0/1     CrashLoopBackOff   1          54s
coredns-66bff467f8-l6gtz                                    0/1     CrashLoopBackOff   1          54s
etcd-retrogames-k8s-dev-control-plane                       1/1     Running            0          66s
kindnet-wgxw8                                               1/1     Running            0          54s
kube-apiserver-retrogames-k8s-dev-control-plane             1/1     Running            0          66s
kube-controller-manager-retrogames-k8s-dev-control-plane    1/1     Running            0          66s
kube-proxy-nnkwz                                            1/1     Running            0          54s
kube-scheduler-retrogames-k8s-dev-control-plane             1/1     Running            0          66s
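A quick way to see why the CoreDNS pods are crash-looping (not part of the original report; the pod name is taken from the listing above) is to pull the logs of the previous, crashed container instance:
❯ kubectl -n kube-system logs --previous coredns-66bff467f8-j5cf4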
What you expected to happen:
CoreDNS should start and stay Running.
How to reproduce it (as minimally and precisely as possible):
❯ cat /etc/issue
Ubuntu 20.04.1 LTS \n \l
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: kindest/node:v1.19.1@sha256:98cf5288864662e37115e362b23e4369c8c4a408f99cbc06e58ac30ddc721600
  kubeadmConfigPatches:
  - |
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        node-labels: "ingress-ready=true"
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
  - containerPort: 443
    hostPort: 443
    protocol: TCP
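For reference, a cluster can be created from this config along these lines (the config filename is illustrative; the cluster name matches the node names in the pod listing above):
❯ kind create cluster --name retrogames-k8s-dev --config kind-config.yaml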
❯ docker exec -it retrogames-k8s-dev-control-plane cat /etc/resolv.conf
search homenet.telecomitalia.it
nameserver 127.0.0.1
options ndots:0
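A loopback nameserver inside the node container means CoreDNS ends up forwarding queries to itself; its loop plugin detects this and the process exits, which produces the CrashLoopBackOff above. On a stock Ubuntu 20.04 host, where systemd-resolved is in use by default, the real upstream resolvers (as opposed to the local stub) can be checked with:
❯ cat /run/systemd/resolve/resolv.conf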
Environment:
- kind version:
kind v0.9.0 go1.15.2 linux/amd64
- Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.4", GitCommit:"d360454c9bcd1634cf4cc52d1867af5491dc9c5f", GitTreeState:"clean", BuildDate:"2020-11-11T13:17:17Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.1", GitCommit:"206bcadf021e76c27513500ca24182692aabd17e", GitTreeState:"clean", BuildDate:"2020-09-14T07:30:52Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}
- Docker version:
❯ docker info
Client:
 Debug Mode: false

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 2
 Server Version: 19.03.8
 Storage Driver: overlay2
  Backing Filesystem: <unknown>
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version:
 runc version:
 init version:
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.4.0-56-generic
 Operating System: Ubuntu 20.04.1 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.33GiB
 Name: spark-carbon-cto
 ID: 2VUG:W4M7:ONOJ:GABB:KWWA:KILS:KLAA:RJLE:MOCY:YGB2:L6H6:VYP3
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support
- OS:
❯ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
About this issue
- State: closed
- Created 4 years ago
- Comments: 30 (21 by maintainers)
Good news: I just tried building the image and creating the cluster with it and it works, no more CoreDNS crashes 🎉
That is the base image used to build the node image; you need to build a new node image. If you look at the command help, you can specify this base image, or if you use kind from master it will use it by default.
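A sketch of that build step, assuming a Kubernetes source checkout is available where kind expects it; the base image tag here is purely illustrative:
❯ kind build node-image --base-image kindest/base:latest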
The bot closes it automatically 😃. I can't reproduce the issue, and I could only test that if the IP is a loopback it uses the default network, but it would be nice if you can confirm that there are no more hidden issues because of this behavior. We can always reopen, so I will reopen and wait for your confirmation.
Whoah, interesting. Thanks. That is not expected; on Linux we naively expect this not to resolve, because it's something the Docker Desktop app sets up. We will clearly need to rethink that bit.
I have the same issue when creating a new cluster on Ubuntu 20.10. Attached here you can find the logs. While trying to understand why 127.0.0.1 is written inside the resolv.conf file in the container, I noticed that when running the /usr/local/bin/entrypoint script, the command getent ahostsv4 'host.docker.internal' | head -n1 | cut -d' ' -f1 returns 127.0.0.1, and that's what gets written in the file.
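That lookup can be reproduced directly on the host; the command below is exactly the one the entrypoint runs:
❯ getent ahostsv4 'host.docker.internal' | head -n1 | cut -d' ' -f1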
@BenTheElder Hi, thanks for looking into it.
I have created a new cluster from scratch on my system and the issue is still the same. The CoreDNS configuration MD5 is the same.
The kind export can be downloaded from here: https://cloud.juusujanar.eu/index.php/s/CDGadQRcgm73t2a
EDIT: I tested the same installation method on a VM running Debian 10 (kernel 4.19.0-14-amd64) with Docker 20.10.3, and CoreDNS started okay there.
I did not modify the images, just a clean install right now.
EDIT2: Disabled systemd-resolved and resorted to NetworkManager-handled DNS (host /etc/resolv.conf config below), and then CoreDNS started just fine. Found this resource, which says loops happen when the host runs a local DNS cache: https://github.com/coredns/coredns/blob/master/plugin/loop/README.md
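For anyone wanting to try the same workaround, a minimal sketch, assuming a stock Ubuntu setup where /etc/resolv.conf is a symlink to systemd-resolved's stub file (adapt to your own network management):
❯ sudo systemctl disable --now systemd-resolved
❯ sudo rm /etc/resolv.conf              # usually a symlink to the stub-resolver config
❯ sudo systemctl restart NetworkManager # NetworkManager then writes a plain resolv.conf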
CoreDNS has the following resolv.conf file:
Hi @BenTheElder, sure. I guess it's something related to the network I'm attached to, because on another network it's working fine now. I have to double-check it and I'll come back here.