kubernetes: Failed to generate pod sandbox config if missing /etc/resolv.conf

What happened:

I ran kubeadm init on a node and it took a long time waiting for the kubelet (to boot up the control plane as static Pods from the directory “/etc/kubernetes/manifests”). It eventually timed out, so I checked the kubelet journal and found these errors:

Jun 23 18:33:35 10-6-150-51 kubelet[32141]: E0623 18:33:35.148371 32141 kuberuntime_sandbox.go:41] "Failed to generate sandbox config for pod" err="open /etc/resolv.conf: no such file or directory" pod="kube-system/kube-scheduler-10-6-150-51"
Jun 23 18:33:35 10-6-150-51 kubelet[32141]: E0623 18:33:35.148415 32141 kuberuntime_manager.go:790] "CreatePodSandbox for pod failed" err="open /etc/resolv.conf: no such file or directory" pod="kube-system/kube-scheduler-10-6-150-51"
Jun 23 18:33:35 10-6-150-51 kubelet[32141]: E0623 18:33:35.148499 32141 pod_workers.go:190] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"kube-scheduler-10-6-150-51_kube-system(39f9fcbd5659a4af7687f0af6d8320f0)\" with CreatePodSandboxError: \"Failed to generate sandbox config for pod \\\"kube-scheduler-10-6-150-51_kube-system(39f9fcbd5659a4af7687f0af6d8320f0)\\\": open /etc/resolv.conf: no such file or directory\"" pod="kube-system/kube-scheduler-10-6-150-51" podUID=39f9fcbd5659a4af7687f0af6d8320f0

Looks like the same problem as https://github.com/moby/moby/issues/4861

What you expected to happen:

The Pod should be created successfully, because /etc/resolv.conf is not an essential file in the filesystem (see http://man7.org/linux/man-pages/man5/resolv.conf.5.html):

If this file does not exist, only the name server on the local machine will be queried

How to reproduce it (as minimally and precisely as possible):

  1. mv /etc/resolv.conf somewhere else
  2. run “kubeadm init” and watch kubelet journal

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): Kubernetes v1.21.1
  • OS (e.g: cat /etc/os-release): CentOS Linux release 7.9

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 27 (24 by maintainers)

Most upvoted comments

if you need to distinguish between unset and set-but-empty, the config file needs to use a pointer, and the defaulting needs to only default on nil, not “”
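That pattern can be sketched in a few lines of Go; the `Config` type and `applyDefaults` helper below are hypothetical illustrations of the pointer-based approach, not kubelet's actual types:

```go
package main

import "fmt"

// Config is a hypothetical config struct: a *string field can distinguish
// "unset" (nil) from "set but empty" (pointer to ""), which a plain
// string field cannot.
type Config struct {
	ResolvConf *string
}

// applyDefaults fills in the default only when the field is nil, so an
// explicitly configured empty string survives defaulting.
func applyDefaults(c *Config) {
	if c.ResolvConf == nil {
		def := "/etc/resolv.conf"
		c.ResolvConf = &def
	}
}

func main() {
	var unset Config
	applyDefaults(&unset)
	fmt.Println(*unset.ResolvConf) // nil field receives the default

	empty := ""
	set := Config{ResolvConf: &empty}
	applyDefaults(&set)
	fmt.Printf("%q\n", *set.ResolvConf) // explicit "" is kept
}
```

With a plain `string` field the defaulting code cannot tell the two cases apart, so it would clobber a deliberately empty value.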

what if the resolv.conf is added later? this can race … or you have some pods with a dummy resolv.conf and others with a good one

The resolv.conf file that pods on a node use can change over time, because it is loaded when the pod is created. This issue is not about the race itself.

I think the main issue here is whether we want kubelet to support the absence of resolv.conf or not. I am not claiming that it is necessary; however, it seems that running without resolv.conf is also a valid configuration.

If this file does not exist, only the name server on the local
machine will be queried, and the search list contains the local
domain name determined from the hostname.

discussion was going on in PR #103183 (comment); it should be triaged first though, as it is not clear whether this is a bug or a new feature Kubernetes wants to support …

I have only seen similar issues in some private clouds or on bare-metal private machines. It would be a problem for them. IMO, this may not be a bug, but something we can tolerate.

There are three ways:

  1. like PR “Kubelet use default DNS configuration if host’s resolvConf file does not exist” #103183, return an empty resolv.conf file. (I prefer this because in some cases /etc/resolv.conf has no nameservers and the pod can still start; file-not-exists is the same to me as a file with no nameservers.)

what if the resolv.conf is added later? this can race … or you have some pods with a dummy resolv.conf and others with a good one

  2. we can do some pre-check in installers or at kubelet start-up.

if the resolv.conf is needed/required for a cluster to work well, which I think it is, then I think this is a good option

  3. this is expected behavior.

at least it is the behavior we have always had 😄 😄

Yes, it looks ok to me, if there is no better suggestion.

@liggitt we have one incongruence between config and flags in kubelet, can you advise? https://github.com/kubernetes/kubernetes/issues/103110#issuecomment-897613618

I guess you’ve mentioned you went through this pain before

I think the main issue here is whether we want kubelet to support the absence of resolv.conf or not. I am not claiming that it is necessary; however, it seems that running without resolv.conf is also a valid configuration.

Good point @gjkim42, that is really the discussion.

I was checking the code, and there is something important here we were not considering: the resolv.conf path is a configuration option that defaults to /etc/resolv.conf

https://github.com/kubernetes/kubernetes/blob/27b02a3e37f6ae5c121e0e412520199b50fb8290/staging/src/k8s.io/kubelet/config/v1beta1/types.go#L596-L604

https://github.com/kubernetes/kubernetes/blob/27b02a3e37f6ae5c121e0e412520199b50fb8290/cmd/kubelet/app/options/options.go#L506-L507
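For reference, the option surfaces in the kubelet config file roughly like this (a sketch; the value shown is just the documented default):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Path kubelet reads to assemble pod DNS config; defaults to /etc/resolv.conf.
resolvConf: /etc/resolv.conf
```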

Personally, I don’t think kubelet should keep working if I (as admin) configure resolv.conf with a wrong path (/tmp/non.existant.resolv.conf); that is a mistake I made in my configuration, and as admin I should fix it.

Kubernetes has some requirements; the absence of a resolv.conf is valid for Linux, but that does not necessarily mean it is valid for Kubernetes, and in this case I think it is legitimate for Kubernetes to require that a resolv.conf file exist.

what if the resolv.conf is added later? this can race … or you have some pods with a dummy resolv.conf and others with a good one

That would be a problem once this PR is merged:

  1. No resolv.conf file: Pod created with empty resolv.conf file.
  2. the correct resolv.conf is added.
  3. New pods will use the correct resolv.conf, but the old Pods keep using the empty resolv.conf unless they are recreated.

Current behavior:

  1. No resolv.conf file: Pod cannot be created
  2. the correct resolv.conf is added.
  3. All running pods will use the correct resolv.conf.

However, the scenario where resolv.conf is incorrect at first and is later changed to the correct one is also a problem.

discussion was going on in PR #103183 (comment); it should be triaged first though, as it is not clear whether this is a bug or a new feature Kubernetes wants to support …

I have only seen similar issues in some private clouds or on bare-metal private machines. It would be a problem for them. IMO, this may not be a bug, but something we can tolerate.

There are three ways:

  1. like PR #103183, return an empty resolv.conf file.
  2. we can do some pre-check in installers or kubelet start-up.
  3. this is expected behavior.

I prefer option 1 because in some cases /etc/resolv.conf has no nameservers and the pod can still start; file-not-exists is the same to me as a file with no nameservers. However, as @aojea points out below, if /etc/resolv.conf is created or updated later, what is the behavior? (The pod should use the one from when the pod was created and should not be updated.)

I will try to create a new resolv.conf here if the error is “file not exist”.

Even if we consider this a bug, IMHO, creating an /etc/resolv.conf file on the host machine is a bad idea. I think generating a default resolver configuration is a better approach.
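A minimal Go sketch of that in-memory approach, assuming a hypothetical readResolvConf helper (not kubelet's actual code): a missing file is treated like a file with no nameservers, and nothing is ever written to the host filesystem:

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// readResolvConf returns the file's contents, or an empty string when the
// file does not exist, instead of propagating the "no such file" error.
// Any other read error is still returned, so genuine misconfiguration
// (e.g. a permission problem) is not silently swallowed.
func readResolvConf(path string) (string, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return "", nil // missing file == file with no nameservers
	}
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	conf, err := readResolvConf("/definitely/not/there/resolv.conf")
	fmt.Printf("conf=%q err=%v\n", conf, err)
}
```

Note this keeps the race discussed above: a pod created before the real resolv.conf appears would get the empty config until it is recreated.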