minikube: nfs: Failed to resolve server nfs-server.default.svc.cluster.local: Name or service not known
BUG REPORT
Environment:
- Minikube version: v0.30.0
- OS: Fedora 29
- VM Driver: virtualbox, kvm2
- ISO version: v0.30.0
- Others:
  - Kubernetes version: tested on v1.10.0 and v1.13.0
  - Tested with both the coredns and kube-dns minikube addons
What happened:
The NFS volume fails to mount due to a DNS error (`Failed to resolve server nfs-server.default.svc.cluster.local: Name or service not known`). This problem does not occur when the same manifests are deployed on GKE.
What you expected to happen: The NFS volume is mounted without an error.
How to reproduce it (as minimally and precisely as possible):
- Start nfs-server:
```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nfs-server
spec:
  replicas: 1
  selector:
    matchLabels:
      role: nfs-server
  template:
    metadata:
      labels:
        role: nfs-server
    spec:
      containers:
      - name: nfs-server
        image: gcr.io/google_containers/volume-nfs:0.8
        ports:
        - name: nfs
          containerPort: 2049
        - name: mountd
          containerPort: 20048
        - name: rpcbind
          containerPort: 111
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /exports
          name: exports
      volumes:
      - name: exports
        emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: nfs-server
spec:
  ports:
  - name: nfs
    port: 2049
  - name: mountd
    port: 20048
  - name: rpcbind
    port: 111
  selector:
    role: nfs-server
```
- Start a service consuming the NFS volume (e.g. busybox), as shown below:
```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: nfs-busybox
spec:
  replicas: 1
  selector:
    name: nfs-busybox
  template:
    metadata:
      labels:
        name: nfs-busybox
    spec:
      containers:
      - image: busybox
        command:
        - sh
        - -c
        - 'while true; do date > /mnt/index.html; hostname >> /mnt/index.html; sleep $(($RANDOM % 5 + 5)); done'
        imagePullPolicy: IfNotPresent
        name: busybox
        volumeMounts:
        - name: nfs
          mountPath: "/mnt"
      volumes:
      - name: nfs
        nfs:
          server: nfs-server.default.svc.cluster.local
          path: "/"
```
Output of minikube logs (if applicable):
`kubectl describe pod nfs-busybox-...` shows this error:
```
Warning  FailedMount  4m  kubelet, minikube  MountVolume.SetUp failed for volume "nfs" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/ab2e9ad4-f88b-11e8-8a56-4004c9e1505b/volumes/kubernetes.io~nfs/nfs --scope -- mount -t nfs nfs-server.default.svc.cluster.local:/ /var/lib/kubelet/pods/ab2e9ad4-f88b-11e8-8a56-4004c9e1505b/volumes/kubernetes.io~nfs/nfs
Output: Running scope as unit: run-r23cae2998bf349df8046ac3c61bfe4e9.scope
mount.nfs: Failed to resolve server nfs-server.default.svc.cluster.local: Name or service not known
```
This indicates a problem with DNS resolution of nfs-server.default.svc.cluster.local.
Note: The NFS volume is mounted successfully when the server is specified by its ClusterIP instead of its domain name.
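Concretely, the working variant of the volume spec looks like this (a sketch; `10.105.22.251` is the Service IP from the nslookup output below and will differ per cluster, e.g. look it up with `kubectl get svc nfs-server -o jsonpath='{.spec.clusterIP}'`):

```yaml
volumes:
- name: nfs
  nfs:
    server: 10.105.22.251  # ClusterIP of the nfs-server Service instead of its DNS name
    path: "/"
```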
Anything else we need to know: The same problem was already reported for a previous version in #2218, but that issue was closed due to inactivity of the author and no one seems to have really looked into it. There is a workaround, but it has to be re-applied every time a minikube VM is created.
When running `kubectl exec -ti nfs-busybox-... -- nslookup nfs-server.default.svc.cluster.local`:

```
Server:    10.96.0.10
Address:   10.96.0.10:53

Name:      nfs-server.default.svc.cluster.local
Address:   10.105.22.251

*** Can't find nfs-server.default.svc.cluster.local: No answer
```
Strangely, the service ClusterIP is present in the output (when using kube-dns, the ClusterIP part is missing completely).
About this issue
- State: open
- Created 6 years ago
- Reactions: 18
- Comments: 32 (3 by maintainers)
Commits related to this issue
- set nfs-server IP per env this has to be hardcoded because of https://github.com/kubernetes/minikube/issues/3417 — committed to pace-running/pace by cz8s 4 years ago
@willzhang If you are using NFS CSI driver v4.1.0 or v4.0.0, try changing the `dnsPolicy` of `csi-nfs-controller` and `csi-nfs-node` to `ClusterFirstWithHostNet`; it works for me.
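As a sketch, that change could be applied in place with `kubectl patch` (assuming the driver is installed in kube-system under its default resource names):

```sh
# Switch the controller Deployment and node DaemonSet to ClusterFirstWithHostNet
kubectl -n kube-system patch deployment csi-nfs-controller --type merge \
  -p '{"spec":{"template":{"spec":{"dnsPolicy":"ClusterFirstWithHostNet"}}}}'
kubectl -n kube-system patch daemonset csi-nfs-node --type merge \
  -p '{"spec":{"template":{"spec":{"dnsPolicy":"ClusterFirstWithHostNet"}}}}'
```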
I was able to solve this problem by creating a service with a static clusterIP and then mounting to the IP instead of the service name. No DNS required. This is working nicely on Azure; I haven't tried elsewhere.
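A sketch of that static-clusterIP approach, reusing the Service from the bug report (`10.0.200.2` is the value mentioned in the next comment; it must fall inside the cluster's service CIDR):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nfs-server
spec:
  clusterIP: 10.0.200.2  # pinned, so pods can mount by a stable IP instead of a DNS name
  ports:
  - name: nfs
    port: 2049
  - name: mountd
    port: 20048
  - name: rpcbind
    port: 111
  selector:
    role: nfs-server
```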
In my case, I'm using an HDFS NFS Gateway and chose `10.0.200.2` for the clusterIP.

@tamalsaha Yes, I have seen it, but only a workaround has been posted for that issue, not an actual fix.
The problem is that the components responsible for NFS storage backends do not use the cluster-internal DNS but instead try to resolve the NFS server with the DNS configuration of the worker node itself. One way to make this work would be a hosts-file entry on the worker nodes mapping nfs-server.default.svc.cluster.local to the nfs-server's IP address, but this is just a quick and dirty hack-around (sketched below).
But it's just odd that this component is not able to use the cluster-internal DNS resolution. That would make much more sense and be more intuitive to use.
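On minikube, that hack-around might look like the following sketch (the Service IP has to be looked up per cluster, and the entry re-added whenever the VM is recreated):

```sh
# Quick and dirty: map the service DNS name to its ClusterIP on the node itself
NFS_IP=$(kubectl get svc nfs-server -o jsonpath='{.spec.clusterIP}')
minikube ssh "echo '$NFS_IP nfs-server.default.svc.cluster.local' | sudo tee -a /etc/hosts"
```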
Apologies, I'm not a Minikube user, but this is the most apt issue I've found for the problems I'm having. I'm experiencing these exact problems:
- Mounting the volume by service name (nfs-server.default.svc.cluster.local) doesn't work during the ContainerCreating phase.
- Once the pod is running, `nslookup` in there resolves the domain just fine.

Based on my googling efforts so far, this seems to be a Kubernetes issue where the NFS mount is being set up before the container can reach coredns. Perhaps an initialization-order problem?
The same happens when using csi-driver-nfs:
https://github.com/kubernetes-csi/csi-driver-nfs/blob/master/deploy/example/nfs-provisioner/README.md
For anyone else running into this in general (not only with minikube), I've made a small image + daemonset that basically does the latter option mentioned above (a daemonset updating the host's /etc/systemd/resolved.conf). ~~Should work in most scenarios where the cloud provider isn't doing something too funky with their DNS config: https://github.com/Tristan971/kube-enable-coredns-on-node~~

(Bit dirty/ad-hoc in its current state, but could be made to support more host setups.) EDIT: Brian's solution, right below, is the best current solution.
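This is not the actual kube-enable-coredns-on-node code, just a minimal sketch of the idea (the cluster DNS IP `10.96.0.10` is taken from the nslookup output above; the image and names are assumptions). A real version would also have to restart systemd-resolved on the host for the change to take effect:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: enable-coredns-on-node
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: enable-coredns-on-node
  template:
    metadata:
      labels:
        app: enable-coredns-on-node
    spec:
      tolerations:
      - operator: Exists  # run on every node, tainted ones included
      containers:
      - name: patch-resolved
        image: busybox:1.36
        command:
        - sh
        - -c
        # Idempotently append the cluster DNS IP to the host's resolved.conf,
        # then block forever so the DaemonSet pod stays Running.
        - |
          grep -q '^DNS=10.96.0.10' /host/etc/systemd/resolved.conf \
            || echo 'DNS=10.96.0.10' >> /host/etc/systemd/resolved.conf
          while true; do sleep 3600; done
        volumeMounts:
        - name: systemd-config
          mountPath: /host/etc/systemd
      volumes:
      - name: systemd-config
        hostPath:
          path: /etc/systemd
```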
Well, I'm running into the same issue on EKS as well. By specifying the NFS server IP directly, it just works. Is this a known issue on EKS too, or should I move to EFS on AWS? 😦
For anyone else finding themselves in the same situation who can't use the `ClusterIP` service, I was also able to get it to work using the NFS CSI Driver like @fosmjo mentioned above. Apparently `v4.4.0` defaults to the necessary `dnsPolicy` as well, so no configuration is needed beyond their default helm chart. Figured I'd drop a full example for copy pasta. Installed the helm chart from their repo:
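The commands themselves didn't survive the scrape; per the csi-driver-nfs README, the install is along these lines (the repo URL and version come from that README, not from this thread):

```sh
helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs \
  --namespace kube-system \
  --version v4.4.0
```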
I'm running NFS inside my cluster using the `gp2` StorageClass to create an EBS-backed volume for my deployment; here's my template:
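The template itself was lost in this scrape; below is a minimal reconstruction of the idea (an in-cluster NFS server whose export directory lives on an EBS-backed `gp2` PVC; the PVC name and size are my assumptions, and a Service like the one at the top of this issue is still needed to make it reachable as `nfs-server.default.svc.cluster.local`):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-data
spec:
  storageClassName: gp2  # EBS-backed on EKS
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-server
spec:
  replicas: 1
  selector:
    matchLabels:
      role: nfs-server
  template:
    metadata:
      labels:
        role: nfs-server
    spec:
      containers:
      - name: nfs-server
        image: gcr.io/google_containers/volume-nfs:0.8
        ports:
        - name: nfs
          containerPort: 2049
        - name: mountd
          containerPort: 20048
        - name: rpcbind
          containerPort: 111
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /exports
          name: exports
      volumes:
      - name: exports
        persistentVolumeClaim:
          claimName: nfs-data  # EBS-backed instead of emptyDir
```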
Lastly, create the StorageClass, PVC, and Deployment that will mount your NFS share:
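These manifests weren't preserved either; here is a sketch using the driver's `nfs.csi.k8s.io` provisioner (the `server`/`share` parameters follow the driver's documented StorageClass format; all names and sizes are illustrative):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-csi
provisioner: nfs.csi.k8s.io
parameters:
  server: nfs-server.default.svc.cluster.local  # resolved by the CSI driver's pods
  share: /
reclaimPolicy: Delete
volumeBindingMode: Immediate
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-share
spec:
  storageClassName: nfs-csi
  accessModes: [ReadWriteMany]
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-consumer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nfs-consumer
  template:
    metadata:
      labels:
        app: nfs-consumer
    spec:
      containers:
      - name: busybox
        image: busybox
        command: [sh, -c, 'while true; do date > /mnt/index.html; sleep 10; done']
        volumeMounts:
        - name: nfs
          mountPath: /mnt
      volumes:
      - name: nfs
        persistentVolumeClaim:
          claimName: nfs-share
```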
From what I can tell, the only real fix would be for the k8s node itself to have access to k8s's coredns, which is responsible for resolving these names. However, in my experience most k8s nodes use their own DNS, independent of k8s.