kubernetes: NFS PV tests are failing on GCI

Relevant test failures: https://github.com/kubernetes/kubernetes/issues/33401

I ran the e2e test “create a PV and a pre-bound PVC: test write access” and it fails on GCI. After some debugging, I found that the mount command is failing:

Sep 24 21:17:31.083: INFO: At 2016-09-24 21:11:05 -0700 PDT - event for write-pod-1mzyy: {kubelet e2e-test-vishnuk-minion-group-jo4u} FailedMount: MountVolume.SetUp failed for volume "kubernetes.io/nfs/13fd01f0-82d6-11e6-959f-42010af00002-nfs-1nmbf" (spec.Name: "nfs-1nmbf") pod "13fd01f0-82d6-11e6-959f-42010af00002" (UID: "13fd01f0-82d6-11e6-959f-42010af00002") with: mount failed: exit status 32
Mounting arguments: 10.180.0.10:/exports /var/lib/kubelet/pods/13fd01f0-82d6-11e6-959f-42010af00002/volumes/kubernetes.io~nfs/nfs-1nmbf nfs []
Output: mount: wrong fs type, bad option, bad superblock on 10.180.0.10:/exports,
       missing codepage or helper program, or other error
       (for several filesystems (e.g. nfs, cifs) you might
       need a /sbin/mount.<type> helper program)

       In some cases useful info is found in syslog - try
       dmesg | tail or so.
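
For context, the PV and pre-bound PVC behind that mount would look roughly like the sketch below. Only the server address, the export path, and the volume name nfs-1nmbf are taken from the failure message above; everything else (capacity, access mode, claim name) is an assumed placeholder, not the actual e2e fixture.

# Hypothetical reconstruction of the objects the e2e test creates;
# only server, path, and the PV name come from the failure message above.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-1nmbf              # PV name from the FailedMount event
spec:
  capacity:
    storage: 1Gi               # assumed
  accessModes:
    - ReadWriteOnce            # assumed
  nfs:
    server: 10.180.0.10        # from the "Mounting arguments" line
    path: /exports             # from the "Mounting arguments" line
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-claim              # assumed name
spec:
  accessModes:
    - ReadWriteOnce            # assumed
  resources:
    requests:
      storage: 1Gi             # assumed
  volumeName: nfs-1nmbf        # "pre-bound" to the PV above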

Looking at the kernel logs, I got the following information:

Sep 25 04:10:26 e2e-test-vishnuk-minion-group-jo4u kernel: RPC: Registered named UNIX socket transport module.
Sep 25 04:10:26 e2e-test-vishnuk-minion-group-jo4u kernel: RPC: Registered udp transport module.
Sep 25 04:10:26 e2e-test-vishnuk-minion-group-jo4u kernel: RPC: Registered tcp transport module.
Sep 25 04:10:26 e2e-test-vishnuk-minion-group-jo4u kernel: RPC: Registered tcp NFSv4.1 backchannel transport module.
Sep 25 04:10:26 e2e-test-vishnuk-minion-group-jo4u kernel: Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
Sep 25 04:10:26 e2e-test-vishnuk-minion-group-jo4u kernel: NFSD: attempt to initialize umh client tracking in a container ignored.
Sep 25 04:10:26 e2e-test-vishnuk-minion-group-jo4u kernel: NFSD: attempt to initialize legacy client tracking in a container ignored.
Sep 25 04:10:26 e2e-test-vishnuk-minion-group-jo4u kernel: NFSD: Unable to initialize client recovery tracking! (-22)
Sep 25 04:10:26 e2e-test-vishnuk-minion-group-jo4u kernel: NFSD: starting 10-second grace period (net ffff8801e8941840)

I looked at the NFS server logs and found the following:

Serving /exports
rpcinfo: can't contact rpcbind: : RPC: Unable to receive; errno = Connection refused
Starting rpcbind
exportfs: /exports does not support NFS export
NFS started

I’d like to know whether this is a test setup issue or a base image issue. NFS has been confirmed to work on GCI by other users of GCI; even so, I’d like to understand the reason for the NFS PV test failure on GCI.

cc @matchstick @saad-ali @kubernetes/sig-storage @thockin @Amey-D

Most upvoted comments

If you are using the examples/volumes/nfs example, please make the following changes (a sketch of both edits follows below).

Edit examples/volumes/nfs/nfs-pv.yaml and change the last line to path: "/"

Edit examples/volumes/nfs/nfs-server-rc.yaml and change the image to the one that enables NFSv4: image: gcr.io/google_containers/volume-nfs:0.8
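
A minimal sketch of the two edited manifests, assuming the stock layout of examples/volumes/nfs; the object names, capacity, ports, and server value are illustrative placeholders, and only path: "/" and the volume-nfs:0.8 image are the actual fixes described above.

# nfs-pv.yaml (sketch) -- only the final "path" line is the fix
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs
spec:
  capacity:
    storage: 1Mi
  accessModes:
    - ReadWriteMany
  nfs:
    server: "<nfs-server-service-ip>"   # placeholder for the NFS server address
    path: "/"                           # changed from "/exports"
---
# nfs-server-rc.yaml (sketch) -- only the "image" line is the fix
apiVersion: v1
kind: ReplicationController
metadata:
  name: nfs-server
spec:
  replicas: 1
  selector:
    role: nfs-server
  template:
    metadata:
      labels:
        role: nfs-server
    spec:
      containers:
        - name: nfs-server
          image: gcr.io/google_containers/volume-nfs:0.8   # NFSv4-enabled image
          securityContext:
            privileged: true
          ports:
            - name: nfs
              containerPort: 2049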

The fixes are underway (< 1 week).

On Sun, Oct 16, 2016 at 7:18 AM, Karl Stoney notifications@github.com wrote:

+1, this is lame 😦


For me, using the gcr.io/google_containers/volume-nfs:0.8 image as @jingxu97 suggested did the trick.

Thx a lot

@vishh any news about the fix?

@jingxu97 Where do we use "/" instead of "/exports"? Won’t that mount the root directory?

The support for NFS has also been backported to release 1.4.7, which should be available soon. We also have some NFS tests running currently. Thanks, and please let me know if you have any issues when using NFS.

On Mon, Dec 5, 2016 at 3:10 PM, AndryBray notifications@github.com wrote:

@vishh https://github.com/vishh thank you for your suggestion, but at the moment I’m not able to set up a new env. I can just try it with k8s on the GC platform. Is there any way to enable it there? I see the latest version is 1.4.6.

Has nobody tested it in the meantime?

Thank you


- Jing

Hi @matchstick, do you think you’ll be able to release it before the new year? It would be great news, because many projects are blocked by this issue.

Thank you for your work

@Stono We are actively working on this, and I think we have a solution that is going through final testing to get it working on GCI. Trust me, it is at the top of the stack for a few engineers right now. @jingxu97 is leading this effort, and she can hopefully give an update on its status.

If you are curious about more details, please come to the upcoming storage SIG meeting or email the storage SIG mailing list; we are going to discuss it in detail there. For now, Debian is the prescribed workaround.

We have plans to make improvements in this area on several dimensions and we are very sorry for the inconvenience right now.

+1 for the NFS fix. Also, it would be nice if the GKE team had a more detailed release page covering the base image releases and features for the container-vm and gci images. I used the container-vm workaround to maintain NFS support, but the lack of info on which images had the CVE-2016-5195 patch was a little painful. I would like to see more details moving forward.