rook: Ganesha NFS export stopped working in rook version 1.4.x

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior:

In previous Rook versions up to 1.3.9 (with Ceph 14.2.6) I always followed the Ceph NFS manual to create an NFS export via NFS-Ganesha, and there was never a problem.

I have tested both the Rook 1.4.0 and 1.4.1 operators with Ceph 15.2.4; in both cases the HTTP POST request to the dashboard API returns status 400 with "Cluster not found: cluster_id=my-nfs".

The mgr dashboard (Web UI) shows "No cluster available" in the Cluster section under NFS -> Create -> Create NFS export.

In practice, this leaves me with no way to create an NFS export for my Rook cluster after upgrading to Rook 1.4.1.
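
For reference, the failing request looks roughly like the sketch below; the dashboard service address, credentials, and payload fields are assumptions reconstructed from the error message and the CRs further down, not details taken from this report:

# Sketch only -- host, credentials and payload fields are assumptions.
# Obtain a dashboard API token (port 7000 since ssl is disabled in cluster.yaml):
curl -s -X POST "http://rook-ceph-mgr-dashboard.rook-ceph:7000/api/auth" \
  -H "Content-Type: application/json" \
  -d '{"username": "admin", "password": "<dashboard-password>"}'

# Create the export; on Rook 1.4.x this is the call that comes back with
# HTTP 400 "Cluster not found: cluster_id=my-nfs":
curl -s -X POST "http://rook-ceph-mgr-dashboard.rook-ceph:7000/api/nfs-ganesha/export" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"cluster_id": "my-nfs", "path": "/", "pseudo": "/cephfs", "fsal": {"name": "CEPH", "fs_name": "myfs"}}'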

Expected behavior:

Create an NFS export for a Ceph shared filesystem by following the Ceph manual.

How to reproduce it (minimal and precise):

  1. Install the Rook v1.4.1 Helm chart (a rough command sketch follows this list)
  2. Create a Ceph cluster
  3. Create a shared filesystem
  4. Create a Rook CephNFS CR
  5. Export the shared filesystem via NFS-Ganesha according to the Ceph manual
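
A rough sketch of steps 1 through 4, assuming a Helm 3 install and the manifests listed below; the chart repository and release names are assumptions, not taken from this report:

# Sketch only -- repository/release names are assumptions; the manifests are the CRs below.
kubectl create namespace rook-ceph
helm repo add rook-release https://charts.rook.io/release
helm install --namespace rook-ceph rook-ceph rook-release/rook-ceph --version v1.4.1
kubectl apply -f cluster.yaml   # CephCluster
kubectl apply -f fs.yml         # CephFilesystem
kubectl apply -f nfs.yml        # CephNFS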

File(s) to submit:

  • Cluster CR (custom resource), typically called cluster.yaml, if necessary
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: ceph/ceph:v15.2.4
    allowUnsupported: true
  dataDirHostPath: /ceph
  skipUpgradeChecks: false
  continueUpgradeAfterChecksEvenIfNotHealthy: false
  mon:
    count: 1
    allowMultiplePerNode: true
  dashboard:
    enabled: True
    ssl: false
  monitoring:
    enabled: false
    rulesNamespace: rook-ceph
  network:
    hostNetwork: false
  rbdMirroring:
    workers: 0
  crashCollector:
    disable: false
  mgr:
    modules:
    - name: pg_autoscaler
      enabled: true
  storage:
    config:
      osdsPerDevice: '1'
    devices:
    - name: vdb
    useAllDevices: false
    useAllNodes: true

fs.yml:

apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: myfs
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 1
  dataPools:
    - failureDomain: host
      replicated:
        size: 1
  preservePoolsOnDelete: true
  metadataServer:
    activeCount: 1
    activeStandby: true
    placement:
       podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - rook-ceph-mds
            topologyKey: kubernetes.io/hostname
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - rook-ceph-mds
              topologyKey: topology.kubernetes.io/zone

nfs.yml:

apiVersion: ceph.rook.io/v1
kind: CephNFS
metadata:
  name: my-nfs
  namespace: rook-ceph
spec:
  rados:
    pool: myfs-data0
    namespace: nfs-ns
  server:
    active: 1
    placement:
    resources:
    priorityClassName:

kubectl -n rook-ceph logs rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j -c nfs-ganesha

24/08/2020 15:20:18 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] main :MAIN :EVENT :nfs-ganesha Starting: Ganesha Version 3.3
24/08/2020 15:20:19 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed
24/08/2020 15:20:19 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper.
24/08/2020 15:20:19 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized.
24/08/2020 15:20:19 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
24/08/2020 15:20:19 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] rados_kv_traverse :CLIENT ID :EVENT :Failed to lst kv ret=-2
24/08/2020 15:20:19 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] rados_cluster_read_clids :CLIENT ID :EVENT :Failed to traverse recovery db: -2
24/08/2020 15:20:22 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] rados_cluster_end_grace :CLIENT ID :EVENT :Failed to remove rec-0000000000000001:my-nfs.a: -2
24/08/2020 15:20:22 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
24/08/2020 15:20:22 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] main :NFS STARTUP :WARN :No export entries found in configuration file !!!
24/08/2020 15:20:22 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] lower_my_caps :NFS STARTUP :EVENT :CAP_SYS_RESOURCE was successfully removed for proper quota management in FSAL
24/08/2020 15:20:22 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] lower_my_caps :NFS STARTUP :EVENT :currenty set capabilities are: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+eip
24/08/2020 15:21:50 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] nfs_Init_svc :DISP :CRIT :Cannot acquire credentials for principal nfs
24/08/2020 15:21:50 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] __Register_program :DISP :MAJ :Cannot register NFS V4 on UDP
24/08/2020 15:21:50 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] nfs_Init_admin_thread :NFS CB :EVENT :Admin thread initialized
24/08/2020 15:21:50 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] nfs_rpc_cb_init_ccache :NFS STARTUP :EVENT :Callback creds directory (/var/run/ganesha) already exists
24/08/2020 15:21:50 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] nfs_rpc_cb_init_ccache :NFS STARTUP :WARN :gssd_refresh_krb5_machine_credential failed (-1765328160:22)
24/08/2020 15:21:50 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] nfs_Start_threads :THREAD :EVENT :Starting delayed executor.
24/08/2020 15:21:50 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] nfs_Start_threads :THREAD :EVENT :gsh_dbusthread was started successfully
24/08/2020 15:21:50 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] nfs_Start_threads :THREAD :EVENT :admin thread was started successfully
24/08/2020 15:21:50 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] nfs_Start_threads :THREAD :EVENT :reaper thread was started successfully
24/08/2020 15:21:50 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] nfs_Start_threads :THREAD :EVENT :General fridge was started successfully
24/08/2020 15:21:50 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] nfs_start :NFS STARTUP :EVENT :-------------------------------------------------
24/08/2020 15:21:50 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] nfs_start :NFS STARTUP :EVENT :             NFS SERVER INITIALIZED
24/08/2020 15:21:50 : epoch 5f43dab2 : rook-ceph-nfs-my-nfs-a-587bcd5ffb-n8n8j : nfs-ganesha-1[main] nfs_start :NFS STARTUP :EVENT :-------------------------------------------------
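
Since the startup log ends with "No export entries found in configuration file !!!", one way to check whether any Ganesha configuration objects exist in the RADOS pool and namespace declared in nfs.yml is to list them from the Rook toolbox; the conf-* object name below is an assumption about the naming scheme, not something confirmed in this report:

# Sketch only -- run inside the rook-ceph-tools pod; the conf-* object name is an assumption.
rados -p myfs-data0 -N nfs-ns ls
# If a per-daemon config object shows up, dump it to inspect its contents, e.g.:
rados -p myfs-data0 -N nfs-ns get conf-my-nfs.a /tmp/conf-my-nfs.a && cat /tmp/conf-my-nfs.a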

Environment:

  • OS (e.g. from /etc/os-release): Ubuntu 20.04
  • Kernel (e.g. uname -a): Linux admirito 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  • Cloud provider or hardware configuration:
  • Rook version (use rook version inside of a Rook Pod): 1.4.1
  • Storage backend version (e.g. for ceph do ceph -v): 15.2.4 octopus (stable)
  • Kubernetes version (use kubectl version): 1.18.4
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): private cluster with kubespray
  • Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox): HEALTH_OK

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 21 (9 by maintainers)

Most upvoted comments

This is an issue in the Ceph dashboard. To be clear, see the attached screenshot, where the NFS cluster list does not show anything.

This issue repros for me in a test cluster with the following combinations:

  • Rook v1.4 with Ceph v14 or v15
  • Rook v1.3 with Ceph v15

This issue does not repro with this combination:

  • Rook v1.3 with Ceph v14

@jtlayton Do you know how the NFS cluster is populated in the UI or who can help us out here?
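
For context, in Nautilus and Octopus the dashboard does not read the CephNFS CR directly; it appears to discover Ganesha clusters through a dashboard setting that points at the RADOS pool/namespace holding the Ganesha config objects. A sketch of inspecting and setting it from the toolbox, reusing the pool/namespace from the CRs above (an assumption about where to look, not a confirmed fix):

# Sketch only -- the mapping value reuses the cluster_id, pool and namespace from the CRs above.
ceph dashboard get-ganesha-clusters-rados-pool-namespace
ceph dashboard set-ganesha-clusters-rados-pool-namespace my-nfs:myfs-data0/nfs-ns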

[Screenshot: Screen Shot 2020-08-25 at 4 24 50 PM]