ceph-csi: CephFS mount syntax not updated for Quincy

Describe the bug

Apparently there was a significant change in the mount.ceph syntax between Ceph Pacific and Quincy. However, the Ceph-CSI code does not seem to have been updated to support the new syntax.
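
For reference, the two device-string formats look roughly like this (the mount point and secret file path are placeholders; the monitor IPs, fsid and fs name are taken from my setup described below):

# Pacific and earlier (old syntax)
mount -t ceph 192.168.1.10,192.168.1.11,192.168.1.12:/ /mnt -o name=admin,secretfile=/path/to/admin.key,fs=nomadfs

# Quincy (new syntax)
mount -t ceph admin@67b72852-d1b8-45ad-b1f8-edb8c150ff9b.nomadfs=/ /mnt -o secretfile=/path/to/admin.key,mon_addr=192.168.1.10/192.168.1.11/192.168.1.12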

I use Nomad 1.3.1 and am trying to use Ceph-CSI to provide CephFS-based volumes to Nomad jobs. I tried Ceph-CSI 3.6.2 (which is already based on Quincy) to mount a CephFS volume from a cluster running Ceph 17.2.0.

I use Nomad instead of Kubernetes, but I don't think that affects this bug.

Environment details

  • Image/version of Ceph CSI driver : 3.6.2
  • Helm chart version : N/A
  • Kernel version : 5.15.41-0-lts
  • Mounter used for mounting PVC (for CephFS it's fuse or kernel; for RBD it's krbd or rbd-nbd) : kernel
  • Kubernetes cluster version : N/A
  • Ceph cluster version : 17.2.0

Steps to reproduce

Steps to reproduce the behavior:

  1. Set up Nomad 1.3.x (it can run in dev mode) and Ceph 17.2
  2. In Ceph, create a CephFS called nomadfs and an admin user (see the sketch after this list)
  3. Deploy the CSI controller plugin job: nomad job run ceph-csi-plugin-controller.nomad
  4. Deploy the CSI node plugin job: nomad job run ceph-csi-plugin-nodes.nomad
  5. Register the volume defined in sample-fs-volume.hcl: nomad volume register sample-fs-volume.hcl
  6. Deploy mysql-fs.nomad, which tries to use the volume registered in the previous step: nomad job run mysql-fs.nomad
  7. Observe the error in the ceph-mysql-fs job allocation logs.
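
For step 2, a rough sketch of the Ceph-side commands (assuming a standard deployment where client.admin already exists; its key is what ends up in the volume secrets):

ceph fs volume create nomadfs     # create the CephFS backing the volume
ceph auth get-key client.admin    # key used as adminKey/userKey in sample-fs-volume.hcl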

ceph-csi-plugin-controller.nomad:

job "ceph-fs-csi-plugin-controller" {
  datacenters = ["dc1"]
 
  group "controller" {
    network {
      port "metrics" {}
    }

    task "ceph-controller" {
      driver = "docker"

      template {
        data = jsonencode([{
          clusterID = "67b72852-d1b8-45ad-b1f8-edb8c150ff9b"
          monitors  = ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
        }])
        destination = "local/config.json"
        change_mode = "restart"
      }

      config {
        image       = "quay.io/cephcsi/cephcsi:v3.6.2"
        volumes      = [
          "./local/config.json:/etc/ceph-csi-config/config.json"
        ]

        mounts = [
          {
            type          = "tmpfs"
            target        = "/tmp/csi/keys"
            readonly      = false
            tmpfs_options = {
              size = 1000000 # size in bytes
            }
          }
        ]

        args = [
          "--type=cephfs",
          "--controllerserver=true",
          "--drivername=cephfs.csi.ceph.com",
          "--endpoint=unix://csi/csi.sock",
          "--nodeid=${node.unique.name}",
          "--instanceid=${node.unique.name}-controller",
          "--pidlimit=-1",
          "--logtostderr=true",
          "--v=5",
          "-stderrthreshold=0",
          "--metricsport=$${NOMAD_PORT_metrics}"
        ]
      }

      resources {
        cpu    = 500
        memory = 256
      }

      service {
        name     = "ceph-fs-csi-controller"
        port     = "metrics"
        tags     = [ "prometheus" ]
      }

      csi_plugin {
        id        = "ceph-fs-csi"
        type      = "controller"
        mount_dir = "/csi"
      }
    }
  }
}

ceph-csi-plugin-nodes.nomad:

job "ceph-fs-csi-plugin-nodes" {
  datacenters = ["dc1"]
  type        = "system"

  group "nodes" {
    network {
      port "metrics" {}
    }

    task "ceph-node" {
      driver = "docker"

      template {
        data = jsonencode([{
          clusterID = "67b72852-d1b8-45ad-b1f8-edb8c150ff9b"
          monitors  = ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
        }])
        destination = "local/config.json"
        change_mode = "restart"
      }

      config {
        mount {
          type          = "tmpfs"
          target        = "/tmp/csi/keys"
          readonly      = false
          tmpfs_options = {
            size = 1000000 # size in bytes
          }
        }

        mount {
          type     = "bind"
          source   = "/lib/modules/${attr.kernel.version}"
          target   = "/lib/modules/${attr.kernel.version}"
          readonly = true
        }
       
        image       = "quay.io/cephcsi/cephcsi:v3.6.2"
        privileged  = true
        volumes     = [
          "./local/config.json:/etc/ceph-csi-config/config.json"
        ]
        args = [
          "--type=cephfs",
          "--drivername=cephfs.csi.ceph.com",
          "--nodeserver=true",
          "--endpoint=unix://csi/csi.sock",
          "--nodeid=${node.unique.name}",
          "--instanceid=${node.unique.name}-nodes",
          "--pidlimit=-1",
          "--logtostderr=true",
          "--v=5",
          "--metricsport=$${NOMAD_PORT_metrics}"
        ]
      }

      resources {
        cpu    = 500
        memory = 256
      }

      service {
        name = "ceph-fs-csi-nodes"
        port = "metrics"
        tags = [ "prometheus" ]
      }

      csi_plugin {
        id        = "ceph-fs-csi"
        type      = "node"
        mount_dir = "/csi"
      }
    }
  }
}

sample-fs-volume.hcl:

id           = "ceph-mysql-fs"
name         = "ceph-mysql-fs"
type         = "csi"
plugin_id    = "ceph-fs-csi"
external_id  = "nomadfs"

capability {
  access_mode     = "multi-node-multi-writer"
  attachment_mode = "file-system"
}

secrets {
  adminID  = "admin"
  adminKey = "AQDKpPtiDr30NRAAsqtMLh0WHUqZ0L4f2S/ouA=="
  userID  = "admin"
  userKey = "AQDKpPtiDr30NRAAsqtMLh0WHUqZ0L4f2S/ouA=="
}

parameters {
  clusterID = "67b72852-d1b8-45ad-b1f8-edb8c150ff9b"
  fsName    = "nomadfs"
}

context {
  monitors  = "192.168.1.10,192.168.1.11,192.168.1.12"
  provisionVolume = "false"
  rootPath = "/"
}

mysql-fs.nomad:

variable "mysql_root_password" {
  description = "Password for MySQL root user"
  type = string
  default = "password"
}

job "mysql-server-fs" {
  datacenters = ["dc1"]
  type        = "service"

  group "mysql-server-fs" {
    count = 1
    volume "ceph-mysql-fs" {
      type      = "csi"
      attachment_mode = "file-system"
      access_mode     = "multi-node-multi-writer"
      read_only = false
      source    = "ceph-mysql-fs"
    }
    network {
      port "db" {
        static = 3306
      }
    }
    restart {
      attempts = 10
      interval = "5m"
      delay    = "25s"
      mode     = "delay"
    }
    task "mysql-server" {
      driver = "docker"
      volume_mount {
        volume      = "ceph-mysql-fs"
        destination = "/srv"
        read_only   = false
      }
      env {
        MYSQL_ROOT_PASSWORD = "${var.mysql_root_password}"
      }
      config {
        image = "hashicorp/mysql-portworx-demo:latest"
        args  = ["--datadir", "/srv/mysql"]
        ports = ["db"]
      }
      resources {
        cpu    = 500
        memory = 1024
      }
      service {
        provider = "nomad"
        name = "mysql-server"
        port = "db"
      }
    }
  }
}

Actual results

The Ceph-CSI node plugin fails to mount the CephFS volume.

Expected behavior

The Ceph-CSI node plugin should mount CephFS successfully, using the new mount.ceph syntax where required.

Logs

nomad alloc status events:

Recent Events:
Time                       Type           Description
2022-08-16T12:32:18+02:00  Setup Failure  failed to setup alloc: pre-run hook "csi_hook" failed: node plugin returned an internal error, check the plugin allocation logs for more information: rpc error: code = Internal desc = an error (exit status 32) occurred while running mount args: [-t ceph 192.168.1.10,192.168.1.11,192.168.1.12:/ /local/csi/staging/ceph-mysql-fs/rw-file-system-multi-node-multi-writer -o name=admin,secretfile=/tmp/csi/keys/keyfile-2337295656,_netdev] stderr: unable to get monitor info from DNS SRV with service name: ceph-mon
2022-08-16T10:31:57.974+0000 7fdda8b9df40 -1 failed for service _ceph-mon._tcp
mount error: no mds server is up or the cluster is laggy

I suspect the "unable to get monitor info from DNS SRV" error happens because the mount.ceph helper in 17.x no longer recognizes monitor IPs passed this way and falls back to looking them up via DNS SRV records.
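
(For context, that fallback queries the _ceph-mon._tcp SRV record, so it would only work in an environment that publishes records roughly like the one below, which mine does not; example.com is a placeholder.)

_ceph-mon._tcp.example.com. 60 IN SRV 10 60 6789 mon1.example.com.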

Most upvoted comments

This seems to be an active issue, where the only workaround is downgrading cephcsi. There is also an open PR for it. Should it be reopened?

As a data point, it affects me too:

an error (exit status 234) occurred while running mount args: [-t ceph v2:192.168.61.11:3300/0,v1:192.168.61.11:6789/0,v2:192.168.61.12:3300/0,v1:192.168.61.12:6789/0:/volumes/csi/csi-vol-192d6009-bfe4-4410-8c7a-c65ca73d5d1e/2917ff3d-8acd-48b5-aab8-0f1f876b4266 /var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/fbbcc818e6f7c6ec9395d4bb29b400ede4fc12b6d18c93c138f2c96772be1d83/globalmount -o name=kubernetes07,secretfile=/tmp/csi/keys/keyfile-634581935,mds_namespace=kubernetes07,_netdev] stderr: unable to get monitor info from DNS SRV with service name: ceph-mon

Changing the monitor entries in the CSI ConfigMap fixes it, from:

"monitors": [
  "v2:192.168.61.11:3300/0,v1:192.168.61.11:6789/0",
  "v2:192.168.61.12:3300/0,v1:192.168.61.12:6789/0"
]

to:

"monitors": [
  "192.168.61.11",
  "192.168.61.12"
]
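
For reference, a sketch of the full csi-config entry after that change (the clusterID is a placeholder; only the monitors format matters here):

[
  {
    "clusterID": "<cluster fsid>",
    "monitors": [
      "192.168.61.11",
      "192.168.61.12"
    ]
  }
]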

Someone has to make the first report 😉 Maybe this use case is just not very popular?

Is your ceph cluster healthy?

Yes.

Have you retried running the mount command manually from the cephfsplugin container?

Yes, I get the same error:

$ docker exec -ti ceac87b3d9fe bash
[root@ceac87b3d9fe /]# echo 'AQDKpPtiDr30NRAAsqtMLh0WHUqZ0L4f2S/ouA==' > /tmp/csi/keys/admin.key
[root@ceac87b3d9fe /]# mount -t ceph 192.168.1.10,192.168.1.11,192.168.1.12:/ /mnt -o 'name=admin,secretfile=/tmp/csi/keys/admin.key,_netdev,fs=nomadfs'
unable to get monitor info from DNS SRV with service name: ceph-mon
2022-08-17T08:12:46.934+0000 7fceff0def40 -1 failed for service _ceph-mon._tcp
[root@ceac87b3d9fe /]# 

-t ceph 192.168.1.10,192.168.1.11,192.168.1.12:/ Have you tried specifying the monitor port?

Yes, same error:

[root@ceac87b3d9fe /]# mount -t ceph 192.168.1.10:6789,192.168.1.11:6789,192.168.1.12:6789:/ /mnt -o 'name=admin,secretfile=/tmp/csi/keys/admin.key,_netdev,fs=nomadfs'
unable to get monitor info from DNS SRV with service name: ceph-mon
2022-08-17T08:12:46.934+0000 7fceff0def40 -1 failed for service _ceph-mon._tcp
[root@ceac87b3d9fe /]# 

When I try to mount using the new Quincy mount.ceph syntax, it works:

[root@ceac87b3d9fe /]# mount -t ceph admin@67b72852-d1b8-45ad-b1f8-edb8c150ff9b.nomadfs=/ /mnt -o 'secretfile=/tmp/csi/keys/admin.key,_netdev,mon_addr=192.168.1.10/192.168.1.11/192.168.1.12'
[root@ceac87b3d9fe /]# ls -la /mnt
total 4
drwxr-xr-x 3 root root    1 Aug 12 08:41 .
drwxr-xr-x 1 root root 4096 Aug 17 08:07 ..
drwxr-xr-x 3 root root    2 Aug 12 08:41 volumes
[root@ceac87b3d9fe /]#

EDIT:

I just tested with quay.io/cephcsi/cephcsi:v3.5.1 (which is based on Pacific), and the mount commands that failed above do work there.
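
So, until this is addressed in Ceph-CSI itself, a minimal sketch of the workaround is pinning both plugin jobs to that Pacific-based image; everything else in the config blocks above stays the same:

image = "quay.io/cephcsi/cephcsi:v3.5.1"   # instead of v3.6.2, in both ceph-csi-plugin-controller.nomad and ceph-csi-plugin-nodes.nomad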