longhorn: [BUG] longhorn RWX doesn't work on freshly created volume

Describe the bug

I made a new Longhorn RWX volume. The pods that reference it fail to start.

To Reproduce

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: rwx
  name: rwx
  namespace: wordpress
spec:
  replicas: 2
  selector:
    matchLabels:
      app: rwx
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: rwx
    spec:
      containers:
      - image: nginx:latest
        name: nginx
        resources: {}
        volumeMounts:
        - mountPath: /data
          name: data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: rwx
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rwx
  namespace: wordpress
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: longhorn-rwx
---
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-rwx
parameters:
  fromBackup: ""
  migratable: "true"
  numberOfReplicas: "3"
  recurringJobSelector: '[{"name":"backup","task":"backup","cron":"5 0 * * *","retain":14}]'
  staleReplicaTimeout: "2880"
provisioner: driver.longhorn.io
reclaimPolicy: Delete
volumeBindingMode: Immediate

Expected behavior

Both pods start and mount the RWX volume at /data.

Support bundle for troubleshooting

Too big to attach on GitHub, but as of this edit it is available here: support bundle

Environment

Longhorn version: 1.5.1
Installation method (e.g. Rancher Catalog App/Helm/Kubectl): Helm through an ArgoCD Application
Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: k3s 1.27.3
- Number of management node in the cluster: 5
- Number of worker node in the cluster: 0
Node config
- OS type and version: Almalinux 9.2
- Kernel version: 5.14.0-284.25.1.el9_2.x86_64
- CPU per node: 16
- Memory per node: 64 GB
- Disk type(e.g. SSD/NVMe/HDD): NVMe OS, SSD for Longhorn Data
- Network bandwidth between the nodes: 2.5 Gbps
Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): Baremetal
Number of Longhorn volumes in the cluster: 45

Additional context

The ArgoCD application that installed longhorn:

---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "1"
  finalizers:
    - resources-finalizer.argocd.argoproj.io
  name: longhorn
  namespace: argocd
spec:
  destination:
    namespace: longhorn-system
    server: 'https://kubernetes.default.svc'
  ignoreDifferences:
    - group: apps
      kind: Deployment
      name: longhorn-admission-webhook
    - group: apps
      kind: Deployment
      name: longhorn-conversion-webhook
    - group: apps
      kind: Deployment
      name: longhorn-recovery-backend
  project: default
  source:
    chart: longhorn
    helm:
      releaseName: longhorn
      values: |
        csi:
          kubeletRootDir: "/var/lib/kubelet"
        defaultSettings:
          backupTarget: "s3://longhorn@us-east-1/"
          backupTargetCredentialSecret: "longhorn-minio-creds"
          defaultDataLocality: "best-effort"
          replicaAutoBalance: "best-effort"
        longhornAdmissionWebhook:
          replicas: 0
        longhornConversionWebhook:
          replicas: 0
        longhornManager:
          log:
            format: json
        longhornRecoveryBackend:
          replicas: 0
        persistence:
          defaultDataLocality: "best-effort"
          defaultReplicaAutoBalance: "best-effort"
          migratable: true
          recurringJobSelector:
            enable: true
            jobList: '[{"name":"backup","task":"backup","cron":"5 0 * * *","retain": 14}]'
    repoURL: 'https://charts.longhorn.io'
    targetRevision: "*"
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

About this issue

State: closed
Created 10 months ago
Comments: 15 (7 by maintainers)

Commits related to this issue

fix: set large volumes to not be migratable
Set large Longhorn volumes to not be migratable in an attempt to fix RWX volume mount issues. Relevant issue upstream: longhorn/longhorn#6595
Committed to d3adb5/app-of-apps by d3adb5 4 months ago

Most upvoted comments

You are probably referring to the examples in this folder: https://github.com/longhorn/longhorn/tree/master/examples/rwx
From that folder you only need rwx-nginx-deployment.yaml. We will move storageclass-migratable.yaml out of the folder, since having it there is indeed quite confusing.

ChanYiLin on Sep 1, 2023
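
As a rough sketch of what this suggestion could mean for the StorageClass in the report: the version below is the same longhorn-rwx class with only the migratable parameter dropped. It assumes that migratable: "true" is what keeps the RWX mount from working, which the comment above and the related commit suggest but the thread does not confirm outright. Every other value is copied unchanged from the report and has not been independently validated.

```yaml
# Sketch only: the reporter's StorageClass with the `migratable` parameter removed.
# Assumption: `migratable: "true"` is what conflicts with the RWX mount.
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-rwx
parameters:
  fromBackup: ""
  numberOfReplicas: "3"
  # Copied as-is from the report.
  recurringJobSelector: '[{"name":"backup","task":"backup","cron":"5 0 * * *","retain":14}]'
  staleReplicaTimeout: "2880"
provisioner: driver.longhorn.io
reclaimPolicy: Delete
volumeBindingMode: Immediate
```

Kubernetes does not allow the parameters of an existing StorageClass to be edited, so trying this would mean deleting and recreating the class and then provisioning a new PVC. Volumes already created from the old class keep their original settings.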

kubectl -n longhorn-system get supportbundles

NAME                                  STATE              ISSUE                                              DESCRIPTION                                                     AGE
support-bundle-2023-08-27t07-50-13z   ReadyForDownload   https://github.com/longhorn/longhorn/issues/6595   Support bundle for my Longhorn install for Github issue #6595   20h

As noted above, at the time of that comment I could not provide a bundle: the generation window sat at 20 percent and I could not do anything with it, and the status was not ReadyForDownload. So at the time this was not a request I could comply with.

Checking again now, though, it seems it finally produced one: a nearly 1 GB file, from the looks of it. I am guessing it just took a really long time.

So… GitHub only allows files up to 25 MB and this is 991 MB, so I had to upload it to an alternate location. Uploaded here: https://nextcloud.thecrimsontint.com/s/pnwYxTCxke9WP9b

jstewart612 on Aug 28, 2023

for support bundle issue cc @c3y1huang

derekbit on Aug 27, 2023