longhorn: [BUG] Failing to mount encrypted volumes v1.5.2

Describe the bug (🐛 if you encounter this issue)

Encrypted volumes work perfectly in v1.5.1 but no longer do in v1.5.2.

Error messages from the pod events:

Normal   Scheduled           24m                 default-scheduler        Successfully assigned flask/redis-6db4847b89-7xrq8 to frame2.rfed.me
  Warning  FailedAttachVolume  34s (x19 over 23m)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-584281a4-60cd-403b-a522-d12a0a15eab2" : rpc error: code = Internal desc = volume pvc-584281a4-60cd-403b-a522-d12a0a15eab2 failed to attach to node frame2.rfed.me with attachmentID csi-894ac457228fd8e53fbe692158f44ecdb522e03727fce491e8715a1c093ee7cf: Waiting for volume share to be available
  Warning  FailedMount         8s (x11 over 22m)   kubelet                  Unable to attach or mount volumes: unmounted volumes=[redis-data], unattached volumes=[redis-data kube-api-access-sf9sd]: timed out waiting for the condition

To Reproduce

  1. Deploy Longhorn v1.5.2.
  2. Add the storage class: https://github.com/clemenko/k8s_yaml/blob/master/longhorn_encryption.yml
  3. Deploy an app that uses it: https://github.com/clemenko/fleet/blob/main/flask/flask.yaml (a command sketch follows)
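
A minimal sketch of the reproduction, assuming the Longhorn Helm repo is already added and using the raw manifest URLs for the two files linked above:

    # install/upgrade Longhorn v1.5.2 via Helm
    helm upgrade --install longhorn longhorn/longhorn \
      --namespace longhorn-system --create-namespace --version 1.5.2
    # storage class used for encrypted volumes (from the linked repo)
    kubectl apply -f https://raw.githubusercontent.com/clemenko/k8s_yaml/master/longhorn_encryption.yml
    # sample app that consumes the encrypted volume
    kubectl apply -f https://raw.githubusercontent.com/clemenko/fleet/main/flask/flask.yaml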

Expected behavior

The volume mounts correctly.

Support bundle for troubleshooting

Attaching soon.

Environment

  • Longhorn version: v1.5.2
  • Installation method (e.g. Rancher Catalog App/Helm/Kubectl): helm
  • Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: rke2
    • Number of management node in the cluster: 1
    • Number of worker node in the cluster: 2
  • Node config
    • OS type and version: Rocky 9
    • Kernel version:
    • CPU per node: 4
    • Memory per node: 8 GB
    • Disk type(e.g. SSD/NVMe/HDD): SSD
    • Network bandwidth between the nodes: 1 Gbps
  • Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): Digital Ocean
  • Number of Longhorn volumes in the cluster: 1
  • Impacted Longhorn resources:
    • Volume names:

Additional context

This works in v1.5.1 and it broke in v1.5.2.

About this issue

  • Original URL
  • State: closed
  • Created 8 months ago
  • Reactions: 1
  • Comments: 17 (9 by maintainers)

Most upvoted comments

that worked!

Can we directly query any existing secret in any non-longhorn-system namespace as a fix, instead of rolling back the implementation? Also, we need to add a note about this behavior change.

Investigating this part. This is a better solution.

BTW, as a workaround, where does the CSI node driver get the secret after the volume is provisioned? Can users make a change there as a workaround? I believe that info should be saved in a resource like the PV.

The fields in the PersistentVolume resource are immutable. One possible workaround for an existing volume is replacing (deleting and recreating) the PV, as sketched in the commands after the steps below:

  1. Scale down the workload
  2. Create a secret in longhorn-system namespace
  3. Execute kubectl get pv <pv name> -o yaml > pv.yaml
  4. Update the spec.csi.nodePublishSecretRef.namespace and spec.csi.nodeStageSecretRef.namespace to longhorn-system in pv.yaml
  5. Execute kubectl replace --cascade=false --force -f pv.yaml
  6. In another terminal, kubectl edit pv <pv name>, then remove the finalizer. The replacement should succeed.
  7. Scale up the workload
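
A command-level sketch of those steps; <app namespace>, <workload>, <secret name>, and <pv name> are placeholders for the affected application's namespace, workload, per-PVC secret, and PV:

    # 1. Scale down the workload that uses the volume
    kubectl -n <app namespace> scale deployment <workload> --replicas=0
    # 2. Re-create the per-PVC secret in the longhorn-system namespace
    kubectl -n <app namespace> get secret <secret name> -o yaml > secret.yaml
    #    (edit secret.yaml: set metadata.namespace to longhorn-system,
    #     drop uid/resourceVersion, then apply)
    kubectl apply -f secret.yaml
    # 3. Dump the PV
    kubectl get pv <pv name> -o yaml > pv.yaml
    # 4. Edit pv.yaml: set spec.csi.nodePublishSecretRef.namespace and
    #    spec.csi.nodeStageSecretRef.namespace to longhorn-system
    # 5. Replace the PV (this blocks until the finalizer is removed in step 6)
    kubectl replace --cascade=false --force -f pv.yaml
    # 6. In another terminal, remove the finalizer so the replacement can finish
    kubectl edit pv <pv name>
    # 7. Scale the workload back up
    kubectl -n <app namespace> scale deployment <workload> --replicas=1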

Having the secret per volume is vital for a multi-tenant situation.

2023-11-05T14:59:54.108520707Z time="2023-11-05T14:59:54Z" level=error msg="Failed to sync Longhorn share manager" func=controller.handleReconcileErrorLogging file="utils.go:72" ShareManager=longhorn-system/pvc-d4f17cad-5773-431f-b9f2-ecba2a7b1a46 controller=longhorn-share-manager error="failed to sync longhorn-system/pvc-d4f17cad-5773-431f-b9f2-ecba2a7b1a46: failed to create pod for share manager: secret \"redis\" not found" node=rke2

The issue is due to https://github.com/longhorn/longhorn/issues/6954. We only sync the secrets in the longhorn-system namespace to avoid high memory consumption from secrets, configmaps, and so on in other namespaces. However, the share-manager pod will get the secret from pv.Spec.CSI.NodePublishSecretRef (https://github.com/longhorn/longhorn-manager/blob/master/controller/share_manager_controller.go#L756)… We didn’t notice this.

The workaround for a new volume is to create the secret in longhorn-system and set the xxx-secret-namespace parameters in the StorageClass to longhorn-system rather than ${pvc.namespace}, as used in the current parameters below:

  parameters:
    csi.storage.k8s.io/node-publish-secret-name: ${pvc.name}
    csi.storage.k8s.io/node-publish-secret-namespace: ${pvc.namespace}
    csi.storage.k8s.io/node-stage-secret-name: ${pvc.name}
    csi.storage.k8s.io/node-stage-secret-namespace: ${pvc.namespace}
    csi.storage.k8s.io/provisioner-secret-name: ${pvc.name}
    csi.storage.k8s.io/provisioner-secret-namespace: ${pvc.namespace}
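
For comparison, a sketch of the adjusted parameters for new volumes, assuming each per-PVC secret (still named ${pvc.name}) is created in longhorn-system beforehand:

  parameters:
    csi.storage.k8s.io/node-publish-secret-name: ${pvc.name}
    csi.storage.k8s.io/node-publish-secret-namespace: longhorn-system
    csi.storage.k8s.io/node-stage-secret-name: ${pvc.name}
    csi.storage.k8s.io/node-stage-secret-namespace: longhorn-system
    csi.storage.k8s.io/provisioner-secret-name: ${pvc.name}
    csi.storage.k8s.io/provisioner-secret-namespace: longhorn-system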

However, for existing encrypted RWX volumes, there seems to be no workaround. We need to roll back the change for the secret. cc @innobead

Having the secret per volume is vital for a multi-tenant situation.

This is fair.

Then, instead of just rolling back the new implementation, we should check whether we can get the secret directly from the API server rather than relying on caches in this case, and whether that causes any potential performance issues/concerns if it matters.