longhorn: [BUG] RWX volume remains attached after workload deleted if it's upgraded from v1.4.2

Describe the bug

An RWX volume remains attached and healthy after its workload is deleted, if the volume was created in v1.4.2 and Longhorn was then upgraded to master-head or v1.5.x-head.

Directly creating and deleting a workload that uses an RWX volume in master-head or v1.5.x-head does not have this issue.

To Reproduce

Steps to reproduce the behavior:

  1. Install Longhorn v1.4.2
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.4.2/deploy/longhorn.yaml
  2. Create a StatefulSet using an RWX volume
# rwx_statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test-statefulset-rwx
  namespace: default
spec:
  selector:
    matchLabels:
      app: test-statefulset-rwx
  serviceName: test-statefulset-rwx
  replicas: 1
  template:
    metadata:
      labels:
        app: test-statefulset-rwx
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - image: busybox
          imagePullPolicy: IfNotPresent
          name: sleep
          args: ['/bin/sh', '-c', 'while true;do date;sleep 5; done']
          volumeMounts:
            - name: pod-data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: pod-data
      spec:
        accessModes: ['ReadWriteMany']
        storageClassName: 'longhorn'
        resources:
          requests:
            storage: 1Gi

# kubectl apply -f rwx_statefulset.yaml
  3. Upgrade Longhorn to master-head or v1.5.x-head, and upgrade the engine image of the RWX volume
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/master/deploy/longhorn.yaml
  4. Delete the StatefulSet
kubectl delete -f rwx_statefulset.yaml
  5. The volume remains attached and healthy after the workload is deleted
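
One way to confirm the stuck state from the command line (the actual volume name is generated from the PVC and differs per cluster):

# The Longhorn volume's STATE stays "attached" and ROBUSTNESS stays "healthy",
# even though no workload pod is using the volume anymore
kubectl -n longhorn-system get volumes.longhorn.io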

Expected behavior

The RWX volume should become detached after the workload is deleted.

Log or Support bundle

supportbundle_7c27605c-22df-493e-9e2a-b9135c68e20b_2023-06-16T02-29-21Z.zip

Environment

  • Longhorn version:
  • Installation method (e.g. Rancher Catalog App/Helm/Kubectl):
  • Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version:
    • Number of management nodes in the cluster:
    • Number of worker nodes in the cluster:
  • Node config
    • OS type and version:
    • CPU per node:
    • Memory per node:
    • Disk type (e.g. SSD/NVMe):
    • Network bandwidth between the nodes:
  • Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal):
  • Number of Longhorn volumes in the cluster:

Most upvoted comments

Thanks @innobead and @shuo-wu for the great feedback!

I agree with points 1 and 2 that @shuo-wu mentioned above (which, as I understand it, are also the points that @innobead proposed. Please correct me if I understood it wrong, @innobead).

For point 2 that @shuo-wu mentioned:

To be honest, I prefer Derek’s suggestion: Just directly creating (multiple) CSI-attacher tickets for volumes used by workload pods.

Let me evaluate more to see which one is the better option: using workload pod state vs. using Kubernetes VolumeAttachment state.
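
For context, both signals can be inspected with standard kubectl commands (an illustrative check, not the controller's actual logic):

# Workload pod state: is any pod still consuming the PVC?
kubectl get pods -n default -o wide
# Kubernetes VolumeAttachment state: what does the CSI layer still consider
# attached, and to which node?
kubectl get volumeattachments.storage.k8s.io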

  1. I agree with using the correct/valid AD ticket type as well and removing type longhorn-upgrade.
  2. To be honest, I prefer Derek’s suggestion: just directly creating (multiple) CSI-attacher tickets for volumes used by workload pods (a rough sketch of such a ticket follows this list). As for the race condition you mentioned, I think the main blocker is the existing CSI plugin, which may handle the detachment calls incorrectly (in the old way). If we can remove the old CSI plugin before jumping into the upgrade path, will everything work fine? The new Longhorn CSI driver deployer will be deployed when the new longhorn-manager is up, hence we don’t need to worry about it.
  3. For attached volumes that have no workload pod but contain a non-empty spec node ID, I think we can add AttacherTypeLonghornAPI. But for the auto-attached volumes, it’s too complicated to generate correct AD tickets, hence we can ignore them.
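
For illustration, a rough sketch of what a migrated CSI-attacher ticket could look like on the v1.5.x VolumeAttachment CR (the volume name, ticket ID, and node name are hypothetical, and the field layout follows my reading of the longhorn.io/v1beta2 schema):

# volumeattachment_sketch.yaml (illustrative only)
apiVersion: longhorn.io/v1beta2
kind: VolumeAttachment
metadata:
  name: pvc-0d4b1234            # hypothetical volume name
  namespace: longhorn-system
spec:
  volume: pvc-0d4b1234
  attachmentTickets:
    csi-f3a9:                   # hypothetical ticket ID
      id: csi-f3a9
      type: csi-attacher        # a valid ticket type, replacing longhorn-upgrade
      nodeID: worker-node-1     # the workload pod's node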

You are right @derekbit, it already exists. The behavior is just a little bit different, but it is still the same issue. Thank you for the clarification.

Btw, this issue is a side effect of this fix: https://github.com/longhorn/longhorn-manager/pull/1993/files. We fixed the detaching issue but introduced a stuck-attaching issue as a side effect šŸ˜„

Clarify it a bit: it is not a side effect of the detaching fix. The hasActiveWorkload check had already been removed from the AD controller implementation before; I mistakenly introduced it into longhorn-manager earlier. It is not related to the detaching issue, so I removed it in the end.

Hence, both the RWX volume attaching and detaching issues exist in the AD controller design and implementation.

Just for information: the volume still remains attached and healthy after waiting for more than 3 hours.

Verified pass in

  • longhorn master (longhorn-manager ec130d)
  • longhorn v1.5.x (longhorn-manager c8092b)

After upgrading from v1.4.2 to master and from v1.4.2 to v1.5.x, the test steps passed: after deleting the workload, the RWX volume became detached as well.

Point 3 I mentioned was similar to what David suggested but with extra concerns about the auto-attached volumes.

@PhanLe1010 This seems to be a blocker for 1.5.0. Let’s make this the highest priority to tackle first. Thanks.

I see. This case can happen when there is no workload pod on the same node as the share manager pod. When the user upgrades to v1.5.x, we create an upgrade AD ticket for the share manager’s node to keep the volume attached there. When the user scales down the workload, we don’t clean up that upgrade AD ticket, because the ticket is on a different node than the workload pod’s node. As a result, no one cleans up the upgrade AD ticket and the volume stays attached there forever, as sketched below.
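
A rough sketch of the leaked state (all names hypothetical): the upgrade ticket is pinned to the share manager's node, so a cleanup that only matches tickets on the workload pod's node never removes it.

# leaked VolumeAttachment state after scaling the workload down (illustrative only)
apiVersion: longhorn.io/v1beta2
kind: VolumeAttachment
metadata:
  name: pvc-0d4b1234
  namespace: longhorn-system
spec:
  volume: pvc-0d4b1234
  attachmentTickets:
    longhorn-upgrade-1:         # hypothetical ticket ID created during the upgrade
      id: longhorn-upgrade-1
      type: longhorn-upgrade
      nodeID: worker-node-2     # the share manager's node; no workload pod runs here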

Still figuring out how to fix this issue.