longhorn: [QUESTION] After attaching volume in maintenance mode, the volume can no longer be attached or detached

Describe the bug After attaching the volume in maintenance mode, the volume can no longer be attached or detached. I cannot see the /dev/longhorn/ device for the volume.

To Reproduce Steps to reproduce the behavior:

  1. Downscale the deployment to 0 (the volume will be detached).
  2. Attach the volume in maintenance mode.
  3. Create a snapshot.
  4. Scale the deployment back to the desired replica count. (A command-line sketch of these steps follows the error output below.)
  5. Error from pod:
  Normal   Scheduled           3m4s                default-scheduler        Successfully assigned default/nextcloud-postgresql-0 to green
  Warning  FailedMount         61s                 kubelet                  Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[default-token-jcf52 data dshm]: timed out waiting for the condition
  Warning  FailedAttachVolume  56s (x9 over 3m4s)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-2f93a647-63e0-426c-8f85-ff7aaff14393" : rpc error: code = Aborted desc = The volume pvc-2f93a647-63e0-426c-8f85-ff7aaff14393 is not ready for workloads
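For reference, a rough command-line version of the reproduction above. The names are taken from the events in this report (the pod's -0 suffix suggests a StatefulSet even though the report says deployment), and steps 2-3 are Longhorn UI actions with no direct kubectl equivalent:

  # Hypothetical names based on the pod shown in the events above
  kubectl scale statefulset nextcloud-postgresql --replicas=0   # step 1: volume detaches
  # steps 2-3: in the Longhorn UI, attach the volume with the
  #            "Maintenance" option checked, create a snapshot, detach
  kubectl scale statefulset nextcloud-postgresql --replicas=1   # step 4: reattach
  kubectl describe pod nextcloud-postgresql-0                   # shows the FailedAttachVolume events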

Expected behavior I should be able to detach and reuse the volume.

Log longhorn-support-bundle_f915b513-bf53-4eec-bc4d-e00b5fa3a200_2021-03-13T21-01-07Z.zip

Environment:

  • Longhorn version: 1.1.0
  • Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: v1.19.5+k3s2
    • Number of management nodes in the cluster: 3
    • Number of worker nodes in the cluster: 1
  • Node config
    • OS type and version: Raspbian OS on Raspberry Pi 4 8GB
    • CPU per node:
    • Memory per node: 8GB
    • Disk type (e.g. SSD/NVMe):
    • Network bandwidth between the nodes: 1Gbps
  • Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal):
  • Number of Longhorn volumes in the cluster:

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 20 (9 by maintainers)

Most upvoted comments

This is a UI bug. We block the detach action here: https://sourcegraph.com/github.com/longhorn/longhorn-ui/-/blob/src/routes/volume/VolumeActions.js#L167

We should modify the function isRwxVolumeWithWorkload so that it only returns true when there is a workload pod that is actually using the volume right now, i.e., there exists a workload in selected.kubernetesStatus.workloadsStatus that has lastPodRefAt === ""
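A minimal sketch of that fix, assuming the function receives the selected volume object and that RWX volumes carry accessMode === 'rwx' (the actual code in VolumeActions.js may be shaped differently):

  // Sketch only: field names follow the comment above; the real
  // longhorn-ui implementation may differ.
  function isRwxVolumeWithWorkload(selected) {
    if (!selected || selected.accessMode !== 'rwx') {
      return false
    }
    const workloads = (selected.kubernetesStatus && selected.kubernetesStatus.workloadsStatus) || []
    // An empty lastPodRefAt means the workload pod is still using the
    // volume; a timestamp means the pod is already gone.
    return workloads.some(ws => ws.lastPodRefAt === '')
  }

With that change, the detach action is only blocked while some workload pod actually holds the volume, so a volume attached in maintenance mode can be detached again.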

@cocampbe maintenance mode is equivalent to spec.disableFrontend = true

@PhanLe1010 When you put a volume into maintenance mode, is that stored somewhere? I assumed I would see a reference to it in the volume spec, but I did not see anything. I am not sure how much work it would take, but it seems to make sense to record that a volume is in maintenance mode and use that as a flag.
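Following up on the spec.disableFrontend note above, one way to check whether the flag is set on a live cluster is to read it straight off the Volume custom resource. This assumes the default longhorn-system namespace; the volume name is the PV name shown in the events above:

  kubectl -n longhorn-system get volumes.longhorn.io \
    pvc-2f93a647-63e0-426c-8f85-ff7aaff14393 \
    -o jsonpath='{.spec.disableFrontend}'

If attaching in maintenance mode sets this to true, that field is effectively the stored flag being asked about.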