harvester: [BUG] VM with unschedulable disks doesn't show a clear warning message

Describe the bug A VM with unschedulable disks doesn't show a clear warning message.

To Reproduce Steps to reproduce the behavior:

  1. Create a VM whose disks exceed the available storage capacity, e.g. 100T
  2. The VM fails to start, but no error message is given to the user
  3. The only available information in the events is:
| Reason | Resource | Message | Date |
| --- | --- | --- | --- |
| FailedMount | Pod virt-launcher-test-nzg65 | Unable to attach or mount volumes: unmounted volumes=[disk-0], unattached volumes=[container-disks sockets disk-0 cloudinitdisk-ndata public ephemeral-disks hotplug-disks libvirt-runtime cloudinitdisk-udata private]: timed out waiting for the condition | 1.3 mins ago |
| FailedAttachVolume | Pod virt-launcher-test-nzg65 | AttachVolume.Attach failed for volume "pvc-db775cbb-a5a2-4479-83f6-d7d9af85bfb7" : rpc error: code = Aborted desc = volume pvc-db775cbb-a5a2-4479-83f6-d7d9af85bfb7 is not ready for workloads | 1.3 mins ago |
| FailedMount | Pod virt-launcher-test-nzg65 | Unable to attach or mount volumes: unmounted volumes=[disk-0], unattached volumes=[ephemeral-disks private libvirt-runtime cloudinitdisk-udata public container-disks hotplug-disks cloudinitdisk-ndata disk-0 sockets]: timed out waiting for the condition | |

Expected behavior The VM should show a clear warning to the user, such as insufficient storage.

Environment:

  • Harvester ISO version: v1.0.0
  • Underlying Infrastructure (e.g. Baremetal with Dell PowerEdge R630): any

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 24 (18 by maintainers)

Most upvoted comments

Another related issue: https://github.com/harvester/harvester/issues/1346

It looks like we need to add some kind of enhancement to reflect the real status of the PVC/PV and to control the usage from the UI. @johnliu55tw @WuJun2016 please also take a look at 1#695
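For background on why this needs an enhancement: a Pending/Bound PVC phase alone doesn't carry the scheduling failure, which only surfaces in events and in the Longhorn volume status. A minimal client-go sketch, assuming a reachable kubeconfig; the namespace and claim name (default/disk-0) are illustrative:

```go
package main

import (
	"context"
	"fmt"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// The PVC phase alone can look healthy even when Longhorn cannot
	// schedule replicas for the backing volume.
	pvc, err := client.CoreV1().PersistentVolumeClaims("default").
		Get(context.TODO(), "disk-0", metav1.GetOptions{})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("PVC phase:", pvc.Status.Phase)

	// The actionable detail (FailedAttachVolume, "not ready for
	// workloads", ...) lives in the events, which is what a UI
	// enhancement would need to aggregate and display.
	events, err := client.CoreV1().Events("default").List(context.TODO(),
		metav1.ListOptions{FieldSelector: "involvedObject.name=" + pvc.Name})
	if err != nil {
		log.Fatal(err)
	}
	for _, e := range events.Items {
		fmt.Printf("%s: %s\n", e.Reason, e.Message)
	}
}
```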

On the Longhorn side we should prevent the volume creation if there is no node capable of hosting it at the time of the API volume-creation call. For this case we could return the Out of Range error during the CreateVolume CSI call. This evaluation should be done by the backend API creation call.

ref: https://github.com/container-storage-interface/spec/blob/master/spec.md#createvolume-errors
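A minimal sketch of what that could look like in a Go CSI controller; the controllerServer type and the maxSchedulableBytes helper are hypothetical stand-ins, not Longhorn's actual code:

```go
package driver

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// controllerServer stands in for the CSI controller service.
type controllerServer struct{}

// maxSchedulableBytes is a hypothetical helper that would ask the backend
// for the largest volume any node can currently host (capacity minus
// reserved and already-committed space, times the overcommit factor).
func maxSchedulableBytes(ctx context.Context) (int64, error) {
	// ... query node/disk capacity from the backend API ...
	return 0, nil
}

func (cs *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolumeRequest) (*csi.CreateVolumeResponse, error) {
	requested := req.GetCapacityRange().GetRequiredBytes()

	max, err := maxSchedulableBytes(ctx)
	if err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	if requested > max {
		// Per the CSI spec, OUT_OF_RANGE signals an unsupported
		// capacity_range, so the caller knows retrying with the
		// same size is pointless.
		return nil, status.Errorf(codes.OutOfRange,
			"requested %d bytes exceeds maximum schedulable size %d", requested, max)
	}

	// ... proceed with normal volume creation ...
	return &csi.CreateVolumeResponse{}, nil
}
```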

Verified fixed on master-b0d883ce-head (11/22). Closing this issue.

Result

Case 1 (PASS)

  1. When we create a volume exceeding the max acceptable size, creation is blocked with the prompt message Exceed maximum size 999999999 Gi!

Case 2 (PASS)

  1. When we create a volume that does not exceed the max acceptable size but is larger than the total disk capacity (plus overcommit),
  2. We can create the volume, but the status of the volume is NotReady
  3. The error message should be insufficient storage
  4. The link at the top of the page will be visible
  5. Click the link to open a new tab with the embedded Longhorn UI

Case 3 single node (PASS)

When we create a VM that has an OS image volume with replica scheduling failed and another volume with insufficient disk:

  1. The statuses of the volumes should be Degraded and Not ready
  2. The error messages of the volumes should be replica scheduling failed and insufficient disk
  3. The link in the volume section will be visible

Case 4 multiple nodes (PASS)

Same result as the single node case, but in a multi-node cluster.

Test Information

  • Test Environment: a 3-node Harvester cluster on bare-metal machines and a 1-node local KVM machine
  • Harvester version: master-b0d883ce-head (11/22)

Verify Steps

Case 1

  1. Create a volume with a 9999999999 Gi storage size in both the harvester-longhorn and longhorn storage classes
  2. The volume should not be created successfully.
  3. The error message should be Exceed maximum size 999999999 Gi!

Case 2

  1. Create a volume with a 99999999 Gi storage size in both the harvester-longhorn and longhorn storage classes (see the sketch after this list)
  2. Go back to the volume list
  3. The status of the volume should be NotReady
  4. The error message should be insufficient storage
  5. Go to the preferences page and enable DEV mode
  6. Go to the volume detail page
  7. The link at the top of the page should be visible
  8. Click the link to open a new tab with the embedded Longhorn UI
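For reference, a rough programmatic equivalent of step 1, written against client-go; the namespace, claim name, and access mode are illustrative, and only the harvester-longhorn class is shown. The backing Longhorn volume should then settle in the NotReady / insufficient storage state described above.

```go
package main

import (
	"context"
	"log"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	sc := "harvester-longhorn"
	pvc := &corev1.PersistentVolumeClaim{
		ObjectMeta: metav1.ObjectMeta{Name: "oversized-test", Namespace: "default"},
		Spec: corev1.PersistentVolumeClaimSpec{
			AccessModes:      []corev1.PersistentVolumeAccessMode{corev1.ReadWriteOnce},
			StorageClassName: &sc,
			// Request far more than the cluster can host; note that
			// newer client-go versions rename this field to
			// VolumeResourceRequirements.
			Resources: corev1.ResourceRequirements{
				Requests: corev1.ResourceList{
					corev1.ResourceStorage: resource.MustParse("99999999Gi"),
				},
			},
		},
	}
	if _, err := client.CoreV1().PersistentVolumeClaims("default").
		Create(context.TODO(), pvc, metav1.CreateOptions{}); err != nil {
		log.Fatal(err)
	}
	log.Println("PVC created; the Longhorn volume should become NotReady")
}
```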

Case 3 (Single node)

  1. Go to create a VM
  2. Give the OS image volume 30 Gi and add another 1600 Gi volume of the harvester-longhorn storage class
  3. Go to vm detail or vm edit page
  4. Switch to volume tab
  5. The statuses of the volumes should be Degraded and Not ready
  6. The error messages of the volumes should be replica scheduling failed and insufficient disk
  7. The link in the volume section will be visible

Case 4 (multi nodes)

  1. Disable node scheduling on node 3 in the Longhorn UI

  2. Give the OS image volume 30 Gi and add another 1600 Gi volume of the harvester-longhorn storage class

  3. The OS image disk displays Degraded while the other disk displays Not ready with the corresponding error