longhorn: [BUG] V2 data engine displays incorrect usage in UI
Describe the bug (🐛 if you encounter this issue)
When I create 40 Block-mode V2 PVCs (10G each), ‘Storage Schedulable (Block)’ displays an incorrect used value:
master1 [~]# kubectl get statefulsets.apps
NAME                READY   AGE
nginx-spdk-block    20/20   113m
nginx-spdk-block2   8/20    47m
To Reproduce
- Set up Longhorn with the V2 data engine
- Add block-type storage devices
- Create the StatefulSets (the StorageClass they use is sketched below)
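For reference, the StorageClass referenced by the StatefulSets below looks roughly like this (reconstructed from the upstream v2 example; the exact parameter names, in particular backendStoreDriver, are my assumption for 1.5.x):
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-v2-data-engine
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "2"
  staleReplicaTimeout: "2880"
  fsType: "ext4"
  backendStoreDriver: "v2"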
Expected behavior
The “used” value in ‘Storage Schedulable (Block)’ is displayed correctly in the UI.
Over-provisioning has no impact on the normal use of Pods.
Support bundle for troubleshooting
Environment
- Longhorn version: 1.5.1
- Installation method (e.g. Rancher Catalog App/Helm/Kubectl): Rancher Catalog App
- Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: K3s
- Number of management node in the cluster: 2
- Number of worker node in the cluster: 0
- Node config
- OS type and version: k3os
- Kernel version: 5.15.0-60-generic
- CPU per node: 16
- Memory per node: 20
- Disk type(e.g. SSD/NVMe/HDD): HDD
- Network bandwidth between the nodes:
- Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): KVM
- Number of Longhorn volumes in the cluster: 42
- Impacted Longhorn resources:
- Volume names:
Additional context
I have two nodes, and each node has 300G of backend storage (one disk) available to the V2 engine. The replica count of the V2 StorageClass is 2. I created two StatefulSets, each with 20 replicas, and both define Block-mode PVCs of 10G in volumeClaimTemplates. Intuitively, the used space should exceed 420G, but the actual used value is only 57.2G.
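A rough calculation of what I expected (my own arithmetic, based on the numbers above): 42 volumes × 10G ≈ 420G of nominal capacity, and with 2 replicas per volume roughly 840G of backing storage would be consumed if the volumes were fully allocated, against 600G of total backend storage across the two nodes.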
The number of running Pods has remained at just over 20 the whole time:
master1 [~]# kubectl get statefulsets.apps
NAME                READY   AGE
nginx-spdk-block    10/20   119m
nginx-spdk-block2   10/20   53m
After waiting for a while, none of the V2 volumes are healthy (see the inspection commands after the manifest below).
The StatefulSet manifest is as follows:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx-spdk-block
spec:
  selector:
    matchLabels:
      app: nginx-spdk-block
  podManagementPolicy: Parallel
  replicas: 20
  volumeClaimTemplates:
  - metadata:
      name: html
    spec:
      volumeMode: Block
      accessModes:
      - ReadWriteOnce
      storageClassName: longhorn-v2-data-engine
      resources:
        requests:
          storage: 10Gi
  template:
    metadata:
      labels:
        app: nginx-spdk-block
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: nginx
        image: fiotest:0.1.0 # Image contains fio
        imagePullPolicy: Always
        securityContext:
          privileged: true
        volumeDevices:
        - devicePath: "/dev/sdd"
          name: html
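For the unhealthy volumes mentioned above, these are the kinds of commands I use to inspect the Longhorn volume and replica CRs directly (assuming the default longhorn-system namespace):
master1 [~]# kubectl -n longhorn-system get volumes.longhorn.io
master1 [~]# kubectl -n longhorn-system get replicas.longhorn.io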
This ticket was filed in error, and I am a beginner. I apologize for any inconvenience caused. I misunderstood this as an “excessive usage” issue because I couldn’t create multiple V2 PVCs. Also, the UI displays the actual storage usage after thin provisioning.
To summarize the issues I have encountered while using Longhorn 1.5.1:
- nvme list segmentation fault: the current solution is to recompile instance-manager and compile and install version 1.16 of nvme-cli during image creation (the instance-manager pod needs rw permissions on /sys). Refer: https://github.com/longhorn/longhorn/issues/6795#issuecomment-1736624780 & https://github.com/longhorn/go-spdk-helper/commit/579110a4706cec040d05c8ff6f9433641f95c663
- json cannot unmarshal MaximumLBA type: the current solution is to recompile instance-manager and replace go-spdk-helper in the go.mod file. Refer: https://github.com/longhorn/go-spdk-helper/commit/e5fe21b6067f1adaad483b72409fe05b849c7503
- Unable to create multiple V2 PVCs: the solution is to set a larger hugepage memory for both the host and the Longhorn instance-manager pod so that more V2 PVCs can be created (host-side sketch below). Refer: https://github.com/longhorn/longhorn/discussions/6493#discussioncomment-6819522
I want to express my sincere thanks to @DamiaSan and @derekbit for their assistance.
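The host-side hugepage change from the last item above looks roughly like this (a sketch; 2048 × 2 MiB pages is only an example value, and the hugepage limit of the Longhorn instance-manager pod also has to be raised as described in the linked discussion):
master1 [~]# echo "vm.nr_hugepages=2048" > /etc/sysctl.d/99-longhorn-hugepages.conf
master1 [~]# sysctl --system
master1 [~]# grep HugePages /proc/meminfo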
Hello @DamiaSan FYR https://github.com/longhorn/longhorn/discussions/6493#discussioncomment-6819522