rook: Bug: Disk usage not reclaimed when files are deleted

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior:

Disk space is not reclaimed.

Expected behavior:

Disk space should be reclaimed.

How to reproduce it (minimal and precise):

  1. Check disk usage using ceph -s inside the tools pod:
sh-4.2# ceph -s
  cluster:
    id:     8c25fdd8-64e3-4d8e-8f27-0e625529d7af
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,c (age 14m)
    mgr: a(active, since 44m)
    osd: 3 osds: 3 up (since 44m), 3 in (since 44m)
 
  data:
    pools:   1 pools, 100 pgs
    objects: 322 objects, 1.1 GiB
    usage:   6.3 GiB used, 21 GiB / 27 GiB avail
    pgs:     100 active+clean
 
  io:
    client:   26 KiB/s rd, 2.7 KiB/s wr, 31 op/s rd, 0 op/s wr
  2. Exec into a container with the Rook PVC mounted and create some files:
$ kubectl exec -it mypod sh
/container $ dd if=/dev/urandom of=sample1.txt bs=64M count=16
/container $ dd if=/dev/urandom of=sample2.txt bs=64M count=16
/container $ ls -al
-rw-r--r--    1 nobody   nogroup  536870896 Sep 11 20:38 sample1.txt
-rw-r--r--    1 nobody   nogroup  536870896 Sep 11 20:41 sample2.txt
  3. Check that disk usage has grown, using ceph -s inside the tools pod:
sh-4.2# ceph -s
  cluster:
    id:     8c25fdd8-64e3-4d8e-8f27-0e625529d7af
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,c (age 36m)
    mgr: a(active, since 65m)
    osd: 3 osds: 3 up (since 65m), 3 in (since 65m)
 
  data:
    pools:   1 pools, 100 pgs
    objects: 471 objects, 1.7 GiB
    usage:   8.1 GiB used, 19 GiB / 27 GiB avail
    pgs:     100 active+clean
 
  io:
    client:   2.7 KiB/s wr, 0 op/s rd, 0 op/s wr
  4. Now delete the files created in the container:
$ kubectl exec -it mypod sh
/container $ rm -rf sample*.txt
/container $ ls -al
drwxr-xr-x    3 nobody   nogroup         48 Sep 11 20:42 .
drwxr-xr-x    1 root     root          4096 Sep 11 20:18 ..
  5. Re-check disk usage using ceph -s inside the tools pod:
sh-4.2# ceph -s
  cluster:
    id:     8c25fdd8-64e3-4d8e-8f27-0e625529d7af
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,c (age 36m)
    mgr: a(active, since 65m)
    osd: 3 osds: 3 up (since 65m), 3 in (since 65m)
 
  data:
    pools:   1 pools, 100 pgs
    objects: 471 objects, 1.7 GiB
    usage:   8.1 GiB used, 19 GiB / 27 GiB avail
    pgs:     100 active+clean
 
  io:
    client:   2.7 KiB/s wr, 0 op/s rd, 0 op/s wr

The disk usage did not fall.

The Ceph cluster contains 3 mons and 3 OSDs (10 GB BlueStore devices on SSD), with a replicapool pool configured as replicated: 3. The container's PVC size in pvc.yaml is 10Gi, as large as all of the usable Ceph capacity.
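
For reference, a PVC of that size would look roughly like the following (a minimal sketch; the claim name, access mode, and storage class name are assumptions, only the 10Gi request follows from the description above):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypod-pvc                      # hypothetical name; the report only refers to pvc.yaml
spec:
  accessModes:
    - ReadWriteOnce                    # assumed; typical for an RBD-backed block volume
  resources:
    requests:
      storage: 10Gi                    # matches the size stated in the report
  storageClassName: rook-ceph-block    # assumed storage class name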

Environment:

  • OS (e.g. from /etc/os-release): Ubuntu 18.04.1 LTS (Bionic Beaver)
  • Kernel (e.g. uname -a): 4.15.0-29
  • Cloud provider or hardware configuration: baremetal
  • Rook version (use rook version inside of a Rook Pod): v1.0.5
  • Storage backend version (e.g. for ceph do ceph -v): ceph version 14.2.1
  • Kubernetes version (use kubectl version): v1.15.1
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): on-prem
  • Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox): HEALTH_OK

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 34 (15 by maintainers)

Most upvoted comments

I have a fresh installation of Rook version 1.3.5 (tag release-1.3) and I had the same issue with mount options on RBD.

Rook version

[root@rook-ceph-operator-868bf9558-jzd4v /]# rook version
rook: v1.3.5

rook-ceph-operator-config; I have used both ROOK_CSI_CEPH_IMAGE v2.1.0 and the latest v2.1.2:

apiVersion: v1
data:
  CSI_ENABLE_SNAPSHOTTER: "true"
  CSI_FORCE_CEPHFS_KERNEL_CLIENT: "true"
  ROOK_CSI_ALLOW_UNSUPPORTED_VERSION: "false"
  ROOK_CSI_ATTACHER_IMAGE: quay.io/k8scsi/csi-attacher:v2.2.0
  ROOK_CSI_CEPH_IMAGE: quay.io/cephcsi/cephcsi:v2.1.0
  ROOK_CSI_ENABLE_CEPHFS: "true"
  ROOK_CSI_ENABLE_GRPC_METRICS: "true"
  ROOK_CSI_ENABLE_RBD: "true"
  ROOK_CSI_PROVISIONER_IMAGE: quay.io/k8scsi/csi-provisioner:v1.6.0
  ROOK_CSI_REGISTRAR_IMAGE: quay.io/k8scsi/csi-node-driver-registrar:v1.3.0
  ROOK_CSI_RESIZER_IMAGE: quay.io/k8scsi/csi-resizer:v0.5.0
  ROOK_CSI_SNAPSHOTTER_IMAGE: quay.io/k8scsi/csi-snapshotter:v1.2.2
  ROOK_OBC_WATCH_OPERATOR_NAMESPACE: "true"
kind: ConfigMap
metadata:
  creationTimestamp: "2020-07-24T07:26:43Z"
  name: rook-ceph-operator-config
  namespace: rook-ceph
  resourceVersion: "97681505"
  selfLink: /api/v1/namespaces/rook-ceph/configmaps/rook-ceph-operator-config
  uid: 05f05fa8-cd7f-11ea-8970-525400d9280f

My storage class

$ kubectl get storageclasses.storage.k8s.io rook-ceph-block -o yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2020-07-24T13:15:13Z"
  name: rook-ceph-block
  resourceVersion: "97682453"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/rook-ceph-block
  uid: b528eb01-cdaf-11ea-8970-525400d9280f
mountOptions:
- discard
parameters:
  clusterID: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  imageFeatures: layering
  imageFormat: "2"
  pool: replicapool
provisioner: rook-ceph.rbd.csi.ceph.com
reclaimPolicy: Delete
volumeBindingMode: Immediate

Pod and pvc

---
apiVersion: v1
kind: Pod
metadata:
  name: csirbd-demo-pod
spec:
  containers:
   - name: web-server
     image: nginx
     volumeMounts:
       - name: mypvc
         mountPath: /var/lib/www/html
  volumes:
   - name: mypvc
     persistentVolumeClaim:
       claimName: rbd-pvc
       readOnly: false
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-ceph-block

Inside pod

root@csirbd-demo-pod:/# mount | grep rbd
/dev/rbd0 on /var/lib/www/html type ext4 (rw,relatime,stripe=16)
root@csirbd-demo-pod:/# df -h /var/lib/www/html/
Filesystem      Size  Used Avail Use% Mounted on
/dev/rbd0       976M  2.6M  958M   1% /var/lib/www/html
root@csirbd-demo-pod:/# fstrim /var/lib/www/html/
fstrim: /var/lib/www/html/: FITRIM ioctl failed: Operation not permitted
root@csirbd-demo-pod:/# 

mountOptions has no effect.

@Madhu-1

Thanks for your response @mywkaa. I am running an OKD 4.5 cluster, so Kubernetes 1.18.3. I have finally upgraded to Rook 1.4.4, which also upgraded Ceph CSI to 3.x. I have also recreated my storage class with the discard option, since fstrim was not working. With all of the above I can see that OSD space is freeing up. I'm just not sure yet of the performance impact of mounting with the discard option.

This is expected. Ceph images are sparse, so storage usage increases when you create files. If you want to reclaim space, try the fstrim command or mount the filesystem with discard (we might decide to do this by default). I think ceph-csi has filesystem options you can use while creating your StorageClass; try using mountOptions and setting discard.
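
Concretely, that means setting mountOptions on the StorageClass, roughly like this (an illustrative excerpt only; the pool, image, and secret parameters shown in the full StorageClass earlier in the thread are omitted here):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
mountOptions:
  - discard        # mount the filesystem with -o discard so it issues discards for deleted blocks
provisioner: rook-ceph.rbd.csi.ceph.com
reclaimPolicy: Delete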

@leseb I’ve just tried this on Rook v1.1.0 with a CephFS StorageClass using the following mountOptions, and it did successfully reclaim the space.

mountOptions:
- discard

However, I prefer RBD to CephFS, so I tried the same mountOptions for an RBD StorageClass, but it did not work. Are you aware of any way to work around this at the moment for RBD? Or any plan to support space reclamation for RBD soon?

Same problem here on my test cluster: Rook 1.3.11, Ceph 15.2.5, CSI 2.1.2.

I tried fstrim from the host machine, which ran correctly and even reported: 320.6 GiB (344195588096 bytes) trimmed. However, the OSD is still showing as full; no change in space is reflected.

What are we missing?
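
One way to check whether a trim actually reached Ceph is to compare pool and per-image usage from the toolbox before and after the fstrim, for example (the pool name replicapool is taken from the original report and may differ in your cluster):

sh-4.2# ceph df                 # per-pool stored vs. available space
sh-4.2# rbd du -p replicapool   # provisioned vs. actually used size per RBD image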

What version of k8s are you using? For me, updating to 1.18 helped.

@Madhu-1 @travisn I just checked the StorageClass YAMLs in rook’s repo and I have not seen the mountOptions (or similar) field in them.

I think we should add mountOptions with the discard flag as a commented-out example, though I’m strongly inclined to have mountOptions: ["discard"] enabled by default on the example StorageClasses.
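
As a sketch, the commented-out example in the StorageClass manifests could look like this (illustrative only):

# Uncomment to mount volumes with the discard option so deleted blocks are released back to Ceph
# mountOptions:
#   - discard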

I was thinking it was already there, but it’s not present. Yes, we can add it.

You can specify mount options in the StorageClass; discard is just an example.

https://github.com/ceph/ceph-csi/pull/848 should fix the issue for rbd.

Ping @Madhu-1, has this been addressed in the latest Ceph CSI release?