kubernetes: Unable to bind another pvc to a volume after original pvc is deleted
What happened: When I have a volume bound to a PVC and then delete the PVC, the volume goes to "Released", but the claimRef never gets removed from the volume, which seems to prevent me from ever binding another claim to it.
What you expected to happen: After I delete a PVC that is bound to a volume, I expect the claimRef to no longer hold the deleted PVC's information, so that I can bind a different PVC to the volume.
How to reproduce it (as minimally and precisely as possible): First, create a StorageClass, a volume, and a PVC bound to it:
$ kubectl apply -f - << EOF
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: sc-test
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: Immediate
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv-test
spec:
  storageClassName: sc-test
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  local:
    path: /tmp/pv-test
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k8s-worker-1
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-test
spec:
  storageClassName: sc-test
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
EOF
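As a sanity check before going further, both objects should reach the Bound phase once the PV controller processes them (jsonpath prints just the phase field):

$ kubectl get pv pv-test -o jsonpath='{.status.phase}{"\n"}'
$ kubectl get pvc pvc-test -o jsonpath='{.status.phase}{"\n"}'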
Then delete the PVC…
$ kubectl delete pvc pvc-test
Check the volume (it looks fine, status is Released)…
$ kubectl get pv pv-test
NAME      CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM              STORAGECLASS   REASON   AGE
pv-test   2Gi        RWO            Retain           Released   default/pvc-test   sc-test                 9s
Check the claimRef field, and it still has the information from pvc-test, which was deleted…
$ kubectl get pv pv-test -o=json | jq .spec.claimRef
{
  "apiVersion": "v1",
  "kind": "PersistentVolumeClaim",
  "name": "pvc-test",
  "namespace": "default",
  "resourceVersion": "631479",
  "uid": "3c536cd4-656d-454d-b69a-8343e42f5d4b"
}
I waited a while, thinking maybe the garbage collector or some other mechanism would clean this up, but nothing does.
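One way to confirm nothing ever reclaims it is to watch the volume; with -w, kubectl streams updates, and the phase should stay Released indefinitely if nothing cleans it up:

$ kubectl get pv pv-test -w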
Then, if I try to bind a different claim to the volume, it gets stuck in Pending status…
$ kubectl apply -f - << EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-test2
spec:
  storageClassName: sc-test
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
EOF
Check after a while…
$ kubectl get pvc
NAME        STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-test2   Pending                                      sc-test        5m30s
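To dig into why it stays Pending, the PVC's events can be inspected (the exact event wording varies by version, so I won't paste it here):

$ kubectl describe pvc pvc-test2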
Check the claimRef field again, and it is still referencing the original claim, pvc-test, which was deleted…
$ kubectl get pv pv-test -o=json | jq .spec.claimRef
{
  "apiVersion": "v1",
  "kind": "PersistentVolumeClaim",
  "name": "pvc-test",
  "namespace": "default",
  "resourceVersion": "631479",
  "uid": "3c536cd4-656d-454d-b69a-8343e42f5d4b"
}
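As I understand it, binding is keyed on the claimRef uid, so the stale uid on the PV can never match the new claim. The mismatch is easy to see by comparing the two uids directly (standard jsonpath fields, shown here just as an illustrative check):

$ kubectl get pv pv-test -o jsonpath='{.spec.claimRef.uid}{"\n"}'
$ kubectl get pvc pvc-test2 -o jsonpath='{.metadata.uid}{"\n"}'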
Anything else we need to know?: I found these similar issues from 4 years ago:
- https://github.com/kubernetes/kubernetes/issues/20753
- https://github.com/kubernetes/kubernetes/issues/27164
But I don’t think it was ever really fixed.
The workaround suggested in one of those issues was to patch the PV like this:
kubectl patch pv pv-test -p '{"spec":{"claimRef": null}}'
…but that doesn’t seem like an ideal solution long-term.
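For what it's worth, the patch can be scripted across all Released volumes; this is just a sketch, and it assumes jq is installed and that blindly clearing claimRef is acceptable for every Released PV in the cluster:

$ kubectl get pv -o json \
    | jq -r '.items[] | select(.status.phase == "Released") | .metadata.name' \
    | while read -r pv; do
        # Clear the stale claimRef so the PV returns to Available
        kubectl patch pv "$pv" -p '{"spec":{"claimRef": null}}'
      done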
I also suspect this problem might be responsible for some of the storage e2e test flakes that have been occurring, where pods never become ready. I ran into it while troubleshooting e2e flakes, and while it is hard to know for sure whether this is the cause, the symptoms (timeouts) are similar.
Environment:
- Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.0", GitCommit:"9e991415386e4cf155a24b1da15becaa390438d8", GitTreeState:"clean", BuildDate:"2020-03-25T14:58:59Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19+", GitVersion:"v1.19.0-beta.1-55-gb8b4186a14045a.dev-1591390939", GitCommit:"b8b4186a14045ab66b150b5a92276d02b8a73a3e", GitTreeState:"clean", BuildDate:"2020-06-05T21:02:50Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
- Cloud provider or hardware configuration: On-prem self-hosted cluster, but I also tested using kind
- OS (e.g. cat /etc/os-release):
NAME="Ubuntu"
VERSION="20.04 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
- Kernel (e.g. uname -a):
Linux k8s-master 5.4.0-33-generic #37-Ubuntu SMP Thu May 21 12:53:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
- Install tools:
- Network plugin and version (if this is a network-related bug):
- Others:
A comment from the thread: I have experienced the same issue and found a temporary workaround: deleting the problematic PersistentVolume (PV) and reapplying it. Doing so clears the claimRef, so the volume can be used by a new PersistentVolumeClaim (PVC). I agree that a more permanent solution to clear the claimRef should be implemented.
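A minimal sketch of that delete-and-reapply workaround, assuming the original manifest (which has no claimRef) is still available as pv-test.yaml; note that re-applying a copy exported with kubectl get -o yaml would carry the stale claimRef back in:

$ kubectl delete pv pv-test
$ kubectl apply -f pv-test.yaml   # original manifest, no claimRef set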