kubernetes: PV is stuck at terminating after PVC is deleted
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened:
I was testing the EBS CSI driver. I created a PV through a PVC, then deleted the PVC. However, the PV is stuck in the Terminating state, even though both the PVC and the underlying EBS volume were deleted without any issue. The CSI driver keeps being called with DeleteVolume, even though it returns success when the volume is not found (because it is already gone).
CSI Driver log:
I1011 20:37:29.778380 1 controller.go:175] ControllerGetCapabilities: called with args &csi.ControllerGetCapabilitiesRequest{XXX_NoUnkeyedLiteral:struct {}{}, XXX_unrecognized:[]uint8(nil), XXX_sizecache:0}
I1011 20:37:29.780575 1 controller.go:91] DeleteVolume: called with args: &csi.DeleteVolumeRequest{VolumeId:"vol-0ea6117ddb69e78fb", ControllerDeleteSecrets:map[string]string(nil), XXX_NoUnkeyedLiteral:struct {}{}, XXX_unrecognized:[]uint8(nil), XXX_sizecache:0}
I1011 20:37:29.930091 1 controller.go:99] DeleteVolume: volume not found, returning with success
external attacher log:
I1011 19:15:14.931769 1 controller.go:167] Started VA processing "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 19:15:14.931794 1 csi_handler.go:76] CSIHandler: processing VA "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 19:15:14.931808 1 csi_handler.go:103] Attaching "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 19:15:14.931823 1 csi_handler.go:208] Starting attach operation for "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 19:15:14.931905 1 csi_handler.go:179] PV finalizer is already set on "pvc-069128c6ccdc11e8"
I1011 19:15:14.931947 1 csi_handler.go:156] VA finalizer is already set on "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 19:15:14.931962 1 connection.go:235] GRPC call: /csi.v0.Controller/ControllerPublishVolume
I1011 19:15:14.931966 1 connection.go:236] GRPC request: volume_id:"vol-0ea6117ddb69e78fb" node_id:"i-06d0e08c9565c4db7" volume_capability:<mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > > volume_attributes:<key:"storage.kubernetes.io/csiProvisionerIdentity" value:"1539123546345-8081-com.amazon.aws.csi.ebs" >
I1011 19:15:14.935053 1 controller.go:197] Started PV processing "pvc-069128c6ccdc11e8"
I1011 19:15:14.935072 1 csi_handler.go:350] CSIHandler: processing PV "pvc-069128c6ccdc11e8"
I1011 19:15:14.935106 1 csi_handler.go:386] CSIHandler: processing PV "pvc-069128c6ccdc11e8": VA "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53" found
I1011 19:15:14.952590 1 controller.go:197] Started PV processing "pvc-069128c6ccdc11e8"
I1011 19:15:14.952613 1 csi_handler.go:350] CSIHandler: processing PV "pvc-069128c6ccdc11e8"
I1011 19:15:14.952654 1 csi_handler.go:386] CSIHandler: processing PV "pvc-069128c6ccdc11e8": VA "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53" found
I1011 19:15:15.048026 1 controller.go:197] Started PV processing "pvc-069128c6ccdc11e8"
I1011 19:15:15.048048 1 csi_handler.go:350] CSIHandler: processing PV "pvc-069128c6ccdc11e8"
I1011 19:15:15.048167 1 csi_handler.go:386] CSIHandler: processing PV "pvc-069128c6ccdc11e8": VA "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53" found
I1011 19:15:15.269955 1 connection.go:238] GRPC response:
I1011 19:15:15.269986 1 connection.go:239] GRPC error: rpc error: code = Internal desc = Could not attach volume "vol-0ea6117ddb69e78fb" to node "i-06d0e08c9565c4db7": could not attach volume "vol-0ea6117ddb69e78fb" to node "i-06d0e08c9565c4db7": InvalidVolume.NotFound: The volume 'vol-0ea6117ddb69e78fb' does not exist.
status code: 400, request id: 634b33d1-71cb-4901-8ee0-98933d2a5b47
I1011 19:15:15.269998 1 csi_handler.go:320] Saving attach error to "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 19:15:15.274440 1 csi_handler.go:330] Saved attach error to "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 19:15:15.274464 1 csi_handler.go:86] Error processing "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53": failed to attach: rpc error: code = Internal desc = Could not attach volume "vol-0ea6117ddb69e78fb" to node "i-06d0e08c9565c4db7": could not attach volume "vol-0ea6117ddb69e78fb" to node "i-06d0e08c9565c4db7": InvalidVolume.NotFound: The volume 'vol-0ea6117ddb69e78fb' does not exist.
status code: 400, request id: 634b33d1-71cb-4901-8ee0-98933d2a5b47
I1011 19:15:15.274505 1 controller.go:167] Started VA processing "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 19:15:15.274516 1 csi_handler.go:76] CSIHandler: processing VA "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 19:15:15.274522 1 csi_handler.go:103] Attaching "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 19:15:15.274528 1 csi_handler.go:208] Starting attach operation for "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 19:15:15.274536 1 csi_handler.go:320] Saving attach error to "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 19:15:15.278318 1 csi_handler.go:330] Saved attach error to "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 19:15:15.278339 1 csi_handler.go:86] Error processing "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53": failed to attach: PersistentVolume "pvc-069128c6ccdc11e8" is marked for deletion
I1011 20:37:23.328696 1 controller.go:167] Started VA processing "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 20:37:23.328709 1 csi_handler.go:76] CSIHandler: processing VA "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 20:37:23.328715 1 csi_handler.go:103] Attaching "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 20:37:23.328721 1 csi_handler.go:208] Starting attach operation for "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 20:37:23.328730 1 csi_handler.go:320] Saving attach error to "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 20:37:23.330919 1 reflector.go:286] github.com/kubernetes-csi/external-attacher/vendor/k8s.io/client-go/informers/factory.go:87: forcing resync
I1011 20:37:23.330975 1 controller.go:197] Started PV processing "pvc-069128c6ccdc11e8"
I1011 20:37:23.330990 1 csi_handler.go:350] CSIHandler: processing PV "pvc-069128c6ccdc11e8"
I1011 20:37:23.331030 1 csi_handler.go:386] CSIHandler: processing PV "pvc-069128c6ccdc11e8": VA "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53" found
I1011 20:37:23.346007 1 csi_handler.go:330] Saved attach error to "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 20:37:23.346033 1 csi_handler.go:86] Error processing "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53": failed to attach: PersistentVolume "pvc-069128c6ccdc11e8" is marked for deletion
I1011 20:37:23.346069 1 controller.go:167] Started VA processing "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 20:37:23.346077 1 csi_handler.go:76] CSIHandler: processing VA "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 20:37:23.346082 1 csi_handler.go:103] Attaching "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 20:37:23.346088 1 csi_handler.go:208] Starting attach operation for "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 20:37:23.346096 1 csi_handler.go:320] Saving attach error to "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 20:37:23.351068 1 csi_handler.go:330] Saved attach error to "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53"
I1011 20:37:23.351090 1 csi_handler.go:86] Error processing "csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53": failed to attach: PersistentVolume "pvc-069128c6ccdc11e8" is marked for deletion
>> kk get pv
NAME                   CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS        CLAIM            STORAGECLASS   REASON   AGE
pvc-069128c6ccdc11e8   4Gi        RWO            Delete           Terminating   default/claim1   late-sc                 22h
>> kk describe pv
Name: pvc-069128c6ccdc11e8
Labels: <none>
Annotations: pv.kubernetes.io/provisioned-by: com.amazon.aws.csi.ebs
Finalizers: [external-attacher/com-amazon-aws-csi-ebs]
StorageClass: late-sc
Status: Terminating (lasts <invalid>)
Claim: default/claim1
Reclaim Policy: Delete
Access Modes: RWO
Capacity: 4Gi
Node Affinity: <none>
Message:
Source:
Type: CSI (a Container Storage Interface (CSI) volume source)
Driver: com.amazon.aws.csi.ebs
VolumeHandle: vol-0ea6117ddb69e78fb
ReadOnly: false
VolumeAttributes: storage.kubernetes.io/csiProvisionerIdentity=1539123546345-8081-com.amazon.aws.csi.ebs
Events: <none>
StorageClass:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: late-sc
provisioner: com.amazon.aws.csi.ebs
volumeBindingMode: WaitForFirstConsumer
Claim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim1
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: late-sc
  resources:
    requests:
      storage: 4Gi
What you expected to happen: After the PVC is deleted, the PV should be deleted along with the EBS volume (since my reclaim policy is Delete).
How to reproduce it (as minimally and precisely as possible): Non-deterministic so far
Anything else we need to know?:
Environment:
- Kubernetes version (use kubectl version): client v1.12.0, server v1.12.1
- Cloud provider or hardware configuration: aws
- OS (e.g. from /etc/os-release):
- Kernel (e.g. uname -a):
- Install tools: cluster is set up using kops
- Others:
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Reactions: 66
- Comments: 76 (23 by maintainers)
I got rid of this issue by performing the following actions:
I manually edited each pv individually and removed the finalizers, which looked something like this (examples of removing the finalizers):
kubectl patch pvc db-pv-claim -p '{"metadata":{"finalizers":null}}'
kubectl patch pod db-74755f6698-8td72 -p '{"metadata":{"finalizers":null}}'
Then you can delete them.
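For the PV itself the same kind of patch works; a hedged example using the PV name from this issue (substitute your own PV name):

kubectl patch pv pvc-069128c6ccdc11e8 -p '{"metadata":{"finalizers":null}}'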
!!! IMPORTANT !!!: Read also https://github.com/kubernetes/kubernetes/issues/78106. The patch commands are a workaround and something is not working properly: the volumes are still attached. Check with kubectl get volumeattachments!
The answer of @chandraprakash1392 is still valid when the pvc is also stuck in Terminating status. You just need to edit the pvc object and remove the finalizers field, then check with kubectl get volumeattachment.
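A hedged sketch of that check, using the VolumeAttachment name from the logs in this issue (look up your own attachment name first); note this is the same kind of workaround, not a fix:

kubectl get volumeattachments
# If an attachment still references the deleted PV, clearing its finalizers lets it be removed too
kubectl patch volumeattachment csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53 -p '{"metadata":{"finalizers":null}}'
kubectl delete volumeattachment csi-3b15269e725f727786c5aec3b4da3f2eebc2477dec53d3480a3fe1dd01adea53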
Thank you, it worked for me.
Removing the finalizers is just a workaround. @bertinatto @leakingtapan could you help repro this issue and save detailed CSI driver and controller-manager logs?
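A hedged sketch of how those logs could be collected; the pod and container names below are illustrative and depend on how the CSI driver and the cluster were deployed:

# CSI controller pod: the driver container plus sidecars such as the external-attacher
kubectl logs -n kube-system ebs-csi-controller-0 -c ebs-plugin > csi-driver.log
kubectl logs -n kube-system ebs-csi-controller-0 -c csi-attacher > external-attacher.log
# kube-controller-manager (on a kops cluster it runs as a static pod on the master node)
kubectl logs -n kube-system kube-controller-manager-<master-node-name> > kube-controller-manager.log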
A more hands-free option, in case someone finds it useful: find the PVCs which are in Terminating and patch them in a loop, then check to see if anything still remains. A sketch follows below.
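A minimal sketch of such a loop, assuming bash and jq are available; it patches every PVC whose deletionTimestamp is set, i.e. every PVC stuck in Terminating:

for pvc in $(kubectl get pvc --all-namespaces -o json \
  | jq -r '.items[] | select(.metadata.deletionTimestamp != null) | "\(.metadata.namespace)/\(.metadata.name)"'); do
  ns="${pvc%%/*}"    # namespace part of namespace/name
  name="${pvc##*/}"  # name part of namespace/name
  kubectl patch pvc "$name" -n "$ns" -p '{"metadata":{"finalizers":null}}'
done

# Check to see if anything still remains
kubectl get pvc --all-namespaces | grep Terminating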
Still happening with 1.21: PVs stuck in Terminating after deleting the corresponding pods, PVs and volumeattachments. Storage driver is Longhorn. What worked was patching the finalizers manually.
I had multiple PVCs stuck in Terminating status.
kubectl describe pvc <pvcname>   (to get the pod it is attached to)
kubectl patch pvc <pvcname> -p '{"metadata":{"finalizers":null}}'
kubectl patch pod <podname> -p '{"metadata":{"finalizers":null}}'
This worked in my K8s cluster.
Thanks for posting these commands to get rid of the stuck PV:
kubectl get pv    (to find <pvname>)
kubectl patch pv <pvname> -p "{\"metadata\":{\"finalizers\":null}}"
Any real solution so far? I remember seeing this issue in the first months of 2019, and I still see patching the finalizers as the only workaround. Unfortunately, this "bug" prohibits me from automating certain stuff.
Does someone have a diagnosis for this? Any insights? After 7 months, this thing is still in the wild and I have no more information than I did 7 months ago.
If there are pods or jobs currently using the PV, I found that deleting them solves the issue for me.
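A hedged way to find such pods, assuming jq is available (the claim name claim1 is just the one from this issue):

kubectl get pods --all-namespaces -o json \
  | jq -r '.items[]
      | select(any(.spec.volumes[]?; .persistentVolumeClaim.claimName == "claim1"))
      | "\(.metadata.namespace)/\(.metadata.name)"'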
Use the command below to edit the PV and then delete the finalizers entry in the definition:
kubectl edit pv <pv-name-id>
Same here, had to delete the finalizer, here’s a describe for the pv:
[root@ip-172-31-44-98 stateful]# k describe pv pvc-1c6625e2-1157-11e9-a8fc-0275b365cbce
Name:            pvc-1c6625e2-1157-11e9-a8fc-0275b365cbce
Labels:          failure-domain.beta.kubernetes.io/region=us-east-1
                 failure-domain.beta.kubernetes.io/zone=us-east-1a
Annotations:     kubernetes.io/createdby: aws-ebs-dynamic-provisioner
                 pv.kubernetes.io/bound-by-controller: yes
                 pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    default
Status:          Terminating (lasts <invalid>)
Claim:           monitoring/storage-es-data-0
Reclaim Policy:  Delete
Access Modes:    RWO
Capacity:        12Gi
Node Affinity:   <none>
Message:
Source:
    Type:       AWSElasticBlockStore (a Persistent Disk resource in AWS)
    VolumeID:   aws://us-east-1a/vol-0a20e4f50b60df855
    FSType:     ext4
    Partition:  0
    ReadOnly:   false
Events:          <none>
We’ve hit what seems to be the same problem using Amazon EBS (we have all the symptoms at least). Should this be raised as a new issue?
No, when this happens to my PV, the pod no longer exists.
Solved the issue of PVC and PV stuck in "Terminating" state by deleting the pod using it.
Will we eventually stop having to play around with these finalizers and manual edits?
Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
I have encountered this issue two more times now, all on v1.12.