kubernetes: Extremely slow detach/attach via controller-manager (csi_attacher)
What happened:
Hi, after upgrading to 1.17 we started having the following problem: pod migration takes ages whenever a CSI volume is attached. Example:
controller-manager v1.16.4 (17 seconds):
I1215 10:26:19.538523 1 event.go:274] Event(v1.ObjectReference{Kind:"StatefulSet", Namespace:"hosting", Name:"hc1-md42-s1", UID:"7a67a484-39ac-11e9-bffa-001999d764e2", APIVersion:"apps/v1", ResourceVersion:"2118927898", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' create Pod hc1-md42-s1-0 in StatefulSet hc1-md42-s1 successful
W1215 10:26:19.735797 1 reconciler.go:328] Multi-Attach error for volume "hosting-vol-data-mariadb-hc1-md42-1" (UniqueName: "kubernetes.io/csi/linstor.csi.linbit.com^hosting-vol-data-mariadb-hc1-md42-1") from node "m11c15" Volume is already exclusively attached to node m8c26 and can't be attached to another
I1215 10:26:19.735933 1 event.go:274] Event(v1.ObjectReference{Kind:"Pod", Namespace:"hosting", Name:"hc1-md42-s1-0", UID:"b85722d3-4888-41ff-bb1b-681082a6ba97", APIVersion:"v1", ResourceVersion:"2118928435", FieldPath:""}): type: 'Warning' reason: 'FailedAttachVolume' Multi-Attach error for volume "hosting-vol-data-mariadb-hc1-md42-1" Volume is already exclusively attached to one node and can't be attached to another
I1215 10:26:33.510132 1 operation_generator.go:1474] Verified volume is safe to detach for volume "hosting-vol-data-mariadb-hc1-md42-1" (UniqueName: "kubernetes.io/csi/linstor.csi.linbit.com^hosting-vol-data-mariadb-hc1-md42-1") on node "m8c26"
I1215 10:26:35.368314 1 operation_generator.go:558] DetachVolume.Detach succeeded for volume "hosting-vol-data-mariadb-hc1-md42-1" (UniqueName: "kubernetes.io/csi/linstor.csi.linbit.com^hosting-vol-data-mariadb-hc1-md42-1") on node "m8c26"
I1215 10:26:35.388870 1 reconciler.go:288] attacherDetacher.AttachVolume started for volume "hosting-vol-data-mariadb-hc1-md42-1" (UniqueName: "kubernetes.io/csi/linstor.csi.linbit.com^hosting-vol-data-mariadb-hc1-md42-1") from node "m11c15"
I1215 10:26:36.458511 1 operation_generator.go:390] AttachVolume.Attach succeeded for volume "hosting-vol-data-mariadb-hc1-md42-1" (UniqueName: "kubernetes.io/csi/linstor.csi.linbit.com^hosting-vol-data-mariadb-hc1-md42-1") from node "m11c15"
I1215 10:26:36.458643 1 event.go:274] Event(v1.ObjectReference{Kind:"Pod", Namespace:"hosting", Name:"hc1-md42-s1-0", UID:"b85722d3-4888-41ff-bb1b-681082a6ba97", APIVersion:"v1", ResourceVersion:"2118928435", FieldPath:""}): type: 'Normal' reason: 'SuccessfulAttachVolume' AttachVolume.Attach succeeded for volume "hosting-vol-data-mariadb-hc1-md42-1"
controller-manager v1.17.0 (6 minutes):
I1214 22:43:30.331528 1 event.go:281] Event(v1.ObjectReference{Kind:"StatefulSet", Namespace:"hosting", Name:"hc1-md42-s1", UID:"7a67a484-39ac-11e9-bffa-001999d764e2", APIVersion:"apps/v1", ResourceVersion:"2114111129", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' create Pod hc1-md42-s1-0 in StatefulSet hc1-md42-s1 successful
W1214 22:43:30.513481 1 reconciler.go:328] Multi-Attach error for volume "hosting-vol-data-mariadb-hc1-md42-1" (UniqueName: "kubernetes.io/csi/linstor.csi.linbit.com^hosting-vol-data-mariadb-hc1-md42-1") from node "m11c15" Volume is already exclusively attached to node m8c29 and can't be attached to another
I1214 22:43:30.513592 1 event.go:281] Event(v1.ObjectReference{Kind:"Pod", Namespace:"hosting", Name:"hc1-md42-s1-0", UID:"4f865b89-49ba-4565-9285-7e6ed1e0ed14", APIVersion:"v1", ResourceVersion:"2114111218", FieldPath:""}): type: 'Warning' reason: 'FailedAttachVolume' Multi-Attach error for volume "hosting-vol-data-mariadb-hc1-md42-1" Volume is already exclusively attached to one node and can't be attached to another
I1214 22:44:35.197812 1 reconciler.go:229] attacherDetacher.DetachVolume started for volume "hosting-vol-data-mariadb-hc1-md42-1" (UniqueName: "kubernetes.io/csi/linstor.csi.linbit.com^hosting-vol-data-mariadb-hc1-md42-1") on node "m8c29"
I1214 22:45:19.493975 1 operation_generator.go:1297] Verified volume is safe to detach for volume "hosting-vol-data-mariadb-hc1-md42-1" (UniqueName: "kubernetes.io/csi/linstor.csi.linbit.com^hosting-vol-data-mariadb-hc1-md42-1") on node "m8c29"
E1214 22:46:41.643616 1 csi_attacher.go:202] kubernetes.io/csi: attacher.VolumesAreAttached failed for attach.ID=csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175: volumeattachments.storage.k8s.io "csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175" not found
E1214 22:46:41.693657 1 csi_attacher.go:202] kubernetes.io/csi: attacher.VolumesAreAttached failed for attach.ID=csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175: volumeattachments.storage.k8s.io "csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175" not found
E1214 22:46:41.743549 1 csi_attacher.go:202] kubernetes.io/csi: attacher.VolumesAreAttached failed for attach.ID=csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175: volumeattachments.storage.k8s.io "csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175" not found
E1214 22:46:41.793700 1 csi_attacher.go:202] kubernetes.io/csi: attacher.VolumesAreAttached failed for attach.ID=csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175: volumeattachments.storage.k8s.io "csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175" not found
I1214 22:47:09.893665 1 operation_generator.go:470] DetachVolume.Detach succeeded for volume "hosting-vol-data-mariadb-hc1-md42-1" (UniqueName: "kubernetes.io/csi/linstor.csi.linbit.com^hosting-vol-data-mariadb-hc1-md42-1") on node "m8c29"
I1214 22:47:09.924794 1 reconciler.go:288] attacherDetacher.AttachVolume started for volume "hosting-vol-data-mariadb-hc1-md42-1" (UniqueName: "kubernetes.io/csi/linstor.csi.linbit.com^hosting-vol-data-mariadb-hc1-md42-1") from node "m11c15"
E1214 22:48:55.443323 1 csi_attacher.go:202] kubernetes.io/csi: attacher.VolumesAreAttached failed for attach.ID=csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175: volumeattachments.storage.k8s.io "csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175" not found
E1214 22:48:55.493130 1 csi_attacher.go:202] kubernetes.io/csi: attacher.VolumesAreAttached failed for attach.ID=csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175: volumeattachments.storage.k8s.io "csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175" not found
E1214 22:48:55.543248 1 csi_attacher.go:202] kubernetes.io/csi: attacher.VolumesAreAttached failed for attach.ID=csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175: volumeattachments.storage.k8s.io "csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175" not found
E1214 22:48:55.892983 1 csi_attacher.go:202] kubernetes.io/csi: attacher.VolumesAreAttached failed for attach.ID=csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175: volumeattachments.storage.k8s.io "csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175" not found
I1214 22:49:26.043773 1 operation_generator.go:359] AttachVolume.Attach succeeded for volume "hosting-vol-data-mariadb-hc1-md42-1" (UniqueName: "kubernetes.io/csi/linstor.csi.linbit.com^hosting-vol-data-mariadb-hc1-md42-1") from node "m11c15"
I1214 22:49:26.043872 1 event.go:281] Event(v1.ObjectReference{Kind:"Pod", Namespace:"hosting", Name:"hc1-md42-s1-0", UID:"4f865b89-49ba-4565-9285-7e6ed1e0ed14", APIVersion:"v1", ResourceVersion:"2114111218", FieldPath:""}): type: 'Normal' reason: 'SuccessfulAttachVolume' AttachVolume.Attach succeeded for volume "hosting-vol-data-mariadb-hc1-md42-1"
Due to the backoff loop, the container only starts after 10 minutes.
What you expected to happen:
- controller-manager detaches the volume from the old node immediately after the pod is removed, and attaches it to the new node immediately after detaching.
- no messages like the following:
csi_attacher.go:202] kubernetes.io/csi: attacher.VolumesAreAttached failed for attach.ID=csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175: volumeattachments.storage.k8s.io "csi-27bdc8eefc86cffa9c3b14fabe1e89d2e00a78fd37a0e08c48ef91740aec0175" not found
How to reproduce it (as minimally and precisely as possible):
- Create a StatefulSet.
- Run
  kubectl get pod -o wide | grep <statefulsetname>
  to find the node the pod is running on, then run
  kubectl cordon <node>
  kubectl delete pod <podname>
- Run
  kubectl get pod -o wide | grep <statefulsetname>
  again to find the new node the pod is scheduled to.
- The container will be stuck in the ContainerCreating state for 10 minutes.
- You can also run
  kubectl get volumeattachments.storage.k8s.io -w
  to track the detach/attach process (a consolidated sketch of these steps follows below).
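A minimal shell sketch of the same steps (illustrative only; assumes a single-replica StatefulSet whose pod is named <statefulsetname>-0):

  # Find the node the pod currently runs on
  POD=<statefulsetname>-0
  NODE=$(kubectl get pod "$POD" -o jsonpath='{.spec.nodeName}')

  # Force a reschedule to another node
  kubectl cordon "$NODE"
  kubectl delete pod "$POD"

  # Watch the detach/attach progress
  kubectl get volumeattachments.storage.k8s.io -w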
Anything else we need to know?:
Yes, we have two clusters; the staging cluster does not have this problem, unlike the production one.
Environment:
- Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.0", GitCommit:"70132b0f130acc0bed193d9ba59dd186f0e634cf", GitTreeState:"clean", BuildDate:"2019-12-07T21:20:10Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.0", GitCommit:"70132b0f130acc0bed193d9ba59dd186f0e634cf", GitTreeState:"clean", BuildDate:"2019-12-07T21:12:17Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}
- CSI version:
quay.io/k8scsi/csi-provisioner:v1.4.0
quay.io/k8scsi/csi-attacher:v2.0.0
quay.io/k8scsi/csi-snapshotter:v1.2.2
quay.io/k8scsi/csi-node-driver-registrar:v1.2.0
- Cloud provider or hardware configuration: Bare Metal
- OS: Ubuntu 18.04.2 LTS
- Kernel: 4.15.18-18-pve
- Install tools: kubeadm
- CSI plugin and version: linstor-csi v0.7.2
- Others:
About this issue
- State: closed
- Created 5 years ago
- Reactions: 5
- Comments: 47 (26 by maintainers)
Seeing the issue on our setup too, with k8s 1.17 and CSI volumes.
The following logs are seen in the controller-manager:
In progress. Planning to submit to 1.19 and then backport to earlier releases.
I’m now more sure than ever that this is in fact a bug in controller-manager. See the attached image, which shows the number of goroutines as reported by the controller-manager via Prometheus metrics:
As you can see, the number of goroutines is constantly growing over time. At the end of the graph you can see what happens when I restart the leading controller-manager: one of the remaining controllers takes over and immediately starts to pile up goroutines. I also enabled profiling and pulled the following stack dump:
pprof-goroutines.dump.gz
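To check this on your own cluster, a rough sketch (illustrative; assumes the legacy insecure port 10252 is still enabled and --profiling is left at its default of true; adjust host, port, and auth for your setup):

  # Goroutine count as exported via Prometheus metrics
  curl -s http://127.0.0.1:10252/metrics | grep '^go_goroutines'

  # Full goroutine stack dump via the pprof endpoint
  curl -s 'http://127.0.0.1:10252/debug/pprof/goroutine?debug=2' > pprof-goroutines.dump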
@jsafrane volume attachment works fine with controller-manager v1.16.4 and is very slow with controller-manager v1.17.0; what is the difference?
No, the CSI driver detaches the volume almost immediately if I run
kubectl delete volumeattachment <attachment-name>
manually. The same goes for attaching: the driver handles new volumeattachment requests quite fast. The problem is that controller-manager is not creating these requests in time.
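For example (illustrative; <pv-name> and <attachment-name> are placeholders):

  # Find the VolumeAttachment that belongs to the PV, then delete it by hand
  kubectl get volumeattachments.storage.k8s.io | grep <pv-name>
  kubectl delete volumeattachments.storage.k8s.io <attachment-name>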
This change to client-go that enabled throttling for erroneously unthrottled clients probably explains why we see increased throttling of controller-manager in 1.17: https://github.com/kubernetes/kubernetes/pull/86168
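If client-side throttling is indeed the cause, one possible stopgap (not a confirmed fix; the values below are arbitrary examples) is to raise the controller-manager's client rate limits, e.g. in the kubeadm static pod manifest /etc/kubernetes/manifests/kube-controller-manager.yaml:

  # defaults are --kube-api-qps=20 and --kube-api-burst=30
  - --kube-api-qps=100
  - --kube-api-burst=200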
I think the fix is the same. We should change the VerifyVolumesAttached() method to use informers instead of doing a GET on every volumeattachment object every minute. @yuga711 is working on a fix.
@msau42, thanks for pointing that out. I have 803 PVs; 588 of them are CSI.
If I downgrade controller-manager to v1.16.9, reattachment works fine.
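For anyone wanting to try the same downgrade on a kubeadm-managed control plane, roughly (illustrative; double-check the image tag in your manifest first):

  # kubelet picks up the manifest change and restarts the static pod automatically
  sed -i 's/kube-controller-manager:v1.17.0/kube-controller-manager:v1.16.9/' \
      /etc/kubernetes/manifests/kube-controller-manager.yaml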
cc @davidz627 I believe there were some changes around the verified volumes attached area recently.
Has anybody at least tried to confirm the issue?