secrets-store-csi-driver-provider-azure: MountVolume.SetUp failed write tcp 172.17.0.1:58904->172.17.168.1:443: write: broken pipe

What steps did you take and what happened: Upgraded from chart version 0.0.13 to 0.0.14. No issues the first couple of days. But now during a helm deployment, it’s trying to get secretproviderclass, but it’s unable to get it and times out which leads to failing the deployment.

What did you expect to happen: deploy without issues

Anything else you would like to add: it doesn’t seem consistent, sometimes it takes 2 times to redeploy to work around this issue and sometimes it requires more than 2 times to redeploy.

Which access mode did you use to access the Azure Key Vault instance: Service Principal

Environment: dev/sit/uat

Logs from:

  • kubectl describe po -n xxx-xxx-xxx-dev xxx-xxx-xxx-dev-job-db-schema-operations-pre-qf6dq | grep Failed: Warning FailedMount 55m (x13 over 98m) kubelet Unable to attach or mount volumes: unmounted volumes=[secrets-store], unattached volumes=[secrets-store configuration default-token-hmkzz]: timed out waiting for the condition Warning FailedMount 19m (x10 over 92m) kubelet Unable to attach or mount volumes: unmounted volumes=[secrets-store], unattached volumes=[configuration default-token-hmkzz secrets-store]: timed out waiting for the condition Warning FailedMount 10m (x6 over 83m) kubelet Unable to attach or mount volumes: unmounted volumes=[secrets-store], unattached volumes=[default-token-hmkzz secrets-store configuration]: timed out waiting for the condition Warning FailedMount 5m15s (x55 over 100m) kubelet MountVolume.SetUp failed for volume “secrets-store” : kubernetes.io/csi: mounter.SetupAt failed: rpc error: code = Unknown desc = failed to get secretproviderclass xxx-xxx-xxx-dev/secret-store, error: Get https://172.17.168.1:443/apis/secrets-store.csi.x-k8s.io/v1alpha1/namespaces/xxx-xxx-xxx-dev/secretproviderclasses/secret-store: write tcp 172.17.0.1:58904->172.17.168.1:443: write: broken pipe
  • kubectl get secretproviderclasses.secrets-store.csi.x-k8s.io -n xxx-xxx-xxx-dev: NAME AGE secret-store 5d8h
  • kubectl logs -n csi csi-secrets-store-provider-azure-1605689668-secrets-store-9s8wx secrets-store | grep xxx-xxx-xxx-dev | tail -n 10: E1124 16:55:48.018047 1 event.go:273] Unable to write event: ‘Patch https://172.17.168.1:443/api/v1/namespaces/xxx-xxx-xxx-dev/events/xxx-xxx-xxx-dev-cron-5f74bdb9c8-zg46f.164a6b5779bcf0b4: write tcp 172.17.0.1:58904->172.17.168.1:443: write: broken pipe’ (may retry after sleeping) E1124 16:55:58.018589 1 event.go:273] Unable to write event: ‘Patch https://172.17.168.1:443/api/v1/namespaces/xxx-xxx-xxx-dev/events/xxx-xxx-xxx-dev-cron-5f74bdb9c8-zg46f.164a6b5779bcf0b4: write tcp 172.17.0.1:58904->172.17.168.1:443: write: broken pipe’ (may retry after sleeping) E1124 16:55:58.018642 1 event.go:218] Unable to write event ‘&v1.Event{TypeMeta:v1.TypeMeta{Kind:“”, APIVersion:“”}, ObjectMeta:v1.ObjectMeta{Name:“xxx-xxx-xxx-dev-cron-5f74bdb9c8-zg46f.164a7a30b6515262”, GenerateName:“”, Namespace:“xxx-xxx-xxx-dev”, SelfLink:“”, UID:“”, ResourceVersion:“”, Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:“”, ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:“Pod”, Namespace:“xxx-xxx-xxx-dev”, Name:“xxx-xxx-xxx-dev-cron-5f74bdb9c8-zg46f”, UID:“339855dc-afeb-4b64-bee4-8d654dd1539f”, APIVersion:“v1”, ResourceVersion:“22089452”, FieldPath:“”}, Reason:“FailedToCreateSecret”, Message:“failed to get mounted files, err: failed to list all files in target path /var/lib/kubelet/pods/339855dc-afeb-4b64-bee4-8d654dd1539f/volumes/kubernetes.io~csi/secrets-store/mount, err: open /var/lib/kubelet/pods/339855dc-afeb-4b64-bee4-8d654dd1539f/volumes/kubernetes.io~csi/secrets-store/mount: no such file or directory”, Source:v1.EventSource{Component:“csi-secrets-store-controller”, Host:“”}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbfe76669ac835662, ext:540880034792846, loc:(*time.Location)(0x272ec80)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbfe76669ac835662, ext:540880034792846, loc:(*time.Location)(0x272ec80)}}, Count:1, Type:“Warning”, EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:“”, Related:(*v1.ObjectReference)(nil), ReportingController:“”, ReportingInstance:“”}’ (retry limit exceeded!) E1124 16:55:58.019152 1 event.go:273] Unable to write event: ‘Patch https://172.17.168.1:443/api/v1/namespaces/xxx-xxx-xxx-dev/events/xxx-xxx-xxx-dev-app-898545db9-szbqw.164a6b57b564bb80: write tcp 172.17.0.1:58904->172.17.168.1:443: write: broken pipe’ (may retry after sleeping) E1124 16:56:00.082175 1 event.go:273] Unable to write event: ‘Patch https://172.17.168.1:443/api/v1/namespaces/xxx-xxx-xxx-dev/events/xxx-xxx-xxx-dev-app-898545db9-szbqw.164a6b57b564bb80: write tcp 172.17.0.1:58904->172.17.168.1:443: write: broken pipe’ (may retry after sleeping) E1124 16:56:10.082834 1 event.go:273] Unable to write event: ‘Patch https://172.17.168.1:443/api/v1/namespaces/xxx-xxx-xxx-dev/events/xxx-xxx-xxx-dev-app-898545db9-szbqw.164a6b57b564bb80: write tcp 172.17.0.1:58904->172.17.168.1:443: write: broken pipe’ (may retry after sleeping) E1124 16:56:20.083179 1 event.go:273] Unable to write event: ‘Patch https://172.17.168.1:443/api/v1/namespaces/xxx-xxx-xxx-dev/events/xxx-xxx-xxx-dev-app-898545db9-szbqw.164a6b57b564bb80: write tcp 172.17.0.1:58904->172.17.168.1:443: write: broken pipe’ (may retry after sleeping) E1124 16:56:30.083602 1 event.go:273] Unable to write event: ‘Patch https://172.17.168.1:443/api/v1/namespaces/xxx-xxx-xxx-dev/events/xxx-xxx-xxx-dev-app-898545db9-szbqw.164a6b57b564bb80: write tcp 172.17.0.1:58904->172.17.168.1:443: write: broken pipe’ (may retry after sleeping) E1124 16:56:40.084140 1 event.go:273] Unable to write event: ‘Patch https://172.17.168.1:443/api/v1/namespaces/xxx-xxx-xxx-dev/events/xxx-xxx-xxx-dev-app-898545db9-szbqw.164a6b57b564bb80: write tcp 172.17.0.1:58904->172.17.168.1:443: write: broken pipe’ (may retry after sleeping) E1124 16:56:50.084711 1 event.go:273] Unable to write event: ‘Patch https://172.17.168.1:443/api/v1/namespaces/xxx-xxx-xxx-dev/events/xxx-xxx-xxx-dev-app-898545db9-szbqw.164a6b57b564bb80: write tcp 172.17.0.1:58904->172.17.168.1:443: write: broken pipe’ (may retry after sleeping)
  • kubectl get svc kubernetes: NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kubernetes ClusterIP 172.17.168.1 <none> 443/TCP 61d

Specs: Helm Chart Version: 0.0.14 Secrets Store CSI Driver version: (use the image tag): mcr.microsoft.com/oss/kubernetes-csi/secrets-store/driver:v0.0.17 Azure Key Vault provider version: (use the image tag): mcr.microsoft.com/oss/azure/secrets-store/provider-azure:0.0.10 Kubernetes version: (use kubectl version and kubectl get nodes -o wide): v1.17.9 Cluster type: (e.g. AKS, aks-engine, etc): AKS

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 16 (7 by maintainers)

Most upvoted comments

@aramase I have encountered the same issue just now. The resolution was to force delete the CSI pods and delete and install the Helm chart again. The below is from the CSI secrets-store logs:

time="2020-12-18T02:08:35Z" level=debug msg="GRPC request: {\"target_path\":\"/var/lib/kubelet/pods/5ed46222-93f3-4b64-88c8-891d22104807/volumes/kubernetes.io~csi/secret/mount\",\"volume_id\":\"csi-6c722ccc3b46dbb318533d4addb7731b752d8f3f87750cd61c3ecab88aaf33ec\"}"
W1218 02:08:35.299117       1 mount_helper_common.go:65] Warning: "/var/lib/kubelet/pods/5ed46222-93f3-4b64-88c8-891d22104807/volumes/kubernetes.io~csi/secret/mount" is not a mountpoint, deleting
time="2020-12-18T02:08:35Z" level=error msg="error cleaning and unmounting target path /var/lib/kubelet/pods/5ed46222-93f3-4b64-88c8-891d22104807/volumes/kubernetes.io~csi/secret/mount, err: remove /var/lib/kubelet/pods/5ed46222-93f3-4b64-88c8-891d22104807/volumes/kubernetes.io~csi/secret/mount: directory not empty for pod: 5ed46222-93f3-4b64-88c8-891d22104807"
time="2020-12-18T02:08:35Z" level=error msg="GRPC error: rpc error: code = Internal desc = remove /var/lib/kubelet/pods/5ed46222-93f3-4b64-88c8-891d22104807/volumes/kubernetes.io~csi/secret/mount: directory not empty"

And the pod logs:

MountVolume.SetUp failed for volume "app-secret" : kubernetes.io/csi: mounter.SetupAt failed: rpc error: code = Unknown desc = failed to get secretproviderclass app/app-spc, error: Get https://10.172.26.1:443/apis/secrets-store.csi.x-k8s.io/v1alpha1/namespaces/app/secretproviderclasses/app-spc: write tcp 172.26.8.106:59578->10.172.26.1:443: write: broken pipe 

UPDATE: Hope this additional info helps. We are running different CSI charts on production and pre-production clusters:

production

NAME                            	NAMESPACE              	REVISION	UPDATED                                	STATUS  	CHART                                  	APP VERSION
csi-secrets-store-provider-azure	csi                    	1       	2020-08-30 21:13:03.912034865 +0000 UTC	deployed	csi-secrets-store-provider-azure-0.0.10	0.0.8

pre-production

NAME                            	NAMESPACE              	REVISION	UPDATED                                	STATUS  	CHART                                  	APP VERSION
csi-secrets-store-provider-azure	csi                    	3       	2020-12-03 06:47:52.953026192 +0000 UTC	deployed	csi-secrets-store-provider-azure-0.0.13	0.0.9

Only the pre-production AKS cluster suffered from this issue. I have deployed the recent version for secret rotation feature.