kubernetes: Unable to attach or mount volumes: timed out waiting for the condition
What happened?
PVC mounts time out when the number of parallel pod requests using the same PVC goes past 400. This causes repeated PVC mount retries (several of them in some cases) and delays pod startup.
3m1s Warning FailedMount pod/nginx-deployment-7c54456f-2sk89 Unable to attach or mount volumes: unmounted volumes=[stresstest-pvc kube-api-access-f2nnp], unattached volumes=[stresstest-pvc kube-api-access-f2nnp]: timed out waiting for the condition
What did you expect to happen?
The PVC should get mounted on the pods without any FailedMount events.
How can we reproduce it (as minimally and precisely as possible)?
The issue can be easily reproduced on a 10-node cluster with a Kubernetes Deployment of 800 pod replicas all mounting the same PVC (a sketch follows below).
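A minimal sketch of such a reproduction, assuming your cluster has a CSI driver and StorageClass that can serve a single PVC to pods on all 10 nodes; the image, storage size and access mode are illustrative placeholders, and only the 800 replicas sharing one claim are the point:

# Minimal reproduction sketch -- names, size and access mode are placeholders.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: stresstest-pvc
spec:
  accessModes: ["ReadWriteMany"]   # pick a mode your CSI driver supports across nodes
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 800
  selector:
    matchLabels:
      app: pvc-stress
  template:
    metadata:
      labels:
        app: pvc-stress
    spec:
      containers:
      - name: nginx
        image: nginx
        volumeMounts:
        - name: stresstest-pvc
          mountPath: /data
      volumes:
      - name: stresstest-pvc
        persistentVolumeClaim:
          claimName: stresstest-pvc
EOF
# Watch for the FailedMount warnings as the pods come up:
kubectl get events --field-selector reason=FailedMount --watch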
Anything else we need to know?
This issue is related to an old bug, #84169, which was closed on the basis that the problem was fixed in Kubernetes 1.17+, but it seems the issue still persists.
Kubernetes version
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.1", GitCommit:"86ec240af8cbd1b60bcc4c03c20da9b98005b92e", GitTreeState:"clean", BuildDate:"2021-12-16T11:41:01Z", GoVersion:"go1.17.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.8", GitCommit:"7061dbbf75f9f82e8ab21f9be7e8ffcaae8e0d44", GitTreeState:"clean", BuildDate:"2022-03-16T14:04:34Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
Cloud provider
OS version
# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here
Ubuntu 20.04
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, …) and versions (if applicable)
About this issue
- Original URL
- State: open
- Created 2 years ago
- Reactions: 18
- Comments: 45 (3 by maintainers)
We performed the fix from this blog post, which worked for us: https://blog.devgenius.io/when-k8s-pods-are-stuck-mounting-large-volumes-2915e6656cb8
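For context, and as my own assumption rather than a summary of the post: the fix described there appears to be the fsGroupChangePolicy setting (Kubernetes 1.20+), which stops the kubelet from recursively changing ownership of the whole volume on every mount. A pod-spec fragment would look roughly like this:

# Assumption: the linked post's fix is the fsGroupChangePolicy knob; values are illustrative.
spec:
  securityContext:
    fsGroup: 1000                          # the group your application expects on the volume
    fsGroupChangePolicy: "OnRootMismatch"  # skip the recursive chown when the root already matches

As later comments in this thread note, this only helps when the fsGroup ownership change is what is slow, which typically applies to non-root containers.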
We started to encounter this after upgrading to EKS 1.23, using a gp2 StorageClass to host Victoria Metrics. We tried @timvandruenen's fix without success, and also tried upgrading the add-ons to the latest version, with the same result.
Resolved: in our case we lacked an IAM policy: arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy.

I found the same error while upscaling nodes for Airflow processing 😢
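Regarding the missing AmazonEBSCSIDriverPolicy mentioned above: on EKS it generally needs to be attached to the node group's instance role, or to the IAM role used by the EBS CSI driver's service account if you use IRSA. The role name below is a placeholder, not something from this thread:

# Placeholder role name -- substitute your node instance role or the EBS CSI
# driver's service-account role.
aws iam attach-role-policy \
  --role-name <node-or-ebs-csi-driver-role> \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy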
@Jeaniowang Why post "report received" so many times? This has flooded my mailing list.
Report received.
Yes, we are running k8s for Airflow too. The problem really slows down Airflow task init 😦
I think I have the same issue here in my AKS cluster running Kubernetes v1.21.7 (I know it's already EOL; I still have to schedule an update).
Running
kubectl get events -n my-namespace --sort-by='.metadata.creationTimestamp'
I get this output:
The first message makes me think I've made a mistake somewhere and I'm still figuring it out, but the second is identical to the one stated by the OP. I'll be following this thread and will post updates on whether I can solve the issue, or whether it's related to this at all 👍
EDIT: it's happening with far fewer pods than the OP reports; I'm pretty sure mine is an unrelated problem at this point 🤔
EDIT2: Solved! Secret was created in the default namespace instead of the one I needed it in.
Adding another data point here. From the article, it seems like the solution is applicable to containers running as non-root users. The issue exists even if the container is run as a root user.