kubernetes: Storage: devicePath is empty while WaitForAttach in StatefulSets

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:

I created a StatefulSet with 4 replicas and it worked correctly. After some time I needed to restart one of the pods, and when it came back it was stuck on this error:

Events:
  Type     Reason       Age               From                                   Message
  ----     ------       ----              ----                                   -------
  Warning  FailedMount  9m (x40 over 1h)  kubelet, ip-180-12-10-58.ec2.internal  Unable to mount volumes for pod "storage-0_twodotoh(51577cea-9ccd-11e8-b024-1232e142048e)": timeout expired waiting for volumes to attach or mount for pod "myorg"/"mypod-0". list of unmounted volumes=[data]. list of unattached volumes=[data mypod-config default-token-cjfqp]
  Warning  FailedMount  3m (x55 over 1h)  kubelet, ip-180-12-10-58.ec2.internal  MountVolume.WaitForAttach failed for volume "pvc-a12b7de1-30ed-11ee-a324-2232d546216c" : WaitForAttach failed for AWS Volume "aws://us-east-1b/vol-045d3gx6hg53gz341": devicePath is empty.

When that happens I can get it working by deleting the pod again. The error can reoccur after that, but usually not three times in a row.

The main issue is that if you don’t act on it manually, it will just keep being reconciled by reconciler.go and the pod will never come back.

The issue seems to be in actual_state_of_world.go, in the MarkVolumeAsAttached path: at some point the devicePath string is not written into the object.

actual_state_of_world.go:616 ->
  reconciler.go:238 ->
    operation_executor.go:712 ->
       operation_generator.go:437 -> error on line 496

What you expected to happen:

The pod comes back with no error.

How to reproduce it (as minimally and precisely as possible): The problem seems to be difficult to reproduce. Upgrading to 1.11 does not solve it, and it does not happen on Deployments (I haven’t been able to reproduce it there). I can sometimes trigger it with the following steps (a command sketch follows the manifest below):

  • Start a Kubernetes cluster on AWS that is configured to use EBS volumes
  • Create a StatefulSet with a dynamically provisioned volume (see yaml file below)
  • Delete one of the pods of your choice

At this point one of two things can happen:

  • It just works OR
  • The pod is not able to come back again and gives the error I reported above.

I haven’t found a reliable way to make one or the other happen on demand; it seems to be very random. But I’m sure that it only happens when the pod is recreated on the same node.

Statefulset to reproduce

apiVersion: v1
kind: Namespace
metadata:
  name: repro-devicepath
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: myrepro
  namespace: repro-devicepath
  labels:
    component: myrepro
spec:
  serviceName: myrepro
  selector:
    matchLabels:
      component: myrepro
  replicas: 4
  template:
    metadata:
      name: myrepro
      labels:
        component: myrepro
    spec:
      containers:
        - name: myrepro
          image: docker.io/fntlnz/caturday:latest
          volumeMounts:
            - name: data
              mountPath: /data
  volumeClaimTemplates:
  - metadata:
      namespace: repro-devicepath
      name: data
    spec:
      storageClassName: ebs-1
      accessModes:
        - "ReadWriteOnce"
      resources:
        requests:
          storage: 1Gi
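
To exercise the repro steps above against this manifest, something like the following can be used. The file name repro.yaml is a placeholder for wherever you saved the manifest, and the ebs-1 StorageClass is assumed to already exist in your cluster; the pod and namespace names come from the manifest itself:

  # Apply the manifest and wait for all four replicas to come up.
  kubectl apply -f repro.yaml
  kubectl -n repro-devicepath rollout status statefulset/myrepro

  # Note which node a pod landed on, then delete it; the StatefulSet controller
  # recreates it with the same name and the same PVC.
  kubectl -n repro-devicepath get pod myrepro-0 -o wide
  kubectl -n repro-devicepath delete pod myrepro-0

  # Watch the replacement pod's events for FailedMount / "devicePath is empty".
  kubectl -n repro-devicepath describe pod myrepro-0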

Anything else we need to know?:

When this happens, the devicePath in the node’s status is reported as empty; one can verify that with:

 kubectl get node -o json | jq ".items[].status.volumesAttached" 
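
To show only the entries that are in the broken state (still listed as attached but with an empty devicePath):

 kubectl get node -o json | jq '.items[].status.volumesAttached[]? | select(.devicePath == "")'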

I found some other users on Slack who have this problem. @wirewc sent me this (note the empty devicePath in his system):

 volumesAttached:
 - devicePath: ""
   name: kubernetes.io/iscsi/10.48.147.131:iqn.2016-12.org.gluster-block:b5a96cbd-926b-421f-922b-4df13ca150e0:0
 volumesInUse:
 - kubernetes.io/iscsi/10.48.147.131:iqn.2016-12.org.gluster-block:b5a96cbd-926b-421f-922b-4df13ca150e0:0
 - kubernetes.io/iscsi/pvc-bb8b444f-9a68-11e8-b661-0050569c4ace:pvc-bb8b444f-9a68-11e8-b661-0050569c4ace:0

Also @ntfrnzn detailed a similar issue here: https://github.com/packethost/csi-packet/issues/8

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:17:39Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:05:37Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

I’m hitting this on a production cluster on 1.10.3, and I get the same error on a testing cluster running 1.11.

  • Cloud provider or hardware configuration: AWS, deployed using kubeadm
  • OS (e.g. from /etc/os-release):
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1800.6.0
VERSION_ID=1800.6.0
BUILD_ID=2018-08-04-0323
PRETTY_NAME="Container Linux by CoreOS 1800.6.0 (Rhyolite)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"
  • Kernel (e.g. uname -a): Linux ip-180-12-0-57 4.14.59-coreos-r2 #1 SMP Sat Aug 4 02:49:25 UTC 2018 x86_64 Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz GenuineIntel GNU/Linux
  • Install tools:
  • Others:

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 11
  • Comments: 40 (16 by maintainers)

Most upvoted comments

I enabled kubelet --v=10 in my test cluster. In our scenario (with intermittent long write operations on the disk) the bug triggers roughly 10% of the time. By analyzing the logs I found the cause of the problem, and I managed to reproduce it by adding a new unit test that proves the bug exists.

The cause is that on AWS EBS, attacher.UnmountDevice is sometimes slow (10s or more). UnmountDevice is an asynchronous function, and while it is still running the desiredStateOfWorldPopulator updates the desired_state_of_world status; the cause of the mistake is that rc.desiredStateOfWorld.VolumeExists(ErrorVolume) is still true.

Because of this, part of reconciler.reconcile can’t run as expected: MarkVolumeAsDetached is never executed, so the volumeToMount never gets a chance to run VerifyControllerAttachedVolume and refresh the devicePath to the right value, and reconcile falls into an error loop.

cc @gnufied

I’d be glad to contribute code. I think the reconciler lifecycle is complex; I’ve spent a long time searching for PRs/issues about it, but I didn’t find any, so I don’t know how to fix the bug correctly. Maybe we should ensure that MarkVolumeAsDetached is executed after UnmountDevice has completed.

P.S. English is not my mother tongue; please excuse any errors on my part. If anything is unclear, please see https://github.com/WPH95/kubernetes/pull/1 or mention me 😃
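
For anyone who wants to follow the same log analysis: assuming the kubelet runs as a systemd unit (as on the Container Linux nodes in this report) and has been restarted with a high --v level, something like this shows the relevant attach/mount activity as it happens:

  # Follow kubelet logs and keep only the lines related to this bug.
  journalctl -u kubelet -f | grep -iE "UnmountDevice|WaitForAttach|devicePath"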

Hi, we’re also facing the same issue. We can confirm that the mount is left over from the previous pod. The difference in our scenario is that we were trying to upgrade from 1.11.1 to 1.11.2, so we initially thought it had something to do with the versions. But here is our hypothesis:

When the pod gets deleted the first time, it leaves the mount behind. When we restart the kubelet, it sees an extra mount and unmounts it. After the unmount, MarkDeviceAsUnmounted gets called, and it marks devicePath as empty.

When the scheduler puts the container back on the kubelet, the kubelet tries to mount again. But the attachedVolumes map in actualStateOfWorld still has the attachedVolume object with an empty devicePath, which causes WaitForAttach to fail.

It makes me wonder why MarkDeviceAsUnmounted has to set devicePath to empty.
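
A way to check whether a mount really was left behind, directly on the node; the paths below assume the in-tree AWS EBS plugin and the default /var/lib/kubelet root directory:

  # Global device mount left over from the previous pod (in-tree EBS plugin).
  mount | grep kubernetes.io/aws-ebs
  # Per-pod bind mounts that should have been cleaned up when the pod was deleted.
  mount | grep /var/lib/kubelet/pods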

/cc @benchapman

@gnufied - We’re getting this same issue with Argo (https://github.com/argoproj/argo), like @dguendisch above. We’re heavily invested in Kops, which is only now supporting 1.11 in beta, so we’re really looking forward to a fix being backported. (We upgraded to 1.10.12, which contains the #71074 patch, but it does not remove the error for us.)

We were hit by this issue. A dirty workaround is to restart the kubelet on the affected node.
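
For reference, on nodes where the kubelet runs as a systemd unit (as on the Container Linux machines in this report), that amounts to:

  # On the affected node:
  sudo systemctl restart kubelet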

FYI, this issue is fixed by #71074:

k8s version    fixed version
v1.10          no fix
v1.11          1.11.7
v1.12          1.12.5
v1.13          no such issue

Shall we close this issue now?

We ran into this in EKS, also like @mmalex

A workaround for us was to scale the deployment to 0 replicas, wait for EBS volume to unmount in EC2 console, then scale deployment back to original replicas and it mounted successfully.
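
Roughly, with placeholder names (the deployment name and the EBS volume ID are whatever applies in your cluster):

  # Scale down and wait for the EBS volume to detach.
  kubectl scale deployment <name> --replicas=0
  aws ec2 wait volume-available --volume-ids <vol-id>
  # Scale back up to the original replica count.
  kubectl scale deployment <name> --replicas=<original>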

Seeing this issue in AWS EKS, for which the highest available version is currently 1.11.5:

https://docs.aws.amazon.com/eks/latest/userguide/platform-versions.html

A backport would be greatly appreciated.

I can confirm that this particular issue is solved in 1.12.5 – I have been running a PVC stress tester which basically waits for a StatefulSet to become stable and then rapidly evicts pods for rescheduling. I had never been able to run it successfully for more than 2 hours before running into this issue, but with 1.12.5 I have been running it successfully for 36 hours.

I am seeing the problem on 1.11.5 | eks.1 when the pod is recreated on the same node as the one it replaced. It does not happen all the time. The current fix for me is to go into AWS and detach the volume manually, then delete the pod; it then starts up properly again.
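
The CLI equivalent of that manual fix, with placeholder IDs (be careful: detaching a volume that is still actively in use can corrupt data):

  # Detach the stuck volume, then delete the pod so it gets recreated.
  aws ec2 detach-volume --volume-id <vol-id>
  kubectl delete pod <pod-name>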

Hey guys, I think @jsafrane fixed this issue via https://github.com/kubernetes/kubernetes/pull/71074, which makes sure that if UnmountVolume is called we do not remove devicePath from the cache. If folks who are facing this issue can try 1.13, which includes the aforementioned patch, we will consider backporting that patch to 1.12 and 1.11.

Looks like the kube e2e is also running into this as part of pull-kubernetes-e2e-kops-aws, in the [sig-storage] Dynamic Provisioning "DynamicProvisioner should provision storage with different parameters" tests that target EBS provisioning.

https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/67530/pull-kubernetes-e2e-kops-aws/103140/

Aug 24 22:25:51.337: INFO: At 2018-08-24 22:10:49 +0000 UTC - event for pvc-volume-tester-stzb2: {kubelet ip-172-20-55-168.us-west-2.compute.internal} Created: Created container
Aug 24 22:25:51.337: INFO: At 2018-08-24 22:10:50 +0000 UTC - event for pvc-volume-tester-27r57: {default-scheduler } Scheduled: Successfully assigned e2e-tests-volume-provisioning-2qbc4/pvc-volume-tester-27r57 to ip-172-20-55-168.us-west-2.compute.internal
Aug 24 22:25:51.337: INFO: At 2018-08-24 22:10:50 +0000 UTC - event for pvc-volume-tester-27r57: {kubelet ip-172-20-55-168.us-west-2.compute.internal} FailedMount: MountVolume.WaitForAttach failed for volume "pvc-7c7f5f34-a7ea-11e8-99eb-02c590dc6280" : WaitForAttach failed for AWS Volume "aws://us-west-2b/vol-0194c1ef086c2bb42": devicePath is empty.
Aug 24 22:25:51.337: INFO: At 2018-08-24 22:10:51 +0000 UTC - event for pvc-volume-tester-stzb2: {kubelet ip-172-20-55-168.us-west-2.compute.internal} SandboxChanged: Pod sandbox changed, it will be killed and re-created.
Aug 24 22:25:51.337: INFO: At 2018-08-24 22:12:53 +0000 UTC - event for pvc-volume-tester-27r57: {kubelet ip-172-20-55-168.us-west-2.compute.internal} FailedMount: Unable to mount volumes for pod "pvc-volume-tester-27r57_e2e-tests-volume-provisioning-2qbc4(8f49de8c-a7ea-11e8-99eb-02c590dc6280)": timeout expired waiting for volumes to attach or mount for pod "e2e-tests-volume-provisioning-2qbc4"/"pvc-volume-tester-27r57". list of unmounted volumes=[my-volume]. list of unattached volumes=[my-volume default-token-zxnz8]

We ran into this in EKS, also like @mmalex

A workaround for us was to scale the deployment to 0 replicas, wait for EBS volume to unmount in EC2 console, then scale deployment back to original replicas and it mounted successfully.

Small world 😄

For us, recreating the job with a pause to allow the volume to detach made it work.

@drewhemm @mmalex the fix in https://github.com/kubernetes/kubernetes/pull/71074 has been backported to the 1.11.x release branch but didn’t reach 1.11.5; the first release in the 1.11 series that has it is 1.11.7. I see that eks.2 is on 1.11.8, so the only way to get this fixed for the two of you is to upgrade to that.

see: https://docs.aws.amazon.com/eks/latest/userguide/platform-versions.html

@fntlnz, thanks for the input! I have this issue on K8s v1.10.7. An upgrade should be coming in the next few weeks; we’ll see if it appears again afterwards.