kubernetes: Second Pod Using Same PVC Fails WaitForAttach Flakily

The test [sig-storage] CSI Volumes CSI plugin test using CSI driver: hostPath should provision storage flakes very occasionally. Below are our findings on the circumstances of the flake.

During testDynamicProvisioning two pods are created in quick succession. The test flakes when the second Pod fails to come up.

The second pod fails to come up with the error: MountVolume.WaitForAttach failed for volume "pvc-bc2294b873ef11e8" : resource name may not be empty

Upon exploring the logs, we discovered that there was only one Attach call made at the beginning, and one Detach call when the test timed out (see timestamps):

I0613 19:00:41.888473       1 reconciler.go:291] attacherDetacher.AttachVolume started for volume "pvc-0c896e426f3c11e8" (UniqueName: "kubernetes.io/csi/csi-hostpath^0da20430-6f3c-11e8-a80a-0a580a40016c") from node "e2e-40645-674b9-minion-group-gz10" 
I0613 19:00:42.533343       1 operation_generator.go:335] AttachVolume.Attach succeeded for volume "pvc-0c896e426f3c11e8" (UniqueName: "kubernetes.io/csi/csi-hostpath^0da20430-6f3c-11e8-a80a-0a580a40016c") from node "e2e-40645-674b9-minion-group-gz10" 
I0613 19:15:52.528902       1 reconciler.go:233] attacherDetacher.DetachVolume started for volume "pvc-0c896e426f3c11e8" (UniqueName: "kubernetes.io/csi/csi-hostpath^0da20430-6f3c-11e8-a80a-0a580a40016c") on node "e2e-40645-674b9-minion-group-gz10" 
I0613 19:15:52.615262       1 operation_generator.go:1225] Verified volume is safe to detach for volume "pvc-0c896e426f3c11e8" (UniqueName: "kubernetes.io/csi/csi-hostpath^0da20430-6f3c-11e8-a80a-0a580a40016c") on node "e2e-40645-674b9-minion-group-gz10" 
I0613 19:15:53.367487       1 operation_generator.go:419] DetachVolume.Detach succeeded for volume "pvc-0c896e426f3c11e8" (UniqueName: "kubernetes.io/csi/csi-hostpath^0da20430-6f3c-11e8-a80a-0a580a40016c") on node "e2e-40645-674b9-minion-group-gz10" 

Based on the error message returned, we tracked the resource name may not be empty error down to the likely culprit in func waitForVolumeAttachmentInternal:

attach, err := c.k8s.StorageV1beta1().VolumeAttachments().Get(attachID, meta.GetOptions{})
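For reference, that error string appears to come from client-go itself rejecting a Get call whose resource name is empty, before any request reaches the API server. Below is a minimal, hypothetical sketch (not Kubernetes code) that reproduces the message with an empty attachID; it assumes a kubeconfig at the default location and uses the pre-context Get signature matching the snippet above.

```go
// Minimal sketch: calling Get with an empty resource name is rejected
// client-side with "resource name may not be empty".
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a clientset from the default kubeconfig location (assumption for this demo).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	attachID := "" // simulate the suspected empty attachID
	_, err = clientset.StorageV1beta1().VolumeAttachments().Get(attachID, metav1.GetOptions{})
	// Expected output (error constructed by client-go, no API call made):
	// resource name may not be empty
	fmt.Println(err)
}
```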

This is likely caused by attachID (a string) being empty [unconfirmed]. That attachID is what the kubelet receives via the DevicePath field on the Node object; the DevicePath field is normally added on Attach and removed on Detach of the volume.
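To illustrate where that value comes from, here is a hedged, simplified sketch of the lookup the kubelet effectively performs: the attach/detach controller records attached volumes in node.status.volumesAttached, and the kubelet picks the DevicePath off the matching entry. The helper name lookupDevicePath and the example values are ours, not Kubernetes code; if the entry is missing or its DevicePath is empty, the CSI attacher ends up with an empty attachID.

```go
// Hedged sketch of the Node-status lookup that ultimately feeds WaitForAttach.
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// lookupDevicePath returns the DevicePath recorded on the Node for the given
// unique volume name, or "" if the entry is missing -- which would leave the
// CSI attacher with an empty attachID, matching the error seen in this flake.
func lookupDevicePath(node *v1.Node, volumeName v1.UniqueVolumeName) string {
	for _, attached := range node.Status.VolumesAttached {
		if attached.Name == volumeName {
			return attached.DevicePath
		}
	}
	return ""
}

func main() {
	// Example values taken from the logs above; the DevicePath content is illustrative.
	node := &v1.Node{
		Status: v1.NodeStatus{
			VolumesAttached: []v1.AttachedVolume{
				{
					Name:       "kubernetes.io/csi/csi-hostpath^0da20430-6f3c-11e8-a80a-0a580a40016c",
					DevicePath: "csi-<attachment-name>",
				},
			},
		},
	}
	fmt.Println(lookupDevicePath(node, "kubernetes.io/csi/csi-hostpath^0da20430-6f3c-11e8-a80a-0a580a40016c"))
}
```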

The DevicePath is clearly present on the Node while the first pod is being brought up, but it is somehow missing, or cannot be retrieved, by the time the second pod starts (and it remains unavailable until the test's 15m timeout is exhausted), even though the logs show no Detach operation on the volume in the meantime. We are currently unable to reproduce this flake on demand, so all of this analysis is based on logs from the two flakes we saw occur naturally across hundreds of runs of the test.

This flake was originally seen in #65064 and the test logs can be found here: https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/65064/pull-kubernetes-e2e-gce/40645/

Environment: GCE

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.6", GitCommit:"6260bb08c46c31eea6cb538b34a9ceb3e406689c", GitTreeState:"clean", BuildDate:"2017-12-21T06:34:11Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12+", GitVersion:"v1.12.0-alpha.0.975+98194b8630958e", GitCommit:"98194b8630958ee48db662f42404dc0bc6a2ed80", GitTreeState:"clean", BuildDate:"2018-06-19T18:16:46Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: GCP
  • OS (e.g. from /etc/os-release):
BUILD_ID=10323.64.0
NAME="Container-Optimized OS"
KERNEL_COMMIT_ID=94bdaa2af49562e7aba0168cae050911d105f5cb
GOOGLE_CRASH_ID=Lakitu
VERSION_ID=65
BUG_REPORT_URL=https://crbug.com/new
PRETTY_NAME="Container-Optimized OS from Google"
VERSION=65
GOOGLE_METRICS_PRODUCT_ID=26
HOME_URL="https://cloud.google.com/compute/docs/containers/vm-image/"
ID=cos
  • Kernel (e.g. uname -a): Linux e2e-test-dyzz-master 4.4.111+ #1 SMP Wed Mar 28 13:44:38 PDT 2018 x86_64 Intel(R) Xeon(R) CPU @ 2.30GHz GenuineIntel GNU/Linux

/kind flake
/kind bug
/sig storage
/cc @msau42 @jingxu97 @saad-ali @verult

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 45 (38 by maintainers)

Most upvoted comments

I will have a look. Right now the log output of the sidecar containers and the CSI driver is not captured, which makes debugging a bit hard. In PR #69868 I am adding code that includes those messages in the test log. Let me finish that PR; once it is merged, debugging the next test flake might provide some better insight into what went wrong.

@saad-ali @msau42 What does it mean from the user-impact perspective for CSI to not be backward compatible with 1.12? Can you please point me to any discussions of this behavior for 1.13?

Kubernetes v1.13 speaks CSI v1.0 and no longer understands CSI v0.x. Any CSI drivers that speak CSI v0.x and were deployed on previous versions of Kubernetes must be removed, updated to be CSI 1.0 compatible, and redeployed on Kubernetes v1.13.

If only the master is upgraded to 1.13 (and nodes are on 1.12 or earlier), older drivers will continue to work. Once nodes are upgraded to 1.13, older drivers will stop working.

I will capture this in detail in the release notes “Action Required” section.

@pohly Instead of https://github.com/kubernetes/kubernetes/pull/70746, which may have negative side effects for non-CSI volumes, let's do this:

This will remove any CSI dependency on devicePath and will not affect other volume plugins.
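For context on what removing the devicePath dependency could look like: the VolumeAttachment object name can be recomputed deterministically from the volume handle, the CSI driver name, and the node name, so WaitForAttach never needs the devicePath at all. The sketch below is our illustration of that idea; the exact hashing scheme is an assumption modeled on how CSI attachment names are typically derived, not a quote of the eventual fix.

```go
// Hedged sketch: derive a stable VolumeAttachment name without consulting devicePath.
package main

import (
	"crypto/sha256"
	"fmt"
)

// attachmentName derives a deterministic VolumeAttachment object name from the
// volume handle, CSI driver name, and node name, so the attachID can always be
// recomputed even if the DevicePath on the Node object is missing.
func attachmentName(volumeHandle, csiDriverName, nodeName string) string {
	sum := sha256.Sum256([]byte(fmt.Sprintf("%s%s%s", volumeHandle, csiDriverName, nodeName)))
	return fmt.Sprintf("csi-%x", sum)
}

func main() {
	// Example values taken from the logs above.
	fmt.Println(attachmentName(
		"0da20430-6f3c-11e8-a80a-0a580a40016c", // CSI volume handle
		"csi-hostpath",                         // driver name
		"e2e-40645-674b9-minion-group-gz10",    // node name
	))
}
```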