kubernetes: Wrong AWS volume can be mounted

Background: K8s often has problems attaching/detaching AWS volumes: if I do a kubectl replace on a deployment, I frequently see K8s fail to detach the volume correctly, and the new deployment gets stuck in a crash loop forever because it times out waiting for the new volume. It looks like a race condition. The steps I always take when this happens are to delete the deployment, manually force-detach the volume via the EC2 API, and manually umount the mounts; otherwise they never seem to get cleaned up.
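For reference, that manual cleanup looks roughly like this (a sketch only: the deployment name, volume ID, pod UID, and mount paths below are placeholders, and force-detach should be a last resort):

$ kubectl delete deployment my-deployment
$ aws ec2 detach-volume --volume-id vol-0123456789abcdef0 --force   # only once K8s has clearly given up releasing it
$ mount | grep vol-0123456789abcdef0                                # find the leftover mounts on the node
$ sudo umount /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~aws-ebs/data
$ sudo umount /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/vol-0123456789abcdef0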

So today I experienced this again, and I did the following:

  1. Deleted the deployment.
  2. Manually detached the volume via the EC2 API, since it was not being released.
  3. Manually umounted a volume, but the wrong one: I unmounted a different pod's volume because I misread the mount output. Oops.
  4. Created the deployment again.
  5. Noticed my mistake.

What happened was that the new pod got the wrong device mounted:

$ mount | grep xvdb
/dev/xvdba on /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/vol-0e3bf8343563ee5aa type ext4 (rw,relatime,data=ordered)
/dev/xvdba on /var/lib/kubelet/pods/a8f54bf6-4ebd-11e6-89fc-065b459e106f/volumes/kubernetes.io~aws-ebs/data type ext4 (rw,relatime,data=ordered)

The mount path shows the right volume ID, but the wrong device has been mounted: according to EC2, the (right) volume is attached at /dev/xvdbb.

$ aws ec2 describe-volumes --volume-ids vol-0e3bf8343563ee5aa
{
    "Volumes": [
        {
            "AvailabilityZone": "us-east-1e", 
            "Attachments": [
                {
                    "AttachTime": "2016-07-20T21:05:17.000Z", 
                    "InstanceId": "i-06d03923e9e5a6329", 
                    "VolumeId": "vol-0e3bf8343563ee5aa", 
                    "State": "attached", 
                    "DeleteOnTermination": false, 
                    "Device": "/dev/xvdbb"
                }
            ], 
            "Tags": [],
            "Encrypted": false, 
            "VolumeType": "gp2", 
            "VolumeId": "vol-0e3bf8343563ee5aa", 
            "State": "in-use", 
            "Iops": 100, 
            "SnapshotId": "", 
            "CreateTime": "2016-07-19T00:25:00.168Z", 
            "Size": 30
        }
    ]
}

/dev/xvdba is the device of the volume I accidentally unmounted:

            // ...
            "Attachments": [
                {
                    "AttachTime": "2016-07-19T17:56:27.000Z", 
                    "InstanceId": "i-06d03923e9e5a6329", 
                    "VolumeId": "vol-0e0a13f1b3a4a6dbc", 
                    "State": "attached", 
                    "DeleteOnTermination": false, 
                    "Device": "/dev/xvdba"
                }
            ], 

So it’s mounting the wrong volume into the pod.

Of course I was being naughty, but it seems to me that this should never happen, since EC2 already reports the volume as attached at /dev/xvdbb?

Here’s the entire kube-controller-manager (KCM) output from this fiasco.

About this issue

  • Original URL
  • State: closed
  • Created 8 years ago
  • Comments: 89 (60 by maintainers)

Most upvoted comments

Are there plans to backport this into 1.3? This is affecting many of us that are currently running 1.3.

@fabioy @timstclair Bringing this to your attention as you are both the release czars for 1.3. This is worth including in a 1.3 patch release, possibly even cutting a 1.3.8 for it.

With @justinsb's help, I was able to reproduce the error and also have a fix out in PR #33796.

To summarize, there are two ways to reproduce this error:

  1. cordon nodes; delete pods; uncordon node; (re)create pods (sketched below).
  2. terminate nodes; pods evicted (deleted); (re)create pods.

Although both approaches can produce the same error, they are caused by very different issues. Last time we fixed issue 1 (PR #32242), but not issue 2.
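As a rough sketch (the node, pod, and manifest names below are placeholders), approach 1 corresponds to something like:

$ kubectl cordon node-1                 # mark the node unschedulable
$ kubectl delete pod pod-with-ebs       # pod goes away while its EBS volume is still attached
$ kubectl uncordon node-1               # make the node schedulable again
$ kubectl create -f deployment.yaml     # (re)create the pods that use the volume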

I am also working on a proposal in PR #33760 to make the system more robust to this issue.

Sorry for any inconvenience caused by this, and please let me know if you have any questions.

I’d appreciate it if this stayed open until the backport is complete. This issue is really hurting us on 1.3.5.

On Wed, Sep 21, 2016 at 5:22 AM, Jing Xu notifications@github.com wrote:

@pwittrock I am wondering whether we can leave it open until the fix is backported and users confirm the fix works fine?


For others running into this problem: we’ve disabled the new controller attach/detach logic for now by passing the flag --enable-controller-attach-detach=false to kubelet. Volumes are now mounted correctly.
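As a sketch of how to apply that flag (how kubelet is launched varies by install; the systemd unit name below is an assumption, so adapt it to your nodes):

$ ps aux | grep kubelet                  # find the current kubelet command line on the node
# add --enable-controller-attach-detach=false to that command line in the unit/init script, then:
$ sudo systemctl restart kubelet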