kubernetes: Wrong AWS volume can be mounted

Background: K8s often has problems attaching/detaching AWS volumes: if I do a kubectl replace on a deployment, I frequently see K8s fail to detach the volume correctly, and the new deployment gets stuck in a crash loop forever because it times out waiting for the new volume. It looks like a race condition. The steps I always take when this happens are to delete the deployment, manually force-detach the volume via the EC2 API, and manually umount the mounts; otherwise they never seem to get cleaned up.
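For reference, that manual cleanup looks roughly like this (a sketch only: the deployment name, volume ID, pod UID, and mount paths below are placeholders, and force-detach should be a last resort):

$ kubectl delete deployment my-deployment
$ aws ec2 detach-volume --volume-id vol-0123456789abcdef0 --force   # only once K8s has clearly given up releasing it
$ mount | grep vol-0123456789abcdef0                                # find the leftover mounts on the node
$ sudo umount /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~aws-ebs/data
$ sudo umount /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/vol-0123456789abcdef0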

So today I experienced this again, and I did the following:

  1. Deleted the deployment.
  2. Manually detached the volume via the EC2 API, since it was not being released.
  3. Manually umounted a volume, but the wrong one: I unmounted a different pod's volume because I misread the mount output. Oops.
  4. Created the deployment again.
  5. Noticed my mistake.

What happened was that the new pod got the wrong device mounted:

$ mount | grep xvdb
/dev/xvdba on /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/vol-0e3bf8343563ee5aa type ext4 (rw,relatime,data=ordered)
/dev/xvdba on /var/lib/kubelet/pods/a8f54bf6-4ebd-11e6-89fc-065b459e106f/volumes/kubernetes.io~aws-ebs/data type ext4 (rw,relatime,data=ordered)

The mount path shows the right volume ID, but the wrong device has been mounted: according to EC2, the (right) volume is attached at /dev/xvdbb.

$ aws ec2 describe-volumes --volume-ids vol-0e3bf8343563ee5aa
{
    "Volumes": [
        {
            "AvailabilityZone": "us-east-1e", 
            "Attachments": [
                {
                    "AttachTime": "2016-07-20T21:05:17.000Z", 
                    "InstanceId": "i-06d03923e9e5a6329", 
                    "VolumeId": "vol-0e3bf8343563ee5aa", 
                    "State": "attached", 
                    "DeleteOnTermination": false, 
                    "Device": "/dev/xvdbb"
                }
            ], 
            "Tags": [],
            "Encrypted": false, 
            "VolumeType": "gp2", 
            "VolumeId": "vol-0e3bf8343563ee5aa", 
            "State": "in-use", 
            "Iops": 100, 
            "SnapshotId": "", 
            "CreateTime": "2016-07-19T00:25:00.168Z", 
            "Size": 30
        }
    ]
}

/dev/xvdba is the device of the volume I accidentally unmounted:

            // ...
            "Attachments": [
                {
                    "AttachTime": "2016-07-19T17:56:27.000Z", 
                    "InstanceId": "i-06d03923e9e5a6329", 
                    "VolumeId": "vol-0e0a13f1b3a4a6dbc", 
                    "State": "attached", 
                    "DeleteOnTermination": false, 
                    "Device": "/dev/xvdba"
                }
            ], 

So it’s mounting the wrong volume into the pod.

Of course I was being naughty, but it seems to me that this should never happen, since EC2 already reports the volume as attached at /dev/xvdbb?

Here’s the entire kube-controller-manager (KCM) output from this fiasco.

About this issue

  • Original URL
  • State: closed
  • Created 8 years ago
  • Comments: 89 (60 by maintainers)

Most upvoted comments

Are there plans to backport this into 1.3? This is affecting many of us that are currently running 1.3.

@fabioy @timstclair Bringing this to your attention as you are both the release czars for 1.3. This is worth including in a 1.3 patch release, possibly even cutting a 1.3.8 for it.

With @justinsb's help, I was able to reproduce the error and also have a fix out in PR #33796.

To summarize, there are two ways to reproduce this error:

  1. cordon nodes; delete pods; uncordon node; (re)create pods (sketched below).
  2. terminate nodes; pods evicted (deleted); (re)create pods.

Although both approaches can produce the same error, they are caused by very different issues. Last time we fixed issue 1 (PR #32242), but not issue 2.
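As a rough sketch (the node, pod, and manifest names below are placeholders), approach 1 corresponds to something like:

$ kubectl cordon node-1                 # mark the node unschedulable
$ kubectl delete pod pod-with-ebs       # pod goes away while its EBS volume is still attached
$ kubectl uncordon node-1               # make the node schedulable again
$ kubectl create -f deployment.yaml     # (re)create the pods that use the volume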

I am also working on a proposal in PR #33760 to make the system more robust to this issue.

Sorry for any inconvenience caused by this, and please let me know if you have any questions.

I’d appreciate it if this stayed open until the backport is complete. This issue is really hurting us on 1.3.5.

On Wed, Sep 21, 2016 at 5:22 AM, Jing Xu notifications@github.com wrote:

@pwittrock I am wondering whether we can leave it open until the fix is backported and users confirm the fix works fine?


For others running into this problem: we’ve disabled the new controller attach/detach logic for now by passing the flag --enable-controller-attach-detach=false to kubelet. Volumes are now mounted correctly.
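As a sketch of how to apply that flag (how kubelet is launched varies by install; the systemd unit name below is an assumption, so adapt it to your nodes):

$ ps aux | grep kubelet                  # find the current kubelet command line on the node
# add --enable-controller-attach-detach=false to that command line in the unit/init script, then:
$ sudo systemctl restart kubelet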