rexray: rexray/ebs on AWS and docker swarm, problem with device discovery

Summary

When creating a container with an EBS volume, Docker returns the error “problem with device discovery”.

Bug Reports

Version

rexray/ebs:latest

Expected Behavior

  1. EBS volume is created in AWS.
  2. Volume attaches to the instance.
  3. Docker mounts the volume into the container.
  4. Container starts.

Actual Behavior

  1. EBS volume is created in AWS.
  2. Volume attaches to the instance.
  3. Docker fails with: docker: Error response from daemon: error while mounting volume '': VolumeDriver.Mount: docker-legacy: Mount: test: failed: problem with device discovery.

Steps To Reproduce

  1. Create AWS instances using the latest Amazon Linux AMI
  2. Install Docker
  3. Set up a swarm
  4. Install rexray/ebs on each instance, following the guide
  5. Create a test container using rexray/ebs

See below for the full console output. Interestingly, while troubleshooting I found that the volume also cannot be deleted until I manually detach it from the instance in AWS.

[ec2-user@xxxxxxx]$ sudo docker plugin install rexray/ebs REXRAY_PREEMPT=true EBS_REGION=ap-southeast-2 EBS_ACCESSKEY=<redacted> EBS_SECRETKEY=<redacted>
Plugin "rexray/ebs" is requesting the following privileges:
 - network: [host]
 - mount: [/dev]
 - allow-all-devices: [true]
 - capabilities: [CAP_SYS_ADMIN]
Do you grant the above permissions? [y/N] y
latest: Pulling from rexray/ebs
d726edfb6c8d: Download complete
Digest: sha256:b5b0680b6dd53a4c61bac21942738a69edfe3d6a6516acc13045d257889c9213
Status: Downloaded newer image for rexray/ebs:latest
Installed plugin rexray/ebs
[ec2-user@xxxxxxx]$ docker run -ti --volume-driver=rexray/ebs -v test:/test busybox
docker: Error response from daemon: error while mounting volume '': VolumeDriver.Mount: docker-legacy: Mount: test: failed: problem with device discovery.
[ec2-user@xxxxxxx]$ docker run -ti --volume-driver=rexray/ebs -v test:/test busybox
docker: Error response from daemon: error while mounting volume '/var/lib/docker/plugins/1b5875fe11fd6f46eebb647c4b7986039a7d2955f341f69561e0080ae9f2fa80/rootfs': VolumeDriver.Mount: docker-legacy: Mount: test: failed: no device name returned.
[ec2-user@xxxxxxx]$ docker volume inspect test
[
    {
        "CreatedAt": "0001-01-01T00:00:00Z",
        "Driver": "rexray/ebs:latest",
        "Labels": null,
        "Mountpoint": "",
        "Name": "test",
        "Options": null,
        "Scope": "global",
        "Status": {
            "availabilityZone": "ap-southeast-2c",
            "fields": null,
            "iops": 0,
            "name": "test",
            "server": "ebs",
            "service": "ebs",
            "size": 16,
            "type": "standard"
        }
    }
]
[ec2-user@xxxxxxx]$ docker volume rm test
Error response from daemon: remove test: volume is in use - [314b5a64875451191fb9dd52bca19d073a858a567d244a733e222f5c7763c208, c319647db5a63e139be2c0a1481054765a730aa025c8955d6d7194e28a3d2d88]
[ec2-user@xxxxxxx]$ docker container rm 314b5a64875451191fb9dd52bca19d073a858a567d244a733e222f5c7763c208 c319647db5a63e139be2c0a1481054765a730aa025c8955d6d7194e28a3d2d88
314b5a64875451191fb9dd52bca19d073a858a567d244a733e222f5c7763c208
c319647db5a63e139be2c0a1481054765a730aa025c8955d6d7194e28a3d2d88
[ec2-user@xxxxxxx]$ docker volume rm test
Error response from daemon: remove test: VolumeDriver.Remove: docker-legacy: Remove: test: failed: error deleting volume

(At this point I detached the volume in the EC2 console.)

[ec2-user@xxxxxxx]$ docker volume rm test
test
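
The manual detach step in the console output above can also be done from the command line. This is a minimal sketch, assuming the AWS CLI is installed and configured, and assuming that REX-Ray stores the Docker volume name ("test") in the EBS volume's Name tag, which this thread does not confirm. The function names are hypothetical.

```shell
# Sketch only: look up an EBS volume by its Name tag and detach it.
# Assumption: the volume's Name tag equals the Docker volume name.

# Print the ID(s) of volumes whose Name tag matches $1 in region $2.
find_volume_id() {
  aws ec2 describe-volumes \
    --region "$2" \
    --filters "Name=tag:Name,Values=$1" \
    --query 'Volumes[].VolumeId' \
    --output text
}

# Detach volume $1 in region $2, then wait until it is reported as available.
detach_volume() {
  aws ec2 detach-volume --region "$2" --volume-id "$1" &&
    aws ec2 wait volume-available --region "$2" --volume-ids "$1"
}

# Example (not run automatically):
#   vol_id=$(find_volume_id test ap-southeast-2)
#   detach_volume "$vol_id" ap-southeast-2
#   docker volume rm test
```

Nothing runs until the functions are called, so the lookup result can be checked before anything is detached.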

Configuration Files

Not applicable, since I’m using the Docker plugin. Here’s the output of docker plugin inspect:

[
    {
        "Config": {
            "Args": {
                "Description": "",
                "Name": "",
                "Settable": null,
                "Value": null
            },
            "Description": "REX-Ray for Amazon EBS",
            "DockerVersion": "18.05.0-ce",
            "Documentation": "https://github.com/thecodeteam/rexray/.docker/plugins/ebs",
            "Entrypoint": [
                "/rexray.sh",
                "rexray",
                "start",
                "-f",
                "--nopid"
            ],
            "Env": [
                {
                    "Description": "",
                    "Name": "DOCKER_LEGACY",
                    "Settable": [
                        "value"
                    ],
                    "Value": "true"
                },
                {
                    "Description": "",
                    "Name": "REXRAY_FSTYPE",
                    "Settable": [
                        "value"
                    ],
                    "Value": "ext4"
                },
                {
                    "Description": "",
                    "Name": "REXRAY_LOGLEVEL",
                    "Settable": [
                        "value"
                    ],
                    "Value": "warn"
                },
                {
                    "Description": "",
                    "Name": "REXRAY_PREEMPT",
                    "Settable": [
                        "value"
                    ],
                    "Value": "false"
                },
                {
                    "Description": "",
                    "Name": "HTTP_PROXY",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                },
                {
                    "Description": "",
                    "Name": "LIBSTORAGE_INTEGRATION_VOLUME_OPERATIONS_MOUNT_ROOTPATH",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                },
                {
                    "Description": "",
                    "Name": "LINUX_VOLUME_ROOTPATH",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                },
                {
                    "Description": "",
                    "Name": "LINUX_VOLUME_FILEMODE",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                },
                {
                    "Description": "",
                    "Name": "EBS_ACCESSKEY",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                },
                {
                    "Description": "",
                    "Name": "EBS_ENDPOINT",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                },
                {
                    "Description": "",
                    "Name": "EBS_KMSKEYID",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                },
                {
                    "Description": "",
                    "Name": "EBS_MAXRETRIES",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                },
                {
                    "Description": "",
                    "Name": "EBS_REGION",
                    "Settable": [
                        "value"
                    ],
                    "Value": "us-east-1"
                },
                {
                    "Description": "",
                    "Name": "EBS_SECRETKEY",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                },
                {
                    "Description": "",
                    "Name": "EBS_STATUSINITIALDELAY",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                },
                {
                    "Description": "",
                    "Name": "EBS_STATUSMAXATTEMPTS",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                },
                {
                    "Description": "",
                    "Name": "EBS_STATUSTIMEOUT",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                },
                {
                    "Description": "",
                    "Name": "EBS_USELARGEDEVICERANGE",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                },
                {
                    "Description": "",
                    "Name": "CSI_ENDPOINT",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                },
                {
                    "Description": "The name of the CSI plug-in used by the CSI module.",
                    "Name": "X_CSI_DRIVER",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                },
                {
                    "Description": "A flag that disables the CSI to libStorage bridge.",
                    "Name": "X_CSI_NATIVE",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                }
            ],
            "Interface": {
                "Socket": "rexray.sock",
                "Types": [
                    "docker.volumedriver/1.0"
                ]
            },
            "IpcHost": false,
            "Linux": {
                "AllowAllDevices": true,
                "Capabilities": [
                    "CAP_SYS_ADMIN"
                ],
                "Devices": null
            },
            "Mounts": [
                {
                    "Description": "",
                    "Destination": "/dev",
                    "Name": "",
                    "Options": [
                        "rbind"
                    ],
                    "Settable": null,
                    "Source": "/dev",
                    "Type": "bind"
                }
            ],
            "Network": {
                "Type": "host"
            },
            "PidHost": false,
            "PropagatedMount": "/var/lib/rexray",
            "User": {},
            "WorkDir": "",
            "rootfs": {
                "diff_ids": [
                    "sha256:3e6c5310eb0c1fcbe49312bae86e363bf7ef7be4dd9e0dcd2206200b265a0693"
                ],
                "type": "layers"
            }
        },
        "Enabled": true,
        "Id": "1b5875fe11fd6f46eebb647c4b7986039a7d2955f341f69561e0080ae9f2fa80",
        "Name": "rexray/ebs:latest",
        "PluginReference": "docker.io/rexray/ebs:latest",
        "Settings": {
            "Args": [],
            "Devices": [],
            "Env": [
                "DOCKER_LEGACY=true",
                "REXRAY_FSTYPE=ext4",
                "REXRAY_LOGLEVEL=warn",
                "REXRAY_PREEMPT=true",
                "HTTP_PROXY=",
                "LIBSTORAGE_INTEGRATION_VOLUME_OPERATIONS_MOUNT_ROOTPATH=",
                "LINUX_VOLUME_ROOTPATH=",
                "LINUX_VOLUME_FILEMODE=",
                "EBS_ACCESSKEY=ABC",
                "EBS_ENDPOINT=",
                "EBS_KMSKEYID=",
                "EBS_MAXRETRIES=",
                "EBS_REGION=ap-southeast-2",
                "EBS_SECRETKEY=123",
                "EBS_STATUSINITIALDELAY=",
                "EBS_STATUSMAXATTEMPTS=",
                "EBS_STATUSTIMEOUT=",
                "EBS_USELARGEDEVICERANGE=",
                "CSI_ENDPOINT=",
                "X_CSI_DRIVER=",
                "X_CSI_NATIVE="
            ],
            "Mounts": [
                {
                    "Description": "",
                    "Destination": "/dev",
                    "Name": "",
                    "Options": [
                        "rbind"
                    ],
                    "Settable": null,
                    "Source": "/dev",
                    "Type": "bind"
                }
            ]
        }
    }
]
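
In the output above, Config.Env lists the plugin's defaults (for example EBS_REGION=us-east-1 and REXRAY_PREEMPT=false), while Settings.Env holds the values actually in effect. A small sketch for printing only the effective values follows; the function names are hypothetical, and only `docker plugin inspect --format` is used.

```shell
# Print the effective environment of installed plugin $1, one KEY=VALUE per line.
plugin_env() {
  docker plugin inspect --format '{{range .Settings.Env}}{{println .}}{{end}}' "$1"
}

# Print the effective value of setting $2 for plugin $1.
plugin_setting() {
  plugin_env "$1" | sed -n "s/^$2=//p"
}

# Example (not run automatically):
#   plugin_setting rexray/ebs EBS_REGION
```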

Logs

I’d love to provide these, but I can’t for the life of me find any logs output by the Docker plugin. I’m happy to run commands, though.
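
For reference, Docker sends a managed plugin's stdout and stderr to the Docker daemon log, tagged with plugin=<ID>. The sketch below raises the plugin's log level and filters daemon log lines by plugin ID. It assumes REXRAY_LOGLEVEL accepts "debug" (the inspect output only shows "warn"), and where the daemon log lives depends on the host. The function names are hypothetical.

```shell
# Raise the log level of plugin $1. Plugin settings can only be changed
# while the plugin is disabled, so it is disabled and re-enabled here.
# Assumption: "debug" is a valid REXRAY_LOGLEVEL value.
set_plugin_debug() {
  docker plugin disable "$1" &&
    docker plugin set "$1" REXRAY_LOGLEVEL=debug &&
    docker plugin enable "$1"
}

# Read daemon log lines on stdin and keep only those tagged with plugin ID $1.
plugin_log_lines() {
  grep -F "plugin=$1"
}

# Example on a systemd host (assumption; other hosts may log to a file
# such as /var/log/docker instead):
#   id=$(docker plugin inspect --format '{{.Id}}' rexray/ebs)
#   sudo journalctl -u docker.service --no-pager | plugin_log_lines "$id"
```

Disabling the plugin may be refused while volumes or containers still reference it, so this is best done on a node with no active rexray volumes.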

Edit: Formatting

About this issue

  • State: open
  • Created 6 years ago
  • Reactions: 5
  • Comments: 36

Most upvoted comments

Thanks for the fix. Merged into my personal fork at https://github.com/nbryant42/rexray/tree/revert-1345

This branch will likely receive occasional dependency version updates going forward, at least as long as I’m at my current employer and continuing to use rexray.

Builds at https://gallery.ecr.aws/semsee/rexray-ebs

I have started to have this problem today in one of my environments… Am I the only one?

@nbryant42 raised nbryant42#1, targeting branch revert-1345

I’ve built and published the rex-ray EBS plugin for linux/amd64 here, based on this repository https://github.com/joan-s-molas/rexray/tree/fix/rexray

https://hub.docker.com/r/umsp/rexray-ebs

Tested on my workloads and it’s working. Make sure to install it with an alias like this:

EBS_USELARGEDEVICERANGE=true docker plugin install umsp/rexray-ebs:nvme-fix --alias rexray/ebs REXRAY_PREEMPT=true EBS_REGION=${aws_region} --grant-all-permissions

@bdellegrazie, is this specific to particular instance types? I am only using c6a.*, m6*, and t3a.*

@nbryant42, @ScOut3R and everyone else: the root cause of this is a change in the underlying physical infrastructure that AWS uses to service various instance types. Newer EBS controllers use a different vendor device block with a different structure. It’s absolutely repeatable: as soon as the NVMe storage starts reporting what looks like v2 of the vendor block structure, rexray without the patch will fail.

The patch works because we use AWS’s own tool (ebsnvme-id) to get the configured symlink name rather than relying on the older nvme tool and trying to decode the vendor block directly.
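
To see which EBS volume sits behind each NVMe device on an affected instance, the sketch below reads the NVMe serial number from sysfs; for EBS volumes the serial is the volume ID without its hyphen. This is not the patch's method (the patch uses ebsnvme-id, which is not reproduced here), and it does not recover the configured device name that the patch obtains. The function names are hypothetical.

```shell
# Convert an NVMe serial such as "vol0123456789abcdef0" into the EBS
# volume ID "vol-0123456789abcdef0". Returns 1 for non-EBS serials.
serial_to_volume_id() {
  serial=$(printf '%s' "$1" | tr -d '[:space:]')  # sysfs may pad with spaces
  case "$serial" in
    vol-*) printf '%s\n' "$serial" ;;
    vol*)  printf 'vol-%s\n' "${serial#vol}" ;;
    *)     return 1 ;;
  esac
}

# Print "/dev/<device> <volume-id>" for every NVMe namespace backed by EBS.
list_nvme_ebs_volumes() {
  for dev in /sys/block/nvme*n*; do
    [ -r "$dev/device/serial" ] || continue
    vol=$(serial_to_volume_id "$(cat "$dev/device/serial")") || continue
    printf '/dev/%s %s\n' "${dev##*/}" "$vol"
  done
}

# Example (not run automatically):
#   list_nvme_ebs_volumes
```

On instances that do not expose EBS volumes as NVMe devices, the listing prints nothing.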

You might get lucky by choosing a different instance type but it’s not predictable.

@nbryant42 I will pin the FROM python:3-alpine base image as suggested, but to Alpine 3.17 rather than 3.18 because of the rest of the code.

I too have this issue with 0.11.3. I’m starting to wonder if anyone is successfully using the EBS plugin on a swarm deployment…

I’ve fixed the problem by switching away from the t3.* instances. It looks like their storage drivers aren’t compatible with rexray’s.

@st3h3n my image is built via Docker 20.10.21 for the x86_64 architecture. The first thing to check is that you’re not on the wrong architecture or a too-old Docker version.

I’ve never seen that issue before, but a quick Google search turns up https://github.com/distribution/distribution/issues/1439, which recommends some cleanup steps (scroll to the bottom).

There are some other Google hits; this generally sounds like a corruption issue within your local Docker storage.

Regarding the umsp/rexray-ebs:nvme-fix build and install command posted above: this has worked perfectly!

@ScOut3R you can fix that issue by creating an artificial tag like this:

git tag v420.0.0 -m "rexray nvme fix"

FROM python:3-alpine3.18 might be preferable.

For all those struggling with this: we have a patch that fixes the EBS-related issue. I do not believe it will help anyone with issues on Docker Swarm (unless they’re using EBS underneath). This is the patch. It is a small patch on top of @nbryant42's most recent work: https://github.com/nbryant42/rexray/tree/revert-1345

Same here, same AMI image as before but today it is not working anymore.

Edit: Changing to a T2.* instance worked for me. It’s unclear why this is now an issue when running a T3a.* instance worked before today.

This seems to be no longer supported. What is the migration path for this project?