rexray: cinder: Volume detection & detach fails with Docker Swarm

Summary

When I create stateful services in swarm mode and they fail, or a node goes down and the service gets rescheduled, the stateful services keep running into Cinder errors. Recovering becomes a manual process: I have to remove each of these services, manually unmount the volumes via Cinder or the OpenStack panel, and redeploy the stacks. If there is a sudden failure, this effectively defeats the purpose of swarm orchestration…

Bug Reports

The Cinder driver fails to detach or detect a volume in Docker swarm mode. I currently have the settings recommended here.

Version

$ rexray version
REX-Ray
-------  
Binary: /usr/bin/rexray
Flavor: client+agent+controller
SemVer: 0.9.1  
OsArch: Linux-x86_64
Branch: v0.9.1
Commit: 2373541479478b817a8b143629e552f404f75226  
Formed: Fri, 09 Jun 2017 19:23:38 UTC
  
libStorage
----------  
SemVer: 0.6.1
OsArch: Linux-x86_64  
Branch: v0.9.1
Commit: fd26f0ec72b077ffa7c82160fbd12a276e12c2ad
Formed: Fri, 09 Jun 2017 19:23:05 UTC

Expected Behavior

The failed services should restart with the appropriate volumes mounted without any errors.

Actual Behavior

Getting either VolumeDriver.Mount: {"Error":"error detaching volume"} or VolumeDriver.Mount: {"Error":"open /dev/vdc: no such file or directory"}

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 34 (17 by maintainers)

Most upvoted comments

@codenrhoden Thanks a lot for fixing this issue. Testing on openstack && docker swarm is looking very good! Greetings from Munich.

Hello, in fact I'm not right at all 😃 I changed the code, and it's gophercloud that has an error in its documentation. After reviewing all the API calls, the volume ID is indeed the right one, but the problem is related to the instance ID: Cinder must reuse the instance ID provided in the attachment structure, and not the one provided by the auth. Even with this fix, there are more issues later, related to the mount.

@MatMaul

Problem spotted: the detach call takes the ID of the server in addition to the volume ID, and we always use the ID of the server where we run the detach. @akutz, what is the expected behavior? How is PREEMPT handled? I don't have the option specified in my config and it looks like it calls VolumeDetach anyway. When this happens on a remote server, should I only detach if opts.Force is true, or every time?

Yeah, I see that Detach is always using the IID to do the detach, e.g. https://github.com/codedellemc/rexray/blob/3dce0e4a20cee6ce8878a95734315b8565d5d3aa/libstorage/drivers/storage/cinder/storage/cinder_storage.go#L643-L646

Instead, detach needs to use the ID of the node the volume is attached to. REX-Ray builds off of the paradigm that a block volume can only be attached to one node at a time, so detach methods generally don’t care who the caller is. Depending on the cloud/storage provider’s API, either it detaches the volume and this happens everywhere automatically (EBS does this) or it has to cycle through all the places where it is attached and detaches each one by one. GCE does this: https://github.com/codedellemc/rexray/blob/3dce0e4a20cee6ce8878a95734315b8565d5d3aa/libstorage/drivers/storage/gcepd/storage/gce_storage.go#L1050-L1065
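
To make that pattern concrete, here is a minimal, self-contained Go sketch of the detach-from-everywhere loop. The Volume, Attachment, and detachFromServer names are illustrative stand-ins, not the actual libStorage or gophercloud APIs:

package main

import "fmt"

// Illustrative types only; the real driver works with the libStorage
// volume model and the Nova/Cinder API client.
type Attachment struct {
	ServerID     string
	AttachmentID string
}

type Volume struct {
	ID          string
	Attachments []Attachment
}

// detachFromServer stands in for the compute-API call that removes a
// single volume attachment from a single server.
func detachFromServer(serverID, attachmentID string) error {
	fmt.Printf("detaching attachment %s from server %s\n", attachmentID, serverID)
	return nil
}

// detachEverywhere detaches the volume from every node it is currently
// attached to, regardless of which node issued the request.
func detachEverywhere(vol Volume) error {
	for _, att := range vol.Attachments {
		if err := detachFromServer(att.ServerID, att.AttachmentID); err != nil {
			return fmt.Errorf("detach from %s failed: %v", att.ServerID, err)
		}
	}
	return nil
}

func main() {
	vol := Volume{
		ID:          "vol-123",
		Attachments: []Attachment{{ServerID: "node-a", AttachmentID: "att-1"}},
	}
	if err := detachEverywhere(vol); err != nil {
		fmt.Println("error:", err)
	}
}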

The expected behavior is that PREEMPT gets translated to opts.Force in an attach call. During an attach, if the volume is already attached anywhere, it's an error unless force is true, in which case we call Detach to force/pre-empt the existing mount and move the volume to the newly requested node.

opts.Force doesn't usually come into play for Detach; it only matters if there is a concept of a "force detach", which EBS does have: https://github.com/codedellemc/rexray/blob/3dce0e4a20cee6ce8878a95734315b8565d5d3aa/libstorage/drivers/storage/ebs/storage/ebs_storage.go#L685-L688. That's the only one I know of off the top of my head…

So, the main thing that needs to change here for Cinder is that Detach needs to query the volume to see where it is attached, and detach it from any and all nodes. I’m not sure what the opts.Force field is doing here: https://github.com/codedellemc/rexray/blob/3dce0e4a20cee6ce8878a95734315b8565d5d3aa/libstorage/drivers/storage/cinder/storage/cinder_storage.go#L654 Is this a v1 vs v2 thing? Not sure, but I suspect that it isn’t needed.

The attach logic in Cinder is also a bit weird re: Force: https://github.com/codedellemc/rexray/blob/master/libstorage/drivers/storage/cinder/storage/cinder_storage.go#L593-L602

It only queries the state of the volume if force is specified, whereas most other drivers always query it and then change their behavior based on Force; for example, if the volume is already attached but force is false, it's an error. As written, the code has the odd behavior that if you called Attach with an IID where the volume was already attached and force was true, it would detach the volume from that node and then just reattach it. The EBS logic may be helpful here: https://github.com/codedellemc/rexray/blob/3dce0e4a20cee6ce8878a95734315b8565d5d3aa/libstorage/drivers/storage/ebs/storage/ebs_storage.go#L604-L619
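
Continuing the sketch above (same illustrative Volume type and detachEverywhere helper), the attach-side behavior being described might look roughly like this; attachToServer is likewise a hypothetical stand-in, not a real libStorage call:

// attachToServer stands in for the compute-API attach call.
func attachToServer(volumeID, serverID string) error {
	fmt.Printf("attaching volume %s to server %s\n", volumeID, serverID)
	return nil
}

// attachWithForce always inspects the volume state first, returns an
// error if the volume is already attached and force is false, and
// pre-empts the existing attachment(s) when force is true (the
// PREEMPT/opts.Force path).
func attachWithForce(vol Volume, targetServerID string, force bool) error {
	if len(vol.Attachments) > 0 {
		if !force {
			return fmt.Errorf("volume %s is already attached", vol.ID)
		}
		// Force/pre-empt: detach from wherever the volume currently
		// lives before attaching it to the requested node.
		if err := detachEverywhere(vol); err != nil {
			return err
		}
	}
	return attachToServer(vol.ID, targetServerID)
}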

Another question: it looks like rexray lists the volumes before calling detach. Is it possible to access the result of that list in VolumeDetach? If not, perhaps we should store it in the context; I would like to avoid making a third API call to retrieve information already provided by the list (the server ID where the volume is attached).

Not sure what "list" you are talking about? Are you talking about a list in the libStorage client? If so, that does happen, but it can be run on any node, not necessarily where the libStorage server is running, so no, it can't be shared. If you are talking about the VolumeInspect here, the best way to handle that is to refactor VolumeDetach to use a private d.detach() method that both VolumeAttach and VolumeDetach can call. GCE is another good example of this: you'll see that its detach helper function doesn't have to make any additional API calls to get the list of nodes that a volume is attached to.
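
The suggested refactor could look roughly like the following, again using the illustrative names from the sketches above: the public detach entry point performs the inspect once and hands the already-fetched volume to the shared helper, so no extra API call is needed:

// volumeInspect stands in for the existing Cinder volume-inspect call.
func volumeInspect(volumeID string) (Volume, error) {
	return Volume{ID: volumeID}, nil
}

// volumeDetach is the public entry point: it fetches the volume once,
// then delegates to the shared detachEverywhere helper, which needs no
// further API calls to learn where the volume is attached.
func volumeDetach(volumeID string) error {
	vol, err := volumeInspect(volumeID)
	if err != nil {
		return err
	}
	return detachEverywhere(vol)
}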

I know that’s a lot to throw at you, but it’s mostly explanation. The fix here is probably pretty straightforward.

Same for me:

Using the latest version of the rexray/cinder docker plugin on each node

docker plugin install --grant-all-permissions rexray/cinder:edge \
  CINDER_AUTHURL=http://opencloud.xxxx.net:5000/v2.0 \
  CINDER_USERNAME=xxxx \
  CINDER_PASSWORD=xxxx \
  CINDER_TENANTID=7c3a792a340e4cbc849abb4cd6d7adb2 \
  REXRAY_FSTYPE=ext4 \
  REXRAY_PREEMPT=true

I would love to see this working on swarm mode with a working rexray/cinder driver.

Whenever a node where a task of a service (e.g. Postgres) is running gets shut down or paused, rexray is not able to detach the volume and attach it to a different node.

As a consequence, the Docker Swarm service stops if it is set up with just one replica.

Please help, as it is really quite frustrating.

Using the command docker service ps pg you get this output:

ID             NAME       IMAGE            NODE                DESIRED STATE  CURRENT STATE             ERROR                              PORTS
u6fca591ek2c   pg.1       postgres:latest  cvm03265.xxxx.net   Running        Preparing 12 seconds ago
7mk20mywai8g    \_ pg.1   postgres:latest  cvm02741.xxxx.net   Shutdown       Rejected 16 seconds ago   "VolumeDriver.Mount: {"Error":…"

If you inspect this task using docker inspect u6 you get:

{
    "ID": "u6fca591ek2c22fw8vq20sa5l",
    "Version": {
        "Index": 85
    },
    "CreatedAt": "2017-09-16T10:48:20.674785226Z",
    "UpdatedAt": "2017-09-16T10:48:36.453805544Z",
    "Spec": {
        "ContainerSpec": {
            "Image": "postgres:latest@sha256:f594c0086738f808f3aa6de591b0c3349dd23ebcc42b45d16ed2ba1e9f09b02a",
            "Env": [
                "POSTGRES_PASSWORD=mysecretpassword"
            ],
            "Mounts": [
                {
                    "Type": "volume",
                    "Source": "volrexray",
                    "Target": "/var/lib/postgresql/data/",
                    "VolumeOptions": {
                        "DriverConfig": {
                            "Name": "rexray/cinder:edge"
                        }
                    }
                }
            ],
            "DNSConfig": {}
        },
        "Resources": {
            "Limits": {},
            "Reservations": {}
        },
        "RestartPolicy": {
            "Condition": "any",
            "MaxAttempts": 0
        },
        "Placement": {
            "Constraints": [
                "node.role == worker"
            ]
        },
        "ForceUpdate": 0
    },
    "ServiceID": "9n49riucyj9jo7uwk3d652g1k",
    "Slot": 1,
    "NodeID": "nap40xrbqqxr4vkqgmv8xh6uv",
    "Status": {
        "Timestamp": "2017-09-16T10:48:35.037852621Z",
        "State": "rejected",
        "Message": "preparing",
        "Err": "VolumeDriver.Mount: {\"Error\":\"error detaching volume\"}\n",
        "ContainerStatus": {},
        "PortStatus": {}
    },
    "DesiredState": "shutdown"
}

Hi @benoitm76,

Is it normal to have all my volumes in the unavailable state when I run rexray volume ls on my libStorage controller?

That depends: are those volumes attached to some node? If so, then yes, unavailable is the appropriate status. It indicates the volume is attached to some instance other than the one specified in the API call that requested the volume information.

Here is a full list of the volume status values and their descriptions:

Status Code  Status Name  Description
1            unknown      The driver has set the state, but it is explicitly unknown and should not be inferred from the list of attachments alone.
2            attached     The volume is attached to the instance specified in the API call that requested the volume information.
3            available    The volume is not attached to any instance.
4            unavailable  The volume is attached to some instance other than the one specified in the API call that requested the volume information.
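
For reference, here is a small illustrative Go snippet that mirrors those status values; the names are hypothetical and are not the actual libStorage type definitions:

package main

import "fmt"

// VolumeStatus mirrors the status values listed in the table above.
// Illustrative only; not the real libStorage types.
type VolumeStatus int

const (
	StatusUnknown     VolumeStatus = 1 // state set by the driver but explicitly unknown
	StatusAttached    VolumeStatus = 2 // attached to the instance named in the API call
	StatusAvailable   VolumeStatus = 3 // not attached to any instance
	StatusUnavailable VolumeStatus = 4 // attached to some instance other than the caller's
)

func main() {
	// A volume attached to a different node than the caller reports status 4.
	fmt.Println("unavailable status code:", StatusUnavailable)
}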