rexray: cinder: Volume detection & detach fails with Docker Swarm
Summary
When I create stateful services in swarm mode and they fail, or a node goes down and the service gets rescheduled, the stateful services keep hitting Cinder errors. Recovery becomes a manual process: remove each of these services, manually unmount the volumes with Cinder or the OpenStack panel, and redeploy the stacks. If there is a sudden failure, this effectively defeats the purpose of swarm orchestration…
Bug Reports
The Cinder driver fails to detach or detect the volume in Docker swarm mode. I currently have the settings recommended here.
Version
$ rexray version
REX-Ray
-------
Binary: /usr/bin/rexray
Flavor: client+agent+controller
SemVer: 0.9.1
OsArch: Linux-x86_64
Branch: v0.9.1
Commit: 2373541479478b817a8b143629e552f404f75226
Formed: Fri, 09 Jun 2017 19:23:38 UTC
libStorage
----------
SemVer: 0.6.1
OsArch: Linux-x86_64
Branch: v0.9.1
Commit: fd26f0ec72b077ffa7c82160fbd12a276e12c2ad
Formed: Fri, 09 Jun 2017 19:23:05 UTC
Expected Behavior
The failed services should restart with the appropriate volumes mounted without any errors.
Actual Behavior
Getting either `VolumeDriver.Mount: {"Error":"error detaching volume"}` or `VolumeDriver.Mount: {"Error":"open /dev/vdc: no such file or directory"}`.
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Comments: 34 (17 by maintainers)
@codenrhoden Thanks a lot for fixing this issue. Testing on openstack && docker swarm is looking very good! Greetings from Munich.
Hello, in fact I’m not right at all 😃 I changed the code, and it’s gophercloud that has an error in its documentation. After reviewing all the API calls, the volumeId is indeed the right one, but the problem is related to the instance ID: Cinder must reuse the instance ID provided in the attachment structure, not the one provided by the auth. Even with this fix, there are more issues later, related to the mount.
@MatMaul
Yeah, I see that Detach is always using the IID to do the detach, e.g. https://github.com/codedellemc/rexray/blob/3dce0e4a20cee6ce8878a95734315b8565d5d3aa/libstorage/drivers/storage/cinder/storage/cinder_storage.go#L643-L646
Instead, detach needs to use the ID of the node the volume is attached to. REX-Ray builds on the paradigm that a block volume can only be attached to one node at a time, so detach methods generally don’t care who the caller is. Depending on the cloud/storage provider’s API, either detaching the volume detaches it everywhere automatically (EBS does this), or the driver has to cycle through all the places where the volume is attached and detach each one by one. GCE does this: https://github.com/codedellemc/rexray/blob/3dce0e4a20cee6ce8878a95734315b8565d5d3aa/libstorage/drivers/storage/gcepd/storage/gce_storage.go#L1050-L1065
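In rough terms, that pattern looks like the following minimal Go sketch. It is not the driver’s actual code: the `Volume`/`Attachment` types and the `detachFromServer` helper are hypothetical stand-ins for the real gophercloud/Cinder calls, shown only to illustrate detaching from every recorded attachment rather than from the caller’s IID.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical, simplified stand-ins for the Cinder API objects the driver
// works with; the real driver uses gophercloud types instead.
type Attachment struct {
	ServerID string // instance the volume is attached to, as reported by Cinder
	Device   string
}

type Volume struct {
	ID          string
	Attachments []Attachment
}

// detachFromServer is a placeholder for the actual "detach volume from
// instance" call the driver would make against Cinder.
func detachFromServer(volumeID, serverID string) error {
	fmt.Printf("detaching volume %s from instance %s\n", volumeID, serverID)
	return nil
}

// detachEverywhere ignores the caller's instance ID and detaches the volume
// from every node it is currently attached to (the GCE driver follows the
// same pattern).
func detachEverywhere(vol *Volume) error {
	if len(vol.Attachments) == 0 {
		return errors.New("volume is not attached")
	}
	for _, att := range vol.Attachments {
		// Use the server ID recorded in the attachment structure, which is
		// what Cinder needs, rather than the IID supplied with the request.
		if err := detachFromServer(vol.ID, att.ServerID); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	vol := &Volume{
		ID:          "vol-123",
		Attachments: []Attachment{{ServerID: "instance-a", Device: "/dev/vdc"}},
	}
	if err := detachEverywhere(vol); err != nil {
		fmt.Println("detach failed:", err)
	}
}
```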
The expected behavior is that `PREEMPT` gets translated to `opts.Force` in an attach call. During an attach, if the volume is already attached anywhere, it’s an error. Unless `force` is true, in which case we are to call `Detach` to force/pre-empt the existing mount and move it to the newly requested node. `opts.Force` doesn’t usually come into play for `Detach`, only if there is a concept of a “force detach”, which EBS does have: https://github.com/codedellemc/rexray/blob/3dce0e4a20cee6ce8878a95734315b8565d5d3aa/libstorage/drivers/storage/ebs/storage/ebs_storage.go#L685-L688. That’s the only one I know of off the top of my head…

So, the main thing that needs to change here for Cinder is that `Detach` needs to query the volume to see where it is attached, and detach it from any and all nodes. I’m not sure what the `opts.Force` field is doing here: https://github.com/codedellemc/rexray/blob/3dce0e4a20cee6ce8878a95734315b8565d5d3aa/libstorage/drivers/storage/cinder/storage/cinder_storage.go#L654 Is this a v1 vs v2 thing? Not sure, but I suspect it isn’t needed.

The attach logic in Cinder is also a bit weird re: `Force`: https://github.com/codedellemc/rexray/blob/master/libstorage/drivers/storage/cinder/storage/cinder_storage.go#L593-L602 It only queries the state of the volume if force is specified. Most other drivers always query it, then change their behavior based on `Force`; for example, if the volume is already attached but force is false, it’s an error. As written, the code has the weird behavior that if you call `Attach` with an IID where the volume is already attached and force is true, it detaches the volume from that node and then just reattaches it. The EBS logic may be helpful here: https://github.com/codedellemc/rexray/blob/3dce0e4a20cee6ce8878a95734315b8565d5d3aa/libstorage/drivers/storage/ebs/storage/ebs_storage.go#L604-L619

Not sure what “list” you are talking about? If you mean a list in the libStorage client, that does happen and can be run on any node, not necessarily where the libStorage server is running, so no, it can’t be shared. If you are talking about the `VolumeInspect` here, the best way to handle that is to refactor `VolumeDetach` to use a private `d.detach()` method that both `VolumeAttach` and `VolumeDetach` can call. GCE is another good example of this; you’ll see that its `detach` helper function doesn’t make any additional API calls to get the list of nodes that a volume is attached to.

I know that’s a lot to throw at you, but it’s mostly explanation. The fix here is probably pretty straightforward.
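To make the force/pre-empt semantics above concrete, here is a minimal, self-contained Go sketch. The `Volume` type and the `attachToServer`/`detachFromServer` helpers are hypothetical stand-ins, not the driver’s real libStorage or gophercloud API.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical, simplified volume representation.
type Volume struct {
	ID         string
	AttachedTo []string // instance IDs the volume is currently attached to
}

func attachToServer(volumeID, serverID string) error {
	fmt.Printf("attaching volume %s to instance %s\n", volumeID, serverID)
	return nil
}

func detachFromServer(volumeID, serverID string) error {
	fmt.Printf("detaching volume %s from instance %s\n", volumeID, serverID)
	return nil
}

// attach sketches the expected semantics: always inspect the volume's current
// attachments, fail if it is attached and force is false, and pre-empt
// (detach everywhere, then attach) if force is true.
func attach(vol *Volume, serverID string, force bool) error {
	if len(vol.AttachedTo) > 0 {
		if !force {
			return errors.New("volume is already attached to another instance")
		}
		// force == true: pre-empt the existing attachment(s) before attaching
		// to the newly requested node.
		for _, existing := range vol.AttachedTo {
			if err := detachFromServer(vol.ID, existing); err != nil {
				return err
			}
		}
		vol.AttachedTo = nil
	}
	if err := attachToServer(vol.ID, serverID); err != nil {
		return err
	}
	vol.AttachedTo = append(vol.AttachedTo, serverID)
	return nil
}

func main() {
	vol := &Volume{ID: "vol-123", AttachedTo: []string{"instance-a"}}
	// Without force this fails; with force (what REXRAY_PREEMPT maps to) it moves the volume.
	fmt.Println(attach(vol, "instance-b", false))
	fmt.Println(attach(vol, "instance-b", true))
}
```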
Same for me:
Using the latest version of the rexray/cinder Docker plugin on each node:

```
docker plugin install --grant-all-permissions rexray/cinder:edge \
  CINDER_AUTHURL=http://opencloud.xxxx.net:5000/v2.0 \
  CINDER_USERNAME=xxxx \
  CINDER_PASSWORD=xxxx \
  CINDER_TENANTID=7c3a792a340e4cbc849abb4cd6d7adb2 \
  REXRAY_FSTYPE=ext4 \
  REXRAY_PREEMPT=true
```

I would love to see this working in swarm mode with a working rexray/cinder driver.
Whenever a node running a task of a service (e.g. Postgres) gets shut down or paused, rexray is not able to detach the volume and attach it to a different node.
As a consequence, the docker swarm service is stopped if your service is set up with just one replica.
Please help, as it is really quite frustrating.
Using the command `docker service ps pg` you get this output:

```
ID             NAME      IMAGE            NODE               DESIRED STATE  CURRENT STATE             ERROR                               PORTS
u6fca591ek2c   pg.1      postgres:latest  cvm03265.xxxx.net  Running        Preparing 12 seconds ago
7mk20mywai8g    \_ pg.1  postgres:latest  cvm02741.xxxx.net  Shutdown       Rejected 16 seconds ago   "VolumeDriver.Mount: {"Error":…"
```

If you inspect this task using `docker inspect u6` you get:

{
Hi @benoitm76,
That depends, are those volumes attached to some node? If so, then yes, `unavailable` is the appropriate status. It indicates the volume is attached to some instance other than the one specified in the API call that requested the volume information.

Here is the full list of the volume status values:

- `unknown`
- `attached`
- `available`
- `unavailable`
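As a rough illustration of how these statuses relate to the instance making the request, here is a hedged Go sketch; it is not libStorage’s actual code, the `attachmentStatus` helper is hypothetical, and `unknown` is left out (it would apply when no attachment state is known at all).

```go
package main

import "fmt"

// attachmentStatus maps a volume's attachments to a status string relative to
// the instance making the request: a volume attached only to some other
// instance is reported as "unavailable" to that caller.
func attachmentStatus(attachedTo []string, requestingInstanceID string) string {
	if len(attachedTo) == 0 {
		return "available"
	}
	for _, id := range attachedTo {
		if id == requestingInstanceID {
			return "attached"
		}
	}
	return "unavailable"
}

func main() {
	fmt.Println(attachmentStatus(nil, "instance-a"))                    // available
	fmt.Println(attachmentStatus([]string{"instance-a"}, "instance-a")) // attached
	fmt.Println(attachmentStatus([]string{"instance-b"}, "instance-a")) // unavailable
}
```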