aws-ebs-csi-driver: Race condition in CreateVolume

/kind bug

What happened?

When the CO calls CreateVolume (through the sidecar container), it passes in a volume name. If a second call to create a volume with the same name is issued, the driver should return the volume created by the first request instead of creating another one.

Since there’s no way to specify the volume name through the EC2 API, we currently set the volume name as a tag. However, this workaround has its own problems.

Imagine this situation:

  • CreateVolume is called to create a volume named volume1.
  • EC2’s CreateVolumeWithContext is called, and it takes about 15 seconds to actually create the volume.
  • CreateVolume is called again with the same name while the call above is still in progress.
  • The driver checks whether a volume with this name already exists, finds none, and creates a second volume.
  • Now we have 2 volumes tagged with the same name.

During attach, for instance, the driver will notice that there’s more than one volume with the same name, so it’ll return an error. However, this is just a workaround and a proper solution should ideally be implemented on the EC2 side.

This could be achieved by having a Name field in ec2.CreateVolumeInput.

How to reproduce it (as minimally and precisely as possible)?

The easiest way is to:

  1. Run the CSI driver
  2. Use the csc tool to create 2 volumes:
$ csc controller new --endpoint tcp://127.0.0.1:10000 (..) && \
  csc controller new --endpoint tcp://127.0.0.1:10000 (...)

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 20 (13 by maintainers)

Most upvoted comments

@AndyXiangLi FYI. Can you follow up internally on the API change?

This issue is mitigated for now by using the inFlight struct. Once an EBS API change is available that enables idempotency, this in-flight checking should be removed from the CreateVolume() call.
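
For readers unfamiliar with the idea, in-flight checking can be sketched roughly as follows. This is a simplified illustration, not the driver's actual inFlight type: the Insert and Delete methods, the createVolume wrapper, and the choice of the volume name as the key are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// inFlight is a hypothetical, simplified tracker of requests that are
// currently being processed, keyed here by volume name.
type inFlight struct {
	mu   sync.Mutex
	keys map[string]bool
}

func newInFlight() *inFlight {
	return &inFlight{keys: make(map[string]bool)}
}

// Insert records key and returns true, or returns false if a request
// with the same key is already in flight.
func (f *inFlight) Insert(key string) bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.keys[key] {
		return false
	}
	f.keys[key] = true
	return true
}

// Delete removes key once its request has completed.
func (f *inFlight) Delete(key string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	delete(f.keys, key)
}

// createVolume rejects a request whose name is already being created and
// otherwise runs create, a placeholder for the slow cloud API call.
func createVolume(f *inFlight, name string, create func() error) error {
	if !f.Insert(name) {
		return fmt.Errorf("create volume %q: request already in flight", name)
	}
	defer f.Delete(name)
	return create()
}

func main() {
	f := newInFlight()
	// The inner call stands in for a duplicate request that arrives while
	// the first one is still running.
	err := createVolume(f, "volume1", func() error {
		return createVolume(f, "volume1", func() error { return nil })
	})
	fmt.Println(err)
}
```

The lock is held only while the set is updated, never during the slow create call, so a duplicate request fails fast instead of blocking. A tracker like this only guards against duplicates within a single driver process, which is one reason an idempotent EBS API would be the more robust fix.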