kvrocks: Add the support of the blocking migration for the cluster migrate command

Search before asking

  • I had searched in the issues and found no similar issues.

Motivation

For now the command to migrate slot clusterx migrate works asynchronously. It will return “OK” right after the migration has started. If the client wants to know the result of the migration, it has to constantly use “cluster info” to get the state.

However, once the migration has been completed, the target slot will be forbidden, and the client (cluster manager) has to actively change the topology as soon as possible, in order to make the cluster continue to work.

In summary, the current clusterx migrate has two shortcuts:

  1. The client cannot get the result soon enough.
  2. The “polling” approach to get the migration result is not efficient enough.

Solution

I propose to leave the old clusterx migrate command intact, and add a new clusterx bmirate command, which works like other blocking commands (e.g. blpop). The server will block the connection and return “OK” (if succeed) or other error message (if failed) to the client only after the migration has completed.

The command shall be like:

clusterx bmirate ${SLOT} ${TARGET_NODEID} ${TIMEOUT}

Are you willing to submit a PR?

  • I’m willing to submit a PR!

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 17 (17 by maintainers)

Most upvoted comments

As I think, an ASYNC|SYNC flag is best choice and will be using in some other commands, like BACKUP.

I prefer to use SYNC|ASYNC flag instead of another command which has similar arguments check and logic. in new redis version, they don’t want to add more commands, such as flush command, it has ASYNC option.

For lpop and blpop, they operate data space, different commands have different flags so redis can distinguish which command can execute in scripts env easily.

I respect the reasoning about introducing a separate command for a blocking version of the command: it’s clearer, descriptive and consistent with some naming conventions of Redis (e. g. POP/BPOP). On the other hand, if I use the blocking version of the MIGRATE command by specifying the proper options, it’s obvious (at least for me) that it will block and the OK return value means “blocking operation is completed” (in other words, migration completed). And one quick note: personally, I don’t like commands with a bunch of options (three and more). They are harder to implement and test. So, maybe sometimes (but I’m not 100% sure that this case is the one), instead of introducing 2-3 additional options, we can create a new command.

Looks like the option of SYNC|ASYNC flag got more votes. If there is no further advice, I will implement the new feature as following:

clusterx migrate ${SLOT} ${NODE_ID} [SYNC|ASYNC] ${TIMEOUT}
  • If the ASYNC is given or the flag is missing, the command will act in the old way.
  • If the SYNC flag is given:
    • If the timeout is missing or “0”, the command will block until the migration finish.
    • If the timeout comes as a positive integer, it means the max duration (of seconds) for which the command will block.

vote for the SYNC|ASYNC flag

Thanks, @ellutionist @torwig @xiaobiaozhao @aleksraiden.

So our consensus is to use bmigrate instead of adding the SYNC|ASYNC flag, it’s also good to me.