tikv: Async commit does not ensure linearizability

Bug Report

Read in the same session

Consider a read with max u64 as its timestamp, issued immediately after an async commit transaction. The locks from the previous transaction may not be committed yet, and a read with max u64 can ignore locks, so the read may not see the latest changes from that async commit transaction.
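The same-session case can be illustrated with a toy model. This is only a sketch, not TiKV code: `ToyStore` and its methods are hypothetical names, and the rule that a read at max u64 skips locks is taken from the description above rather than from the actual implementation.

```python
U64_MAX = 2**64 - 1


class ToyStore:
    """Hypothetical toy MVCC store; not TiKV's actual implementation."""

    def __init__(self):
        self.writes = {}  # key -> list of (commit_ts, value)
        self.locks = {}   # key -> (start_ts, pending value)

    def prewrite(self, key, start_ts, value):
        # Leave a lock holding the uncommitted value.
        self.locks[key] = (start_ts, value)

    def commit(self, key, commit_ts):
        # Turn the lock into a committed version.
        _start_ts, value = self.locks.pop(key)
        self.writes.setdefault(key, []).append((commit_ts, value))

    def read(self, key, read_ts):
        # Assumption from the report: a read at max u64 ignores locks.
        if key in self.locks and read_ts != U64_MAX:
            raise RuntimeError("key is locked")
        visible = [(ts, v) for ts, v in self.writes.get(key, []) if ts <= read_ts]
        return max(visible, key=lambda pair: pair[0])[1] if visible else 0


store = ToyStore()
store.prewrite("k", start_ts=10, value=1)
# With async commit the client is told "success" once prewrite is done,
# while the lock has not been turned into a committed version yet.
stale = store.read("k", U64_MAX)   # lock ignored, old value returned
store.commit("k", commit_ts=11)
fresh = store.read("k", U64_MAX)   # committed version is now visible
```

In this model `stale` is the old value even though the transaction has already been reported as successful, which is the anomaly described above.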

Causal consistency across nodes

Example:

Initially, k1 and k2 are both 0. They may be located in different regions.

First, we commit k1=1@100. At this point, the max_read_ts of the k2 region leader is 50.

Then, we prewrite and commit k2=1@51 using async commit.

Now a transaction with start_ts=80 reads both k1 and k2. It gets k1=0, k2=1.

This breaks external consistency: k2 was committed later than k1, yet the reader sees the write to k2 without the write to k1.
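The numbers in the example can be checked with a small sketch. It assumes a plain snapshot read (newest version with commit_ts <= read ts) and assumes the async commit ts is derived as max_read_ts + 1, which matches the 50 and 51 used above; `snapshot_read` and the variable names are hypothetical.

```python
def snapshot_read(versions, read_ts, default=0):
    """Return the value of the newest version with commit_ts <= read_ts."""
    visible = [(ts, value) for ts, value in versions if ts <= read_ts]
    return max(visible, key=lambda pair: pair[0])[1] if visible else default


# k1=1 is committed first, with commit_ts = 100.
k1_versions = [(100, 1)]

# k2=1 is committed afterwards with async commit. Assumption: the commit ts
# is computed from the region leader's max_read_ts (50), giving 51.
max_read_ts_k2 = 50
k2_commit_ts = max_read_ts_k2 + 1
k2_versions = [(k2_commit_ts, 1)]

# A transaction with start_ts = 80 reads both keys.
result = (snapshot_read(k1_versions, 80), snapshot_read(k2_versions, 80))
```

Here `result` is `(0, 1)`: the later commit is visible and the earlier one is not.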

Affected version

master

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 3
  • Comments: 31 (28 by maintainers)

Most upvoted comments

@youjiali1995 A more concrete example:

1. An external client starts two sessions A and B.

2. Session A starts T1, start_ts = 1.

3. Session B starts T2, start_ts = 2.

4. The client starts session C, which starts a transaction T3 with start_ts = 80.

5. Session A prewrites k1=1.

6. Session A commits T1 with commit_ts = 100.

7. Session B prewrites k2=1 using async commit, min_commit_ts = 51.

8. Session B commits T2 with commit_ts = 51.

9. Session C reads k1 and k2.

Session B triggers its commit only after it receives the success response from session A (that is, between steps 6 and 7).

BTW, maybe we can get a ts from PD before starting the prewrite phase, carry it with the prewrite request to the primary key, and calculate max(ts, max_read_ts from all keys)+1 as the commit ts. The order of A and B is ensured in this case because the new ts from PD must be greater than A's commit ts (or maybe equal?). I am not sure it is enough for every situation, but I think it is. Maybe this change can be an option for users who want true external consistency.
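A minimal sketch of the calculation proposed in this comment, applied to the earlier example. `proposed_commit_ts` is a hypothetical helper, and the PD ts value 101 is an assumption chosen only because it is fetched after A's commit at 100.

```python
def proposed_commit_ts(pd_ts, max_read_ts_per_key):
    """Commit ts = max(ts from PD, max_read_ts of all keys) + 1."""
    return max([pd_ts, *max_read_ts_per_key]) + 1


a_commit_ts = 100

# Session B fetches a ts from PD before prewrite, after A's commit succeeded.
# Assumption: PD returns 101 here.
pd_ts = 101
b_commit_ts = proposed_commit_ts(pd_ts, [50])  # max_read_ts of k2's region is 50

# A reader with start_ts = 80 now sees neither write, which is consistent.
reader_sees_k1 = a_commit_ts <= 80
reader_sees_k2 = b_commit_ts <= 80
```

Because of the +1, B's commit ts stays above A's even if the ts from PD were only equal to A's commit ts.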

@gengliqi I previously discussed this idea with @5kbpers too. I think it may be better to ensure external consistency by default, because that is what we guaranteed before.