efcore: SaveChangesAsync doesn't preserve the sync context across retries

In this Helix build we have the following test failure, seems to be flaky:

    Microsoft.EntityFrameworkCore.Query.QueryBugsTest.SaveChangesAsync_accepts_changes_with_ConfigureAwait_true_22841 [FAIL]
      Assert.True() Failure
      Expected: True
      Actual:   False
      Stack Trace:
        D:\a\_work\1\s\test\EFCore.SqlServer.FunctionalTests\Query\QueryBugsTest.cs(9132,0): at Microsoft.EntityFrameworkCore.Query.QueryBugsTest.SaveChangesAsync_accepts_changes_with_ConfigureAwait_true_22841()
        --- End of stack trace from previous location ---

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 15 (15 by maintainers)

Commits related to this issue

Most upvoted comments

@vonzshik that’s exactly what #27019 is supposed to address (I did it just 4 hours before your comment 😉).

However, I can’t see how SaveChangesAsync wouldn’t yield… it’s guaranteed to insert a row into SQL Server, which is supposed to always do networking and yield… If there’s some situation in which SqlClient does sync I/O instead of async, then that would explain it; I’m not aware of such a thing (and it shouldn’t happen), but it’s possible.

@roji what if the problem here is in Thread.CurrentThread == trackingSynchronizationContext.Thread check? That is, while an async version of SaveChangesAsync is called, there is no guarantee that at any point it’s actually going to yield, so it’s not going to execute a continuation on that thread?

If so, calling await Task.Yield() just after the context is set should help.

@roji SaveChangesAsync_accepts_changes_with_ConfigureAwait_true_22841 just failed on my C.I. run.

https://github.com/dotnet/efcore/pull/26996

@vonzshik you’re right - I just discovered that on my own; though that’s just a test issue (note also that the test asserts on the current thread, rather than on the current sync context).

Here’s what I think is going on… Under normal circumstances, everything does work just fine; before AcceptAllChanges is called (the method that jumps into user code), all code is either synchronous, or asynchronous with ConfigureAwait(true).

However, if we retry because of an actual error, then the 2nd attempt by the execution strategy will have lost the synchronization context, since the 1st attempt will have been called with ConfigureAwait(false). This would also explain the flakiness of the test (as opposed to it always passing/failing) - we indeed have occasional errors when running our tests in the CI environment, so I’m thinking that in those cases the test failed.