azure-sdk-for-net: [QUERY] How to handle etag mismatch (412 Precondition Failed) caused by client retry
Library name and version
Azure.Data.Tables 12.8.1
Query/Question
Consider this scenario:
- Azure.Data.Tables client updates an entity with an etag (If-Match: <etag>)
- The server times out or returns a retriable status code after the data operation is complete on the server side.
- Client retries the operation with the same etag, not knowing the server state has changed
- Client encounters HTTP 412 Precondition Failed
(there are adjacent scenarios like adding an entity for the first time and retrying to get a conflict)
What is the client to do to recover?
There are certainly solutions existing today, for example the client can retry the whole unit of work bounding the table storage operation (perhaps no-oping if the Table state actually went through or using a totally different entity depending on the implementation). However, sometimes the cost to retry the entire operation in the application is high enough that it’s worth taking on more complexity to handle the 412 gracefully (maybe there is a bunch of other API calls prior to the Azure Table API call and you don’t want to repeat those in cases like this).
Simply detecting a retry in the Azure SDK HTTP pipeline is also not sufficient. Some retries will succeed (for example the request fails prior to the data operation persisting on the server side) and others will never succeed (for example another thread updates the entity, causing a legitimate Precondition Failed error).
One idea I wanted to try was performing a client-side diff. If you hit an HTTP 412 (optionally, as an optimization, only when the client actually performed a retry), you can fetch the entity from Table storage and compare it with what you were expecting to write. This could be done in a partial manner if you were hoping for a PATCH-like operation, or for all properties if you were hoping to replace the entire entity state.
A Table entity is essentially a property bag of well-defined types so performing a client-side diff is pretty straightforward. Unfortunately, the intermediate model (Dictionary<string, object> from ToOdataAnnotatedDictionary) that would be ideal for generic comparison is internal.
I can perhaps call the method with reflection or write my own serialization model used just for this comparison. But I wanted to raise this problem space to the Azure SDK team and see if there is any prior art in this area or any recommendations.
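A minimal sketch of such a client-side diff, assuming both the entity you intended to write and the entity fetched after the 412 are available as IDictionary<string, object> (TableEntity implements that interface). The PropertyBagDiff class, its method name, and the list of skipped service-managed property names are hypothetical and not part of Azure.Data.Tables.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PropertyBagDiff
{
    // Assumed names of service-managed properties; these change on every write,
    // so they are skipped during comparison.
    private static readonly HashSet<string> ServiceManaged =
        new HashSet<string>(StringComparer.Ordinal) { "odata.etag", "Timestamp" };

    // Returns the names of properties that differ between what the client
    // intended to write (expected) and what the table currently holds (actual).
    // requireExactMatch = false models a PATCH-like (Merge) update: only the
    // properties present in expected are checked.
    // requireExactMatch = true models Replace: extra remote properties also count.
    public static IReadOnlyList<string> FindMismatches(
        IDictionary<string, object> expected,
        IDictionary<string, object> actual,
        bool requireExactMatch)
    {
        var mismatches = new List<string>();
        foreach (var pair in expected)
        {
            if (ServiceManaged.Contains(pair.Key)) continue;
            if (!actual.TryGetValue(pair.Key, out var remote) || !ValuesEqual(pair.Value, remote))
            {
                mismatches.Add(pair.Key);
            }
        }
        if (requireExactMatch)
        {
            foreach (var key in actual.Keys)
            {
                if (!ServiceManaged.Contains(key) && !expected.ContainsKey(key))
                {
                    mismatches.Add(key);
                }
            }
        }
        return mismatches;
    }

    private static bool ValuesEqual(object left, object right)
    {
        // Binary properties need an element-wise comparison.
        if (left is byte[] l && right is byte[] r) return l.SequenceEqual(r);
        // Boxed values of different types (for example int vs long) compare as unequal.
        return Equals(left, right);
    }
}
```

An empty result suggests that the remote entity already holds the intended values, so the 412 may have come from the client's own earlier attempt. It cannot rule out another writer having stored identical values, so treat it as a best-effort signal.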
Here is a sample app that shows how client-driven retries can cause etag mismatches:
Sample C# console app
using System.Net;
using Azure;
using Azure.Core.Pipeline;
using Azure.Data.Tables;

var messageHandler = new TestHandler { InnerHandler = new SocketsHttpHandler() };
var serviceClient = new TableServiceClient(
    "UseDevelopmentStorage=true",
    new TableClientOptions { Transport = new HttpClientTransport(messageHandler) });
var table = serviceClient.GetTableClient("oopsie");

// setup
Console.WriteLine("Table go!");
await table.CreateIfNotExistsAsync();
Console.WriteLine("Entity go!");
var entity = new TableEntity("pk", "rk");
entity["State"] = "Active";
entity["HitCount"] = 1;
var response = await table.UpsertEntityAsync(entity, TableUpdateMode.Replace);
entity.ETag = response.Headers.ETag!.Value;

// repro
Console.WriteLine("Update with If-Match go!");
entity["HitCount"] = 2;
try
{
    await table.UpdateEntityAsync(entity, entity.ETag, TableUpdateMode.Replace);
}
catch (RequestFailedException ex) when (ex.Status == 412)
{
    Console.WriteLine($"Oopsie. {ex.Status} {ex.ErrorCode}");
    // Get entity with client-side diff?
}

class TestHandler : DelegatingHandler
{
    private int _requestCount;

    protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // The request is always forwarded, so the write is persisted before the fake error is returned.
        var response = await base.SendAsync(request, cancellationToken);
        if (request.Method == HttpMethod.Put && request.RequestUri!.AbsolutePath.EndsWith("(PartitionKey='pk',RowKey='rk')"))
        {
            // The first PUT is the setup upsert; the second is the update with If-Match.
            if (Interlocked.Increment(ref _requestCount) == 2)
            {
                var statusCode = HttpStatusCode.ServiceUnavailable;
                Console.WriteLine($"Fake error! ({statusCode})");
                return new HttpResponseMessage(statusCode) { RequestMessage = request };
            }
        }
        return response;
    }
}
The output looks like this:
Table go!
Entity go!
Update with If-Match go!
Fake error! (ServiceUnavailable)
Oopsie. 412 UpdateConditionNotSatisfied
Environment
.NET SDK:
Version: 8.0.100-rc.1.23455.8
Commit: e14caf947f
Runtime Environment:
OS Name: Windows
OS Version: 10.0.22621
OS Platform: Windows
RID: win-x64
Base Path: C:\Program Files\dotnet\sdk\8.0.100-rc.1.23455.8\
.NET workloads installed:
There are no installed workloads to display.
Host:
Version: 8.0.0-rc.1.23419.4
Architecture: x64
Commit: 92959931a3
RID: win-x64
.NET SDKs installed:
3.1.426 [C:\Program Files\dotnet\sdk]
5.0.408 [C:\Program Files\dotnet\sdk]
6.0.317 [C:\Program Files\dotnet\sdk]
7.0.203 [C:\Program Files\dotnet\sdk]
7.0.308 [C:\Program Files\dotnet\sdk]
7.0.401 [C:\Program Files\dotnet\sdk]
8.0.100-rc.1.23455.8 [C:\Program Files\dotnet\sdk]
.NET runtimes installed:
Microsoft.AspNetCore.App 3.1.32 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 5.0.17 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 6.0.22 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 7.0.5 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 7.0.11 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 8.0.0-rc.1.23421.29 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.NETCore.App 3.1.32 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 5.0.17 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 6.0.22 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 7.0.5 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 7.0.11 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 8.0.0-rc.1.23419.4 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.WindowsDesktop.App 3.1.32 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
Microsoft.WindowsDesktop.App 5.0.17 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
Microsoft.WindowsDesktop.App 6.0.22 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
Microsoft.WindowsDesktop.App 7.0.5 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
Microsoft.WindowsDesktop.App 7.0.11 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
Microsoft.WindowsDesktop.App 8.0.0-rc.1.23420.5 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
Other architectures found:
x86 [C:\Program Files (x86)\dotnet]
registered at [HKLM\SOFTWARE\dotnet\Setup\InstalledVersions\x86\InstallLocation]
Environment variables:
Not set
global.json file:
Not found
Learn more:
https://aka.ms/dotnet/info
Download .NET:
https://aka.ms/dotnet/download
About this issue
- State: closed
- Created 9 months ago
- Comments: 15 (11 by maintainers)
This is great. HttpPipeline.CreateHttpMessagePropertiesScope is even better (never knew of this before). I can plumb in everything I need.
Leaving this POC for posterity: https://gist.github.com/joelverhagen/bbf0bdd91cfcdb5784abf135a859a108
The problem gets nasty with $batch so I think I will use a ChangeId = <guid> column which can be used to check on a single property (of a single entity, in the case of $batch). That’s probably what I’ll do in my application. That concept translates better to blobs as well, where you won’t want to do a full byte-for-byte comparison of the blob contents (whereas a column-for-column comparison of a Table entity is more feasible). I can compare an x-ms-meta-changeid: <guid> blob metadata value there.
It would still be nice to have ToOdataAnnotatedDictionary as a public extension method for more flexibility in implementation, e.g. it’s very painful to get the property bag from a $batch request body, but I see there are other ways to avoid that blocker.
Thanks for your help. I’ll close this since it appears quite feasible with HttpPipeline.CreateHttpMessagePropertiesScope + a custom RetryPolicy.
I’m not sure why I didn’t think of that! Thanks! And yes, that seems to work also. Here’s my attempt: https://gist.github.com/joelverhagen/bbf0bdd91cfcdb5784abf135a859a108/revisions#diff-0b69b473fe937040615d69f606751f61ddbc2e3a1849360ff2456c22afe88c0b
Challenges of the design:
- […] HttpMessage.Request.
- […] HttpMessage.Request to fetch the remote state. It’s kind of gnarly.
- […] HttpMessage.Request […] If-* headers… complex
- […] Response implementations as public types. But it’s a simple class to implement, especially for 204 No Content.
So the retry occurring inside or outside of the pipeline seems to be a trade-off between which contextual information you want available.
I think this is practical but nasty/complex enough that it should only be done if it’s really important to no-op on these failures on a best-effort basis, which I guess is a decision based solely on the dev’s requirements. So of course, I want to implement it 😈
Here is an example of accessing the raw dictionary response:
Thanks @joelverhagen for the additional context!
I think there are big trade-offs either way. But if your comparison was targeted for the scenario rather than a complete entity comparison, a model comparison seems safe here.
Something else to consider is that the wire format is actually not internal, it’s just a Dictionary<string, object>. It’s the transformation to/from the wire format that is internal. If your intent is just to compare (and possibly edit) discrete properties, the dictionary format should be all that you’d need here.
But all that said, the only place you’d be able to evaluate and manipulate the request and response would be in a pipeline policy. I think a policy-based solution would be much more complex for limited additional benefit. However, if you decide this is the path you think makes the most sense for your scenario, I’d be happy to look at any implementation ideas you come up with. The easiest path would probably be to supply a custom RetryPolicy and put the logic there.
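A targeted comparison of one discrete property, such as the ChangeId = <guid> column mentioned elsewhere in this thread, could be sketched roughly as follows. It assumes the caller already knows whether the request was retried and has fetched the current entity as a dictionary. The PreconditionFailureTriage class, its method, and the "ChangeId" column name are hypothetical, and the sketch assumes the fetched value is surfaced as a Guid.

```csharp
using System;
using System.Collections.Generic;

public static class PreconditionFailureTriage
{
    // Hypothetical column that the client sets to a fresh GUID on every write attempt.
    public const string ChangeIdProperty = "ChangeId";

    // Decides whether a 412 can be treated as "my earlier attempt already succeeded".
    // requestWasRetried: true only if the client re-sent the same request at least once.
    // remoteEntity: the entity as currently stored, fetched after the 412.
    // attemptedChangeId: the GUID this client wrote into ChangeId for this update.
    public static bool WasAppliedByThisClient(
        bool requestWasRetried,
        IDictionary<string, object> remoteEntity,
        Guid attemptedChangeId)
    {
        // Without a retry, a 412 means another writer got there first.
        if (!requestWasRetried) return false;

        // A matching ChangeId means the stored state came from this client's request.
        return remoteEntity.TryGetValue(ChangeIdProperty, out var value)
            && value is Guid remoteChangeId
            && remoteChangeId == attemptedChangeId;
    }
}
```

Because the GUID is unique per attempt, a match is a much stronger signal than a value-by-value comparison, and the same check works for a single entity inside a $batch.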