syncstorage-rs: Seeing spanner 20k mutation limit errors for inserts in production
I have a large collection of bookmarks (3078 bookmarks and 546 folders) that has suddenly started failing to sync in production.
Sync id: 130387415.
Via bob, I got the following from the production logs:
A database error occurred: RpcFailure(RpcStatus { status: RpcStatusCode(3), details: Some("The transaction contains too many mutations. Insert and update operations count with the multiplicity of the number of columns they affect. For example, inserting values into one key column and four non-key columns count as five mutations total for the insert. Delete and delete range operations count as one mutation regardless of the number of columns affected. The total mutation count includes any changes to indexes that the transaction generates. Please reduce the number of writes, or use fewer indexes. (Maximum number: 20000)") }), status: 500 }
Here’s a link to a sample anonymized collection of this same size.
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 25 (25 by maintainers)
Commits related to this issue
- fix: correct max_total_records reduce to 1999 (19990 mutations) accounting for batch commit's extra mutations: touch_collection (+1) and batch delete (+2: for delete + BsoExpiry update) schema: don'... — committed to mozilla-services/syncstorage-rs by pjenvey 5 years ago
- fix: optimize batch commit mutations write the pending bsos immediately when committing a batch instead of queueing them in the batch_bsos, so we don't pay twice for their mutations also - optimize... — committed to mozilla-services/syncstorage-rs by pjenvey 5 years ago
- feat: add temporary sentry tags for the mutation limit issue a pile of of hacks which we'll hopefully revert shortly Issue #333 — committed to mozilla-services/syncstorage-rs by pjenvey 5 years ago
- fix: lower max_total_records per batch_commit_update costs and remove the potential, redundant touch_collection on batch commit recalculating the mutations (see adca8d67): - 4 for touch_collection:... — committed to mozilla-services/syncstorage-rs by pjenvey 5 years ago
We can at least confirm the recurring sentry events stop on stage/prod. So far so good (they’ve stopped), but let’s give prod some more time just in case.
@tublitzed let’s get #377 deployed and confirmed on Monday, then pass the new limit along to https://github.com/mozilla-mobile/firefox-ios/issues/5896
Thanks! That narrowed it down.
The final commit includes the last 99 items of the batch in the same request. The handler first writes the 99 additions, then commits all 1999, so the request writes more like 1999 + 99 items, which pushes it over the limit.
We’ll need further limit adjustments, or possibly an improvement to how final commits handle this situation.
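To make the overflow concrete, here is a rough sketch of the arithmetic. The function names are hypothetical, and it assumes each of the 99 pending items costs the same 10 mutations as a committed bsos row, which may not match the real batch table:

```rust
/// Spanner's per-transaction mutation cap, from the error message in this issue.
const SPANNER_MAX_MUTATIONS: u32 = 20_000;
/// Assumed cost of writing one item: 8 bsos columns + 2 secondary index mutations.
const MUTATIONS_PER_ITEM: u32 = 10;

/// Items written by a final commit request: the whole batch being committed,
/// plus the items posted in the commit request itself (written first as pending).
fn items_written_by_final_commit(batch_total: u32, posted_with_commit: u32) -> u32 {
    batch_total + posted_with_commit
}

/// True when writing `items` items would exceed the mutation cap.
fn exceeds_mutation_limit(items: u32) -> bool {
    items * MUTATIONS_PER_ITEM > SPANNER_MAX_MUTATIONS
}
```

Under these assumptions, `items_written_by_final_commit(1999, 99)` gives 2098 items, or 20980 mutations, which is over the cap even though 1999 items alone (19990 mutations) would fit.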
@tublitzed We made it a config value. https://github.com/mozilla-services/syncstorage-rs/pull/319/files I use a config file here because I couldn’t figure out a way to specify the sub value as an environment var. I believe that @pjenvey noted a way that it could be done.
We also enforce the hard Spanner limit (not as a config value, because it’s a hard limit): https://github.com/mozilla-services/syncstorage-rs/pull/324/files
Batch commit also deletes the batch upon completion, adding an extra mutation or two. So our max_total_records should probably be 1998.
We need a db_test ensuring this limit. It should create 2 batches, one of the max size and one of max size + 1, and check that commit returns what we expect for each.
If it’s longer running it could be disabled by default w/ #[ignore] but we have no spanner on CI to ensure it doesn’t fail over time 😦
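The shape of such a boundary test might look roughly like the sketch below. Note that `commit_batch` and `CommitError` here are hypothetical stand-ins that only model the mutation arithmetic; they are not syncstorage-rs's actual API, and a real db_test would commit against Spanner. The per-record cost (10) and commit overhead (3) are assumptions taken from the numbers discussed in this issue:

```rust
/// Spanner's per-transaction mutation cap, from the error message in this issue.
const SPANNER_MAX_MUTATIONS: u32 = 20_000;
/// Assumed cost per record: 8 bsos columns + 2 secondary index mutations.
const MUTATIONS_PER_RECORD: u32 = 10;
/// Assumed fixed commit overhead: touch_collection (+1) and batch delete (+2).
const COMMIT_OVERHEAD: u32 = 3;
/// Limit under test (the value named in the related fix commit).
const MAX_TOTAL_RECORDS: u32 = 1999;

#[derive(Debug, PartialEq)]
enum CommitError {
    TooManyMutations { needed: u32, limit: u32 },
}

/// Hypothetical stand-in for a batch commit: returns the mutation count on
/// success, or an error when the commit would exceed the cap.
fn commit_batch(record_count: u32) -> Result<u32, CommitError> {
    let needed = record_count * MUTATIONS_PER_RECORD + COMMIT_OVERHEAD;
    if needed > SPANNER_MAX_MUTATIONS {
        Err(CommitError::TooManyMutations { needed, limit: SPANNER_MAX_MUTATIONS })
    } else {
        Ok(needed)
    }
}

#[test]
#[ignore] // a real version would talk to Spanner and could be slow
fn commit_respects_max_total_records() {
    // A batch of exactly the max size commits.
    assert_eq!(commit_batch(MAX_TOTAL_RECORDS), Ok(19_993));
    // One record more is rejected.
    assert!(commit_batch(MAX_TOTAL_RECORDS + 1).is_err());
}
```

With `#[ignore]`, the test only runs when requested explicitly, for example via `cargo test -- --ignored`.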
The 20k here translates into 2000 items: each item costs 10 mutations (1 mutation per column, and the bsos table has 8 columns, plus 2 extra mutations for our secondary indexes), and 20000 / 10 = 2000.
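As a quick illustration of that arithmetic, here is a minimal sketch. The constant and function names are made up for this example, and the `overhead` parameter is an assumption for reserving mutations that other writes in the same transaction might need:

```rust
/// Spanner's per-transaction mutation cap, from the error message in this issue.
const SPANNER_MAX_MUTATIONS: u32 = 20_000;
/// Columns written per row in the bsos table.
const BSO_COLUMNS: u32 = 8;
/// Extra mutations per row for the secondary indexes.
const INDEX_MUTATIONS: u32 = 2;

/// Mutations charged for inserting one bsos row.
fn mutations_per_bso() -> u32 {
    BSO_COLUMNS + INDEX_MUTATIONS
}

/// Largest number of bsos rows that fit in one transaction after reserving
/// `overhead` mutations for other writes in the same transaction.
fn max_bsos_per_transaction(overhead: u32) -> u32 {
    SPANNER_MAX_MUTATIONS.saturating_sub(overhead) / mutations_per_bso()
}
```

With no overhead this yields 2000 items; reserving even a few mutations for bookkeeping writes drops it to 1999, since the division rounds down.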
It looks like the sync client prints the server’s limits in its log. Can you search your log for “max_total_records”? Let’s confirm your client is seeing a value of “2000”.
E.g. (from my log, probably non-spanner current sync prod):
1573781473132 Sync.Engine.Tabs TRACE new PostQueue config (after defaults): : {"max_request_bytes":2101248,"max_record_payload_bytes":2097152,"max_post_bytes":2101248,"max_post_records":100,"max_total_bytes":209715200,"max_total_records":10000}
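If scanning the log by eye is tedious, a small helper along these lines could pull the value out of a line like the one above. This is only a sketch using plain string searching from the standard library; the function name is hypothetical, and it assumes the key appears as `"max_total_records":` followed by an unsigned integer:

```rust
/// Extracts the `max_total_records` value from a sync client log line,
/// or returns `None` if the key or a numeric value is not present.
fn extract_max_total_records(line: &str) -> Option<u64> {
    let key = "\"max_total_records\":";
    // Byte offset just past the key; `?` returns None when the key is absent.
    let start = line.find(key)? + key.len();
    let rest = line[start..].trim_start();
    // Collect the leading run of ASCII digits that forms the value.
    let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
    // An empty digit run fails to parse and becomes None.
    digits.parse().ok()
}
```

For the sample line above this returns `Some(10000)`; a client talking to the Spanner-backed server would be expected to show the lower limit instead.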