taskwarrior: 3.0.0 sync with GCloud suddenly stopped working - "Failed to synchronize with server"
Set up syncing with Google Cloud by following the steps outlined in `man task-sync` after upgrading from 2.6.2 -> 3.0.0. Verified that it was working and have been using this solution for syncing for a couple of weeks now across 3 devices. Have performed successful syncs on all 3 of these devices over the past couple of weeks. Everything was working as expected until today.

Attempted to run `task sync` as normal. Got an error message saying “Failed to synchronize with server”. Fair enough, I usually forget to run `gcloud auth login` before attempting a sync anyway. Successfully authenticated in the browser window that opened, using the same Google account I’ve always used. Ran `task sync` again: same generic error message, “Failed to synchronize with server”.
- I have not changed anything about my Taskwarrior config on this device or any other device since the initial sync setup
- Haven’t modified anything in Google Cloud
- I only use Google Cloud to sync with Taskwarrior and have no other storage buckets, roles, or service accounts in this project other than the ones I made by following the instructions in `man task-sync`
- Highly doubt I’ve synced anywhere near enough to hit any quota limits in my free tier of Google Cloud; my task database is quite small (115 total tasks) and I have synced less than a dozen times
- Verified that the JSON keys file is unmodified compared to when my syncing was last working, it’s still in my `task` directory as expected, and the full absolute path + file name matches the value of my `sync.gcp.credential_path` taskrc property (a sketch of the relevant config lines follows this list)
- Verified that the correct project is selected via `gcloud config get-value project`
- Verified that the storage bucket still exists; I can still see all the task data in there
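For reference, the GCP-specific part of my taskrc looks roughly like the sketch below (placeholder values, and omitting unrelated settings):

```
# GCP sync settings per man task-sync; bucket name and key filename are placeholders
sync.gcp.bucket=my-taskwarrior-sync-bucket
sync.gcp.credential_path=/home/joseph/.config/task/gcp-key.json
```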
Output of `task diag`:
```
task 3.0.0
   Platform: Linux

Compiler
    Version: 13.2.1 20230801
       Caps: +stdc +stdc_hosted +LP64 +c8 +i32 +l64 +vp64 +time_t64
 Compliance: C++17

Build Features
     Commit: 3e41fb604
      CMake: 3.29.1
    libuuid: libuuid + uuid_unparse_lower
 Build type:

Configuration
       File: /home/joseph/.config/task/taskrc (found), 1890 bytes, mode 100644
       Data: /home/joseph/.config/task (found), dir, mode 40755
    Locking: Enabled
         GC: Enabled
    $VISUAL: nvim

Hooks
     System: Enabled
   Location: /home/joseph/.config/task/hooks
     Active: on-add.990.annotate-jira-bot-links (executable)
             on-modify.990.annotate-jira-bot-links (executable)
             on-modify.timewarrior (executable)
   Inactive:

Tests
   Terminal: 276x61
       Dups: Scanned 115 tasks for duplicate UUIDs:
             No duplicates found
 Broken ref: Scanned 115 tasks for broken references:
             No broken references found
```
At a loss as to how to debug this; Taskwarrior only gives me a generic failure message when sync fails.
About this issue
- State: open
- Created 2 months ago
- Reactions: 1
- Comments: 22 (12 by maintainers)
Nice work! I set up a service account to replicate this, without the `storage.objects.delete` permission, and sure enough it works for a few syncs but then stops when it comes time to overwrite `latest`.

However, I don’t reproduce exactly the error you’ve seen – a newer version in `sync_meta` than in `latest`. Instead, I see an error which before #3411 would have just said “Failed to synchronize with server”. My bucket has two versions: faeb7f followed by fb1e11. The `latest` file contains `faeb7f..`, so it wasn’t updated. And `select * from sync_meta` gives the same, so it also wasn’t updated. So, all of this failed in an expected way that wouldn’t lead to OutOfSync.

However, if I run `task sync` again, it works?! And even more bizarre, `sync_meta` gets updated to `fb1e11..`.

Ah, I see what’s happening here! On the last (successful) run of `task sync`, the replica begins by trying to update to the latest version from the server. So, it looks for the child version of faeb7f and finds fb1e11, and applies that locally. But it shouldn’t - that version is not on the chain from `latest`!
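To illustrate the lookup I mean, here’s a toy sketch (the `v-<parent>` key naming and the types are just illustrative, not the actual taskchampion storage layout):

```rust
// Toy model of the lookup described above: each uploaded version is stored
// under a key derived from its *parent* version, and `latest` names the head
// of the chain.
use std::collections::HashMap;

struct Bucket {
    objects: HashMap<String, String>, // object name -> contents
}

impl Bucket {
    /// Find the version that was uploaded on top of `parent`, if any.
    fn child_of(&self, parent: &str) -> Option<&String> {
        self.objects.get(&format!("v-{parent}"))
    }
}

fn main() {
    let mut bucket = Bucket { objects: HashMap::new() };
    // fb1e11 was uploaded as the child of faeb7f, but `latest` never moved.
    bucket.objects.insert("v-faeb7f".into(), "fb1e11".into());
    bucket.objects.insert("latest".into(), "faeb7f".into());

    // Catching up, the replica walks children starting from its base version
    // and finds fb1e11 even though `latest` still says faeb7f, so it applies
    // a version that is not on the chain from `latest`.
    println!(
        "child of faeb7f = {:?}, latest = {:?}",
        bucket.child_of("faeb7f"),
        bucket.objects.get("latest")
    );
}
```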
So, there are two bugs here now.

Any chance you want to make a PR for the second one?
Thanks so much for your patience tracking this down. That’s three bugs found and soon to be fixed in one issue!
Damn, got the same error again. The gcloud bucket appears unchanged since the last screenshot (made sure to refresh too). I DID NOT end up trying with my other devices like I said I was going to at the end of my last comment, by the way.
It looks like it successfully uploaded a new version, and subsequent runs had nothing new to upload so they didn’t fail. Maybe try modifying a task and running `./task sync` (with the local build) again?

Huh, so the table actually does contain the new version, which means that the replica uploaded ec3c99, but somehow `latest` didn’t get updated, yet it still stored its version in the `sync_meta` table.
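(For anyone following along: I’m reading that table straight out of the taskchampion database in the data directory; I believe the file is named `taskchampion.sqlite3`, so something like the following works:)

```
sqlite3 /home/joseph/.config/task/taskchampion.sqlite3 'select * from sync_meta;'
```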
Here’s the code that handles updating `latest`: https://github.com/GothenburgBitFactory/taskwarrior/blob/ef9613e2d610d81bf3ea88eaefc51ad188707773/taskchampion/taskchampion/src/server/cloud/server.rs#L371-L386

So, if that `compare_and_swap` operation fails, it deletes the object from the bucket and returns `AddVersionResult::ExpectedParentVersion` with the value from `latest`.

Here’s the code that calls `add_version`: https://github.com/GothenburgBitFactory/taskwarrior/blob/ef9613e2d610d81bf3ea88eaefc51ad188707773/taskchampion/taskchampion/src/taskdb/sync.rs#L82-L113

That updates the `base_version` in `sync_meta` if it gets `AddVersionResult::Ok`, or tries again if it gets `AddVersionResult::ExpectedParentVersion`.
So presumably there was an attempt to upload new version ec3c99 which
- did not update `latest`, but
- did update the `base_version` in `sync_meta`.
I think what that means is that `compare_and_swap` failed to update `latest` but didn’t return false. That’s implemented here: https://github.com/GothenburgBitFactory/taskwarrior/blob/ef9613e2d610d81bf3ea88eaefc51ad188707773/taskchampion/taskchampion/src/server/cloud/gcp.rs#L105-L167

Looking at all the places `Ok(..)` occurs there (the return values), those are all `false` before the call to `upload_object`, and then `false` if that failed with a 412 error. The `upload_res?` statement there should handle any other error as an actual error, propagated back and causing `task sync` to fail. The implementation of `is_http_error` suggests that even the 412 is represented as a `Result::Err`, so I think that’s a valid assumption.

So, I’m stumped – what have I missed?
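For reference, the compare-and-swap pattern described above looks roughly like this (an illustrative sketch with a made-up `Bucket` trait, not the actual gcp.rs code or the google-cloud-storage crate API):

```rust
// Rough sketch of the compare-and-swap described above: read the current
// object and its generation, then upload the replacement with a generation
// precondition so the write fails with HTTP 412 if someone else changed it.
// The `Bucket` trait and error type are illustrative, not the real API.

struct HttpError {
    status: u16,
}

trait Bucket {
    /// Return the object's bytes and its current generation number.
    fn get_object(&self, name: &str) -> Result<(Vec<u8>, i64), HttpError>;
    /// Upload `data`, failing with 412 if the object's generation has changed.
    fn upload_if_generation_match(
        &self,
        name: &str,
        data: &[u8],
        generation: i64,
    ) -> Result<(), HttpError>;
}

/// Replace `name` with `new` only if it still contains `expected`.
/// Ok(false) means the swap legitimately failed (value mismatch or lost
/// race); any other failure propagates as an error.
fn compare_and_swap(
    bucket: &dyn Bucket,
    name: &str,
    expected: &[u8],
    new: &[u8],
) -> Result<bool, HttpError> {
    let (current, generation) = bucket.get_object(name)?;
    if current != expected {
        // The object already holds some other value: report a failed swap.
        return Ok(false);
    }
    let upload_res = bucket.upload_if_generation_match(name, new, generation);
    match upload_res {
        Ok(()) => Ok(true),
        // 412 Precondition Failed: a concurrent writer won the race.
        Err(e) if e.status == 412 => Ok(false),
        // Anything else (403, network failure, ...) is a real error and
        // should bubble up so `task sync` fails loudly.
        Err(e) => Err(e),
    }
}
```

The key property of the pattern is that a lost race comes back as `Ok(false)`, while anything else should propagate as an error.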
Thanks for your patience, @jcoffa. The good news is, the fix here is pretty easy. Just put `ec3c99a195cd4ab09f950c61cf249807` (with no newline!) in the `latest` file in your bucket and things should start working again.

Are you able to recompile from source? If so, I can make a nice patch to hopefully get better debugging info. I filed #3411 to track the error message problem.
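If it helps, one way to write that object without a trailing newline is with gsutil (the bucket name below is a placeholder):

```
printf 'ec3c99a195cd4ab09f950c61cf249807' | gsutil cp - gs://YOUR-BUCKET/latest
```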