dendrite: State migration fails while rewriting snapshots after upgrading to v0.4.0
Background information
- Dendrite version or git SHA: v0.4.0
- Monolith or Polylith?: Monolith
- SQLite3 or Postgres?: Postgres
- Running in Docker?: No
- go version: 1.13.8
Description
I’m not sure how to interpret this error. Clearly the state storage upgrade did not complete, despite the log saying it did.
Jul 15 22:54:01 dendrite-host dendrite-monolith-server[169231]: time="2021-07-16T03:54:01.897195913Z" level=warning msg="Rewriting snapshots 3000-3100 of 21375..." func="UpStateBlocksRefactor\n\t" file=" [2021041615092700_state_blocks_refactor.go:140]"
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: time="2021-07-16T03:54:02.667781275Z" level=warning msg="State storage upgrade complete" func="UpStateBlocksRefactor\n\t" file=" [2021041615092700_state_blocks_refactor.go:178]"
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: time="2021-07-16T03:54:02.679749124Z" level=panic msg="failed to connect to room server db" func="NewInternalAPI\n\t" file=" [roomserver.go:54]" error="RunDeltas: Failed run migration: ERROR 2021041615092700_state_blocks_refactor.go: failed to run Go migration function func(*sql.Tx) error: cannot xref null state block with snapshot 13936: sql: no rows in result set"
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: panic: (*logrus.Entry) (0x1261380,0xc00011e230)
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: goroutine 1 [running]:
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/sirupsen/logrus.(*Entry).log(0xc00011e1c0, 0x0, 0xc000198240, 0x23)
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/sirupsen/logrus@v1.8.0/entry.go:259 +0x2e2
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/sirupsen/logrus.(*Entry).Log(0xc00011e1c0, 0x0, 0xc000271b88, 0x1, 0x1)
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/sirupsen/logrus@v1.8.0/entry.go:285 +0x86
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/sirupsen/logrus.(*Entry).Logf(0xc00011e1c0, 0x0, 0x1297d7d, 0x23, 0x0, 0x0, 0x0)
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/sirupsen/logrus@v1.8.0/entry.go:330 +0xe2
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/sirupsen/logrus.(*Entry).Panicf(...)
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/sirupsen/logrus@v1.8.0/entry.go:368
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/matrix-org/dendrite/roomserver.NewInternalAPI(0xc00011c630, 0x146f860, 0xc0002a1cb0, 0xc0000c7580, 0x149ad00)
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/matrix-org/dendrite@/roomserver/roomserver.go:54 +0x237
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: main.main()
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/matrix-org/dendrite@/cmd/dendrite-monolith-server/main.go:86 +0x27c
Jul 15 22:54:02 dendrite-host systemd[1]: dendrite.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Jul 15 22:54:02 dendrite-host systemd[1]: dendrite.service: Failed with result 'exit-code'.
Steps to reproduce
- Upgrade Dendrite to v0.4.0
- Dendrite attempts state migration on start
About this issue
- State: closed
- Created 3 years ago
- Comments: 21 (9 by maintainers)
Commits related to this issue
- Ensure all create events have a snapshot NID of 0 Fixes #1924 for postgres users, though the underlying cause of why they aren't 0 in the first place is unresolved. — committed to matrix-org/dendrite by kegsay 3 years ago
- Ensure all create events have a snapshot NID of 0 (#1961) Fixes #1924 for postgres users, though the underlying cause of why they aren't 0 in the first place is unresolved. — committed to matrix-org/dendrite by kegsay 3 years ago
- Add more logs To help debug the migration issue in #1924 along with manual data-loss-inducing fixes. Also log the origin server on processed txns to help debug buggy server origins. — committed to matrix-org/dendrite by kegsay 3 years ago
- Add more logs (#2005) * Add more logs To help debug the migration issue in #1924 along with manual data-loss-inducing fixes. Also log the origin server on processed txns to help debug buggy serve... — committed to matrix-org/dendrite by kegsay 3 years ago
- Release/v0.5.0 (#6) * Update MSC2946 implementation for stable spaces (#1859) Now that MSC1772 passed FCP its identifiers have stabilised This outright drops support for experimental spaces but t... — committed to globekeeper/dendrite by PiotrKozimor 3 years ago
- Upstream merge (#9) * Update MSC2946 implementation for stable spaces (#1859) Now that MSC1772 passed FCP its identifiers have stabilised This outright drops support for experimental spaces but t... — committed to globekeeper/dendrite by PiotrKozimor 3 years ago
Okay, so I can modify this check to not error out if there are no events. I don’t know how the database got into this state (as it implies you have snapshots which aren’t referenced by any events), but that shouldn’t be fatal. Expect a patch on Monday on master if you don’t mind experimenting with it (back up your database first!).

I’ll look into this this week and push some logs to master. I’ve been on holiday for a bit.

I have the same issue. (And when I was trying to upgrade to v0.4.0, I had the same issue as in the original post)

I ran the command and the output is empty: