dendrite: State migration fails while rewriting snapshots after upgrading to v0.4.0
Background information
- Dendrite version or git SHA: v0.4.0
- Monolith or Polylith?: Monolith
- SQLite3 or Postgres?: Postgres
- Running in Docker?: No
- go version: 1.13.8
Description
I’m not sure how to interpret this error. Clearly the state storage upgrade did not complete, despite the log saying it did.
Jul 15 22:54:01 dendrite-host dendrite-monolith-server[169231]: time="2021-07-16T03:54:01.897195913Z" level=warning msg="Rewriting snapshots 3000-3100 of 21375..." func="UpStateBlocksRefactor\n\t" file=" [2021041615092700_state_blocks_refactor.go:140]"
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: time="2021-07-16T03:54:02.667781275Z" level=warning msg="State storage upgrade complete" func="UpStateBlocksRefactor\n\t" file=" [2021041615092700_state_blocks_refactor.go:178]"
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: time="2021-07-16T03:54:02.679749124Z" level=panic msg="failed to connect to room server db" func="NewInternalAPI\n\t" file=" [roomserver.go:54]" error="RunDeltas: Failed run migration: ERROR 2021041615092700_state_blocks_refactor.go: failed to run Go migration function func(*sql.Tx) error: cannot xref null state block with snapshot 13936: sql: no rows in result set"
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: panic: (*logrus.Entry) (0x1261380,0xc00011e230)
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: goroutine 1 [running]:
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/sirupsen/logrus.(*Entry).log(0xc00011e1c0, 0x0, 0xc000198240, 0x23)
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/sirupsen/logrus@v1.8.0/entry.go:259 +0x2e2
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/sirupsen/logrus.(*Entry).Log(0xc00011e1c0, 0x0, 0xc000271b88, 0x1, 0x1)
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/sirupsen/logrus@v1.8.0/entry.go:285 +0x86
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/sirupsen/logrus.(*Entry).Logf(0xc00011e1c0, 0x0, 0x1297d7d, 0x23, 0x0, 0x0, 0x0)
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/sirupsen/logrus@v1.8.0/entry.go:330 +0xe2
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/sirupsen/logrus.(*Entry).Panicf(...)
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/sirupsen/logrus@v1.8.0/entry.go:368
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/matrix-org/dendrite/roomserver.NewInternalAPI(0xc00011c630, 0x146f860, 0xc0002a1cb0, 0xc0000c7580, 0x149ad00)
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/matrix-org/dendrite@/roomserver/roomserver.go:54 +0x237
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: main.main()
Jul 15 22:54:02 dendrite-host dendrite-monolith-server[169231]: github.com/matrix-org/dendrite@/cmd/dendrite-monolith-server/main.go:86 +0x27c
Jul 15 22:54:02 dendrite-host systemd[1]: dendrite.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Jul 15 22:54:02 dendrite-host systemd[1]: dendrite.service: Failed with result 'exit-code'.
Steps to reproduce
- Upgrade Dendrite to v0.4.0
- Dendrite attempts state migration on start
About this issue
- State: closed
- Created 3 years ago
- Comments: 21 (9 by maintainers)
Commits related to this issue
- Ensure all create events have a snapshot NID of 0: Fixes #1924 for postgres users, though the underlying cause of why they aren't 0 in the first place is unresolved. (committed to matrix-org/dendrite by kegsay 3 years ago)
- Ensure all create events have a snapshot NID of 0 (#1961): Fixes #1924 for postgres users, though the underlying cause of why they aren't 0 in the first place is unresolved. (committed to matrix-org/dendrite by kegsay 3 years ago)
- Add more logs: To help debug the migration issue in #1924 along with manual data-loss-inducing fixes. Also log the origin server on processed txns to help debug buggy server origins. (committed to matrix-org/dendrite by kegsay 3 years ago)
- Add more logs (#2005): * Add more logs. To help debug the migration issue in #1924 along with manual data-loss-inducing fixes. Also log the origin server on processed txns to help debug buggy serve... (committed to matrix-org/dendrite by kegsay 3 years ago)
- Release/v0.5.0 (#6): * Update MSC2946 implementation for stable spaces (#1859). Now that MSC1772 passed FCP its identifiers have stabilised. This outright drops support for experimental spaces but t... (committed to globekeeper/dendrite by PiotrKozimor 3 years ago)
- Upstream merge (#9): * Update MSC2946 implementation for stable spaces (#1859). Now that MSC1772 passed FCP its identifiers have stabilised. This outright drops support for experimental spaces but t... (committed to globekeeper/dendrite by PiotrKozimor 3 years ago)
Okay, so I can modify this check to not error out if there are no events. I don’t know how the database got into this state (as it implies you have snapshots which aren’t referenced by any events), but that shouldn’t be fatal. Expect a patch on Monday on master if you don’t mind experimenting with it (back up your database first!).

I’ll look into this this week and push some logs to master. I’ve been on holiday for a bit.

I have the same issue.
(And when I was trying to upgrade to v0.4.0, I had the same issue as in the original post)
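
For readers following along, the change the maintainer describes above (not erroring out when a snapshot has no referencing events) can be sketched roughly as below. This is only an illustration under stated assumptions, not Dendrite's actual migration code: `lookupFunc`, `xrefSnapshot` and `fakeLookup` are hypothetical names, and the fake lookup stands in for the real database query. The only real API relied on is `sql.ErrNoRows` together with `errors.Is` and `%w` wrapping, all available in Go 1.13.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
)

// lookupFunc is a hypothetical stand-in for the query that finds an event
// referencing a state snapshot; it is not Dendrite's real API.
type lookupFunc func(snapshotNID int64) (int64, error)

// xrefSnapshot reports which event references the snapshot. A snapshot with
// no referencing event yields found == false and a nil error, so the caller
// can skip it; any other failure is wrapped and returned.
func xrefSnapshot(lookup lookupFunc, snapshotNID int64) (eventNID int64, found bool, err error) {
	eventNID, err = lookup(snapshotNID)
	if errors.Is(err, sql.ErrNoRows) {
		return 0, false, nil
	}
	if err != nil {
		return 0, false, fmt.Errorf("cannot xref snapshot %d: %w", snapshotNID, err)
	}
	return eventNID, true, nil
}

// fakeLookup simulates the database (assumption for illustration): snapshot
// 13936 is orphaned, negative NIDs fail with an unrelated error, and every
// other snapshot maps to event 42.
func fakeLookup(snapshotNID int64) (int64, error) {
	switch {
	case snapshotNID == 13936:
		return 0, sql.ErrNoRows
	case snapshotNID < 0:
		return 0, errors.New("connection lost")
	}
	return 42, nil
}

func main() {
	for _, nid := range []int64{13935, 13936} {
		eventNID, found, err := xrefSnapshot(fakeLookup, nid)
		if err != nil {
			fmt.Println("fatal:", err)
			return
		}
		if !found {
			fmt.Printf("snapshot %d: no referencing event, skipping\n", nid)
			continue
		}
		fmt.Printf("snapshot %d: referenced by event %d\n", nid, eventNID)
	}
}
```

The point of the sketch is that only the "no rows" case is downgraded; a genuine database failure still aborts the migration, which matches the comment that orphaned snapshots "shouldn't be fatal".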
I ran the command and the output is empty: