element-web: Stuck notifications
We are experiencing multiple issues in the area of “stuck” notifications and unread markers. Many are related to the way receipts interact with threaded messages.
Symptoms
We are seeing different symptoms of the problem:
- “Stuck” unread dots and notification counters: Even when I have read all messages in a room, the room is still marked as unread or having notifications.
- Unread dots or notification counters returning on app startup: When I restart the app (or refresh the page on Web) rooms which I have already read re-appear as unread.
- Notification counters growing and shrinking spontaneously after entering or scrolling in a room.
Spec-level causes
Message ordering
Fundamentally, in order to interpret the meaning of a receipt that says “I have read everything up to here”, we need to know what order messages are in. This is not clear in the spec, and we propose to make it clear and explicit in MSC4033.
In the meantime, Element Web uses a combination of “sync” order (the implicit order of events arriving via a /sync request) and “timestamp” order (using the ts property within events).
Some of the existing bugs are probably caused by this inconsistency, but it is not clear yet how many: we believe there are also bugs in the implementation that cause additional problems, and this theoretical inconsistency is only the cause of a few problems.
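To illustrate why the choice of order matters, here is a minimal TypeScript sketch. It is not the actual Element Web or matrix-js-sdk code: the `TimelineEvent` shape and both function names are hypothetical, and the only assumption is that each event carries a timestamp and arrives at some position via /sync. It shows one receipt giving two different answers for the same event.

```typescript
// Hypothetical minimal event shape: only the fields this sketch needs.
interface TimelineEvent {
    eventId: string;
    ts: number; // the ts property claimed by the event
}

// "Sync" order: an event counts as read if it arrived at or before the receipt's event.
function isReadBySyncOrder(timeline: TimelineEvent[], receiptEventId: string, eventId: string): boolean {
    const receiptIndex = timeline.findIndex((e) => e.eventId === receiptEventId);
    const eventIndex = timeline.findIndex((e) => e.eventId === eventId);
    if (receiptIndex === -1 || eventIndex === -1) return false; // unknown event: cannot decide
    return eventIndex <= receiptIndex;
}

// "Timestamp" order: an event counts as read if its ts is not later than the receipt's event.
function isReadByTimestamp(timeline: TimelineEvent[], receiptEventId: string, eventId: string): boolean {
    const receiptEvent = timeline.find((e) => e.eventId === receiptEventId);
    const event = timeline.find((e) => e.eventId === eventId);
    if (!receiptEvent || !event) return false; // unknown event: cannot decide
    return event.ts <= receiptEvent.ts;
}

// Events in the order they arrived via /sync. "$b" carries a later timestamp
// than "$c" (for example because of clock skew on the sending server).
const exampleTimeline: TimelineEvent[] = [
    { eventId: "$a", ts: 1000 },
    { eventId: "$b", ts: 3000 },
    { eventId: "$c", ts: 2000 },
];

// A receipt pointing at "$c": is "$b" read?
console.log(isReadBySyncOrder(exampleTimeline, "$c", "$b")); // true
console.log(isReadByTimestamp(exampleTimeline, "$c", "$b")); // false
```

Any code path that mixes the two orders can therefore flip an event between read and unread without anything new having happened in the room.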
Which thread the root belongs to
The spec has what we consider a bug when it talks about which thread the root message belongs to, which has been reflected in client code, making it inconsistent with the server implementation (at least on the Synapse server). We have a proposal to fix this bug in MSC4037.
Identifying which thread any message is in
It is sometimes difficult for clients to identify which thread an event belongs to, meaning that a receipt pointing to it is sometimes ignored. We have begun drafting MSC4023 to address this.
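As a rough illustration of the two thread-related points above, the following TypeScript sketch works out which thread an event belongs to from its relation. It is a simplified assumption, not the matrix-js-sdk implementation: `MatrixEventLike`, `threadIdOf` and the `known` map are hypothetical names, and the rule that the thread root lives in the main timeline follows what MSC4037 proposes. The `m.relates_to`, `rel_type` and `m.thread` fields are the ones used by Matrix relations.

```typescript
const MAIN_TIMELINE = "main";

interface RelatesTo {
    rel_type?: string;
    event_id?: string;
}

// Hypothetical minimal event shape: only the fields this sketch needs.
interface MatrixEventLike {
    event_id: string;
    content: { "m.relates_to"?: RelatesTo };
}

// Returns the thread ID ("main" or the root's event ID), or undefined when
// the answer depends on a parent event that we have not seen.
function threadIdOf(
    event: MatrixEventLike,
    known: Map<string, MatrixEventLike>,
    depth: number = 0,
): string | undefined {
    const rel = event.content["m.relates_to"];
    // No relation type (including thread roots themselves): main timeline.
    if (!rel || !rel.rel_type || !rel.event_id) return MAIN_TIMELINE;
    // A threaded message names its root directly.
    if (rel.rel_type === "m.thread") return rel.event_id;
    // Any other relation (reaction, edit, ...) follows its parent's thread.
    const parent = known.get(rel.event_id);
    if (!parent) return undefined; // parent unknown: a receipt here cannot be placed
    if (depth > 10) return undefined; // guard against malformed relation loops
    return threadIdOf(parent, known, depth + 1);
}
```

The `undefined` result is the awkward case: a receipt pointing at such an event cannot be assigned to a thread until the parent is known.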
Other
Previously, we believed that MSC3981 (recursive relations) would solve some of the problems, but since that MSC does not solve the event-ordering problem (because the events from the /relations API are returned in “topological” order) we no longer believe it is important, except as a performance optimisation.
Code-level causes
We have found and fixed several bugs in the Element Web code that were caused by an incomplete understanding of the meaning of threaded and unthreaded read receipts. We anticipate that some more exist.
(We believe that the primary reason why we’re not seeing the same problems on mobile is that the apps persist events they’ve received, whereas Element Web has to re-fetch from scratch after every launch. As a result, any issue in the unread state logic strikes again and again. The apps also use a single timeline, whereas Element Web maintains one timeline per thread in addition to the main timeline in every room.)
High-level plan of actions
### Tasks
- [x] Set up POC branch / deployment with threads disabled – https://github.com/vector-im/element-web/issues/25676
- [x] Fix read-receipt behavior around thread roots – https://github.com/matrix-org/matrix-js-sdk/pull/3600
- [x] Fix read-receipt behavior around non-thread relations to thread roots – https://github.com/matrix-org/matrix-js-sdk/pull/3607
- [x] Fix missing message issues due to replies to unknown events – https://github.com/matrix-org/matrix-js-sdk/pull/3615
- [x] Fix unread count returning from zero after reload – https://github.com/vector-im/element-web/issues/25806
- [x] Fix notification counters sometimes being doubled – https://github.com/vector-im/element-web/issues/25803
- [ ] Fix remaining bugs found in no-threads POC as they likely impact the threaded experience as well – https://github.com/vector-im/element-web/issues/25676
- [x] Set up unread & notification test suite – https://github.com/vector-im/element-web/issues/25449
- [ ] Pre-investigate the issues listed after the threads model refactoring to see if any of them are fixable beforehand
- [ ] Refactor threads model code and remove serverless / legacy code path to simplify further changes
- [ ] Fix missing notifications – https://github.com/vector-im/element-web/issues/25621
- [ ] Fix incorrect thread reply count – https://github.com/vector-im/element-web/issues/24636
- [ ] Fix notification counters not adding up between spaces and rooms – https://github.com/vector-im/element-web/issues/20372
- [ ] Fix unread counter explosion – https://github.com/vector-im/element-web/issues/25479
- [ ] Fix Zombie notifications from old threads – ? issue ?
- [ ] Fix read receipts / fully read marker jumping backwards – ? issue ?
- [ ] Triage existing issues to increase confidence that the root causes will actually be fixed through the MSCs mentioned above.
We believe a lot of progress can still be made without spec changes. So we’re slightly deprioritising work on the MSCs.
- https://github.com/matrix-org/matrix-spec-proposals/pull/4037 We claim it is implemented in EW. Synapse will need small fixes when this is merged.
- https://github.com/matrix-org/matrix-spec-proposals/pull/4033 Needs implementation on server and client but we believe it should mostly only cover edge cases
- https://github.com/matrix-org/matrix-spec-proposals/pull/4023 There is disagreement about the way to specify this and possible interference with https://github.com/matrix-org/matrix-spec-proposals/pull/3051
- https://github.com/matrix-org/matrix-spec-proposals/pull/3389 There is likely no good workaround other than implementing this MSC
New issue inbox
The following is a holding area for newly reported issues that require review. Once reviewed, issues should either be moved to one of the other task lists below or, if not applicable, removed from this epic.
### Tasks
- [ ] https://github.com/vector-im/element-web/issues/25331
- [ ] https://github.com/vector-im/element-web/issues/25420
- [ ] https://github.com/vector-im/element-web/issues/25481
- [ ] https://github.com/vector-im/element-web/issues/25480
- [ ] https://github.com/vector-im/element-web/issues/25479
- [ ] https://github.com/matrix-org/element-web-rageshakes/issues/21717
- [ ] https://github.com/matrix-org/element-web-rageshakes/issues/21724
- [ ] https://github.com/vector-im/element-web/issues/25513
- [ ] https://github.com/matrix-org/element-web-rageshakes/issues/21737
- [ ] https://github.com/vector-im/element-web/issues/25482
- [ ] https://github.com/vector-im/element-web/issues/25528
- [ ] https://github.com/vector-im/element-web/issues/25541
- [ ] https://github.com/vector-im/element-web/issues/25621
- [ ] https://github.com/vector-im/element-web/issues/24547
- [ ] https://github.com/vector-im/element-web/issues/25623
- [ ] https://github.com/vector-im/element-web/issues/25642
- [ ] https://github.com/vector-im/element-web/issues/25670
- [ ] https://github.com/vector-im/element-web/issues/25676
- [ ] https://github.com/vector-im/element-web/issues/23976
- [ ] https://github.com/vector-im/element-web/issues/25804
- [ ] https://github.com/vector-im/element-web/issues/25806
- [ ] https://github.com/vector-im/element-web/issues/24111
- [ ] https://github.com/vector-im/element-web/issues/25408
- [ ] https://github.com/vector-im/element-web/issues/25907
- [ ] https://github.com/vector-im/element-web/issues/25904
- [ ] https://github.com/vector-im/element-web/issues/25929
- [ ] https://github.com/matrix-org/element-web-rageshakes/issues/22226
- [ ] https://github.com/matrix-org/element-web-rageshakes/issues/22363
- [ ] https://github.com/matrix-org/element-web-rageshakes/issues/22414
- [ ] https://github.com/vector-im/element-web/issues/25975
- [ ] https://github.com/vector-im/element-web/issues/25984
- [ ] https://github.com/vector-im/element-web/issues/25950
- [ ] https://github.com/vector-im/element-web/issues/26063
- [ ] https://github.com/matrix-org/element-web-rageshakes/issues/22526
- [ ] https://github.com/matrix-org/matrix-js-sdk/issues/3665
- [ ] https://github.com/element-hq/element-web/issues/26933
Tasks not blocked by spec work
### Tasks
- [x] https://github.com/vector-im/element-web/issues/24388
- [x] https://github.com/vector-im/element-web/issues/24000
- [x] https://github.com/vector-im/element-web/issues/23991
- [x] https://github.com/vector-im/element-web/issues/23685
- [ ] https://github.com/matrix-org/element-web-rageshakes/issues/21667
- [ ] https://github.com/vector-im/element-web/issues/24629
- [ ] https://github.com/vector-im/element-web/issues/25207
- [ ] https://github.com/vector-im/element-web/issues/25196
- [ ] https://github.com/vector-im/element-web/issues/25212
- [ ] https://github.com/vector-im/element-web/issues/10954
- [ ] https://github.com/matrix-org/element-web-rageshakes/issues/21641
- [ ] https://github.com/vector-im/element-web/issues/25411
- [ ] https://github.com/vector-im/element-web/issues/25450
- [ ] https://github.com/vector-im/element-web/issues/25449
- [ ] https://github.com/vector-im/element-web/issues/25676
- [ ] https://github.com/vector-im/element-web/issues/25596
Tasks that are related to or dependent on spec work
We’ve written the following MSCs to try and address the root causes in a reliable and performant way:
- https://github.com/matrix-org/matrix-spec-proposals/pull/4033 A client-side workaround for this based on timestamp ordering has been implemented. This is imperfect and easily abused though.
- https://github.com/matrix-org/matrix-spec-proposals/pull/4023 A client-side workaround for this based on calling /event to fetch the parent has been implemented. This is functionally correct but has a noticeable performance impact.
- https://github.com/matrix-org/matrix-spec-proposals/pull/4037 The client-side code has already been modeled to reflect the behavior proposed in the MSC (which is also what Synapse already does today).
- https://github.com/matrix-org/matrix-spec-proposals/pull/3981 We expect this to not help with the ordering problems but it will be a performance improvement.
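To make the cost of the /event workaround concrete, here is a hedged TypeScript sketch of a parent lookup that at least avoids asking for the same parent twice. This is an illustration only, not how matrix-js-sdk does it: `ParentEventCache` and the injected `FetchEvent` function are hypothetical, and the injected function stands in for whatever actually performs the /event request.

```typescript
// Hypothetical fetcher: resolves with the raw event for (roomId, eventId).
type FetchEvent = (roomId: string, eventId: string) => Promise<unknown>;

class ParentEventCache {
    private readonly fetchEvent: FetchEvent;
    // One entry per parent, holding the pending or completed request.
    private readonly requests = new Map<string, Promise<unknown>>();

    public constructor(fetchEvent: FetchEvent) {
        this.fetchEvent = fetchEvent;
    }

    public get(roomId: string, eventId: string): Promise<unknown> {
        const key = `${roomId}|${eventId}`;
        let request = this.requests.get(key);
        if (!request) {
            // First time we need this parent: one round trip to the server.
            request = this.fetchEvent(roomId, eventId).catch((error) => {
                // Forget failed requests so that a later call can retry.
                this.requests.delete(key);
                throw error;
            });
            this.requests.set(key, request);
        }
        return request;
    }
}
```

Even with this kind of de-duplication, every previously unseen parent still costs one request, which is why the workaround remains noticeable in rooms with many relations to old events.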
### Tasks
- [ ] https://github.com/vector-im/element-meta/issues/1350
- [ ] https://github.com/matrix-org/synapse/issues/15377
- [ ] https://github.com/vector-im/element-web/issues/25021
- [ ] https://github.com/vector-im/element-web/issues/24312
- [ ] https://github.com/vector-im/element-web/issues/24394
- [ ] https://github.com/vector-im/element-web/issues/24595
- [ ] https://github.com/vector-im/element-web/issues/24442
- [ ] https://github.com/vector-im/element-web/issues/25408
- [ ] https://github.com/matrix-org/element-web-rageshakes/issues/21679
- [ ] https://github.com/vector-im/element-web/issues/25482
- [ ] https://github.com/vector-im/element-meta/issues/1714
- [ ] https://github.com/matrix-org/synapse/issues/15701
- [ ] https://github.com/vector-im/element-web/issues/25395
- [ ] https://github.com/vector-im/element-web/issues/25893
Issues that are related but out of scope
- matrix-org/matrix-js-sdk#3325
Time sheeting
WEB: Stuck notifications
About this issue
- State: open
- Created a year ago
- Reactions: 90
- Comments: 148 (67 by maintainers)
Update
About our stuck notification test cases:
As you can see, there is a large drop in our failing tests 🎉. There are multiple reasons behind this drop:
develop. We were able to enable a bunch of our skipped/failing tests thanks to this rewrite. We also noticed a large drop in notification/unread issues on develop in our daily usage.
We are aware that there are still issues and we have to write more tests, but this result is already a big step forward!
Late-arriving update (might please @samim23 😉)
We discussed this a little more, and we are quite confident in the fix that went in recently - we think it has made the situation better, and we think it has low risk, so we backported it to our next release candidate, and if all goes to plan it will be included in the release on Tuesday 18th July.
Update for today:
Just for fun, if you’re interested in why this is a difficult problem to fix, you might be interested in our proposed set of test cases that we intend to write to make sure it works as expected. At the time of writing, we have identified 146 cases we need to test, and we are not at all convinced we’ve listed them all. Hopefully this also encourages you that when we have all those test cases, the behaviour of the app in this area will be considerably more stable.
I confess I’ve delayed this week’s update until I could include this graph:
… which shows some of the effects of a lot of work that has been going on behind the scenes. The little drop in the red and green lines on the right-hand side represents a reduction in the number of known stuck notification bugs in the development version of the code. Most of the drop so far comes from a PR ignoring invalid receipts.
We fully expect this to reduce the number of real-world problems people are experiencing. Note: reduce the number, not eliminate (yet).
We also had a slight false-start this week, where we had to revert changes to move deleted messages to the main timeline because they caused a bug we fixed very late in the release cycle. However, we now have a proper fix for this problem, and we expect to “unrevert” the revert just after the next release candidate. (The timing here allows us time to test this change, which is disruptive, on the bleeding-edge deployment at develop.element.io for as long as possible before releasing.)
Meanwhile, we have found some limitations in the way receipts are stored in matrix-js-sdk, which explain some of the other problems we are seeing, especially with “Mark room as read” not working as expected. We are re-writing the receipt-handling code to reflect our new understanding, which won’t be super-fast, but we are hopeful that it will make a big difference.
I know this process is painfully slow when viewed from the outside, but rest assured we are working hard on it, and we are working in an organised and systematic way, using the tests as the source of truth. This means that when we take one step forward, we are less likely to take two steps back and cause new bugs with our fixes. For an explanation of how complex some of the logic we are working through here is, check out my recent Matrix Live video.
Folks, this is a complicated issue spanning multiple months of work and many PRs already, it’s escalated as far as can be. Piling more pressure on top or “how hard can it be, did you try simply doing X?” is unlikely to move things along faster at this point.
Let’s please try and keep this issue focused on progress reports from the team to make it easy for everyone to follow along.
Progress on writing test cases: the number of implemented cases rose from 46 to 70, with the total expected number of cases now at 155.
I do expect the number of cases to double at some point when we allow tests to be run in either encrypted or unencrypted mode.
A breakthrough this week: one of the tests uncovered a remaining stuck unread case: “Reading an unread thread after a redaction of the latest message makes it read” so when we make that test pass we know we will be removing real stuck unreads that people experience.
Thanks everyone for bearing with us. We are aware of the problems and, in fact, we do suffer ourselves as we’re all using Element for work. We’re working on resolving this situation with the resources currently available to us. I acknowledge that we haven’t always been good at communicating status externally. We will try to improve this going forward. I’ve updated the issue description above with a summary of the currently known problems and a high-level action plan for our next steps.
I’m going to risk repeating myself but I just want to show my gratitude for staying on top of this monstrously complex problem. It’s been a long slog and starting to look like you’re close to resolving it. Well done.
Update: I have found some incorrect assumptions in the code that stores and retrieves receipts inside matrix-js-sdk, so I am working on a change that will restructure this. I think it can explain some of the incorrect answers about what is read that we have been seeing.
We are continuing work in this area; the release that went out today carries yet more fixes for stuck notifications, and yet more are awaiting review for “zombie notifications” (a phenomenon where notifications you have cleared come back on reloading the app). We’re now focusing on creating a swathe of tests to aid in a major refactor of the threads data model, which we believe will help with the stuck notifications project but also help fix some threads bugs.
I’d also like to say again how sorry we are for the pain this is causing to lots of people. We are all heavy users of Element Web, and have been feeling the pain too. We are determined to fix this problem properly, not patch over it with kinda-working hacks.
I have some rooms that always show false unread notifications every time I restart Element. I have one room in particular that only has three members but frequently shows incorrect unreads, even without restarting Element. That room probably has the most threads of any of my rooms, but it is showing unread notifications even when no new item is added to a thread. Sometimes “mark all as read” will clear it, sometimes not.
Can someone give an update on what the expected state of affairs is on this? Should we expect things to be fixed? Or is there still ongoing work? Empirically, my element remains absolutely cluttered with stuck notifications.
Since this is a meta-issue on a problem that’s been ongoing for many months, it would be nice to get some updates on this every once in a while.
Update:
Thanks for your response Andy. I understand it’s a tough issue and we all very much appreciate your work on this 👍
I understand where you’re coming from, but this is actually pretty difficult from an outside perspective. There are lots of issues related to this problem, some of them very general, some of them with large discussions, some with just technical jargon focused on fixing the technical issue at hand, and so on. Extracting a broader overview on the state of the problem from that data is not easy to do.
I regularly find myself in a situation where I’m acting as an ambassador of sorts for the ecosystem (as I’m sure many other admins of their local communities do). This particular issue has been a real hotspot on that front, which is why it would be tremendously helpful, in order to give confident answers to users, if there was an update on this high-level issue once in a while.
Hopefully this doesn’t come off as too pushy. We all care about Matrix and want it to succeed ✌️
We are continuing our work on this issue. We are investigating some open issues, and trying to close bugs that we think are already fixed. We have discovered some problems with our handling of unthreaded receipts (which are created when you choose “Mark Room As Read”) and we’re working out how to tackle them.
We are continuing work on our comprehensive test suite which will give us confidence that we have fixed most of the problems when it is passing.
We are working on MSC4033 because we think some of these issues are difficult or impossible to solve without it.
Commenting here as requested, although not sure if this is useful for fixing the issue. Element version: 1.11.52 (both web and desktop).
Note: This is without logging out or back in - just daily usage. There is no initial sync.
I have about 150 rooms. Maybe 20-30 of them are marked unread for no reason on launch and maybe 1-3 get stuck unread forever.
Hi, I’m really glad they have been helpful. Yes, the team (not sure who yet!) will continue reporting progress here.
Since this issue keeps on lingering and seems hard to fix, would it be possible to at least make “mark as read” (in the channel context menu) work as intended? For many months now, I’ve constantly been spammed by these random notifications. Having one reliable button to say “I’ve read everything!” would be really awesome as a quick fix.
77 test cases implemented out of 155, after a slow week with security releases and day-to-day maintenance duties interfering. I’m hoping to make faster progress this week.
We are seeing things improve for quite a few users, but we know some problems remain, specifically in cases where message timestamps disagree with the order of messages as the server sees it, and when dismissing old threads by using Mark Room as Read.
We are continuing to work on these problems. We are writing tests that cover all the problems we have seen. Some of these currently fail, which helps us keep track of what situations still don’t work correctly.
The next step after writing the tests will be to work on a change to the spec: MSC4033. This will remove the existing ambiguities about what order events are in, making the job of deciding which events are read much simpler. We will prototype this change, and if it solves the problems as we expect (the tests will help to know), we will continue to push for it to be included in the Matrix spec and implemented in homeservers.
If anyone reading continues to see problems, and can reproduce them consistently, bugs that describe clearly how to trigger a problem are still appreciated. For those who are able, what would be even better would be a new test within the test file that fails because it triggers the problem.
Thank you again for your patience. I think we have a good plan for how to squash these problems and never see them come back!
Another data point for you: The issue with stuck notifications was significantly better about 1 month ago. For about the past 2 weeks, things feel like they have gotten worse again. Today even the trick of choosing “mark all as read” on a channel stopped working for me; the unread notifications simply stay there.
Fun sidenote: One of our channels had a confetti emoji (which triggers this big onscreen animation) somewhere in the distant past. For some reason, this frequently gets flagged as unread and re-triggered, leading to a somewhat confusing confetti party.
Following this thread, I am aware how complex and far-reaching this set of related bugs is, and I salute all the developers’ ongoing massive efforts to get this resolved. I’m rooting for you all!
113 tests implemented out of 157:
I have some maintenance tasks I need to work on this week, and after that I will start on looking at fixing some of those failures, which should (eventually) result in some of the remaining problems people are seeing being fixed.
Hi @Valodim, thanks for the message - I can see lots of people are still affected by this. You can follow our progress somewhat by looking at the tasks closed in the task lists above, but the situation is:
Rest assured that we are still aware this is an ongoing problem, and we continue to work on it. The reason it’s been so long is that our early work uncovered tons of new issues that were masked by the fact that we had a bug incorrectly marking threads as read when they were not.
I have indeed started seeing the issue of stuck notifications way more frequently in the past few days.
We are continuing with the work of filling out the test suite, although I’ve mostly been fighting vagaries of the CI setup this week 😃. At some point soonish we will probably double the size of the suite by allowing tests to run in both encrypted and unencrypted rooms, since we expect different behaviour in quite a few cases (because the server can’t examine encrypted messages to give an accurate unread count).
I’ve been testing develop.element.io and I can say that it has resolved a large proportion of the “stuck room notifications” issues. Well done!
Just wanted to say thanks to the team (in particular @andybalaam) for communicating like a champ and keeping us in the loop as of late. Can’t speak for others, but it makes a massive difference to me and is rebuilding my faith in the project.
Absolutely. I can’t wait for the day when we can say “it’s fixed!” Until then it’s a bit of a whack-a-mole situation: when we fix something we try to be sure that we are definitely improving the behaviour, and we try to be super-sure we’re not breaking any other fixes, but we don’t usually know which symptoms will go away after a specific fix, so it’s tricky to give proper updates, but we’ll try to do better in future, by posting updates here.
Thank you for your work! I am personally really aware of how painful this has been, and that can’t have made it easy for you as an advocate.
Not at all. Thank you for the feedback.
I’d expect the initial PRs (https://github.com/matrix-org/synapse/pull/15315 & https://github.com/matrix-org/matrix-js-sdk/pull/3248) to land this week.
Has progress stalled? I’ve had the same stuck unread notifications for several months now, both on every production release of Element as well as running bleeding edge from develop.element.io. It does seem to be the same rooms/old messages that consistently show as unread. Is there anything I can do as an end-user to troubleshoot the circumstances, particularly where the issue is reproducible at least for some fraction of the time?
Also, opening chats for the first time after restarting element, when the messages are loaded, marks some of the loaded ancient messages from threads as unread.
After each time I restart Element, I have to go over 10-15 chats, to mark everything as read. A lot of times, the chats are half a year old.
I feel like it’s connected to the fact that messages are being loaded when I click on the conversation, not when I open Element, but that’s just a guess.
106 tests implemented out of 156. We are moving our focus on to figuring out why some of these tests sometimes fail randomly - I suspect some of them are actually real problems in the app. Once the tests are stable, we will work on fixing some of the problems we have already found, before continuing with the test suite.
Sending all the devs working on resolving this issue lots of good luck and fortitude! We run our startup’s chat infra on matrix/element and this bug has almost led to a team mutiny. Hope it gets resolved soon!
I hope this gets fixed soon. 😦
My stuck notifications have been at around 20+ for the past few weeks, and they pop up every time my computer reboots.
Update: we will unrevert the fix for moving deleted events to the main thread today, so it will be in the release in 3 weeks. We continue to work on the rewritten receipt storage and expect to have the first change ready this week. It will then need more time to be completed, but we should see further improvements to the experience of stuck unreads next week, if we make the progress we are hoping for.
Update for this week - the redaction-related fixes are merged and released. I am working on fixes that affect editing messages.
This week we have been mostly fighting flaky tests. Many of the flakes are caused by unreliable behaviour by Element when messages are edited or redacted, so may well be linked to real stuck unread cases.
Our next step is to pick some test flakes or failures and fix them, hopefully giving real improvements for those of us experiencing stuck unreads.
Running Element 1.11.36 & Synapse 1.88.0 for 3 days now and so far all the “stuck” notification issues seem to be gone, at least for me. I still get occasional unread badges in rooms even though there are no new messages, but they are no longer stuck or climbing into the hundreds.
Even though my own experience with stuck notifications is terrible, let’s not rush the devs to make it two weeks earlier (we’ve been experiencing this for how long?). We know what happens with releasing stuff too early.
@bhearsum I hear your pain, and I am experiencing it too, but I think we shouldn’t give up on read states working properly. It’s a brilliant feature that if you read a message on your PC it is also marked as read on your phone, so we’re working hard to fix the bugs so that this stuff works properly.
Same here, and as most people who have been closely following the issues related to threads these last few months (since they have been out of beta and sometimes even before), I want to continue advocating for Matrix/Element and support all the good people working on it.
Considering that these issues are various, complex, unpredictable and should take a lot of time and effort until threads work “as expected” (which is in itself quite a challenge, given how user expectations can be subjective!), wouldn’t it be possible to roll them back (disable Threads by default and/or make them a Labs feature again)? I’m thinking it would also release at least some of the pressure to take the time needed to work on this with proper conditions.
Really appreciate all the effort the devs have put into this issue. As intensive matrix/element users, our entire team has been along for the stuck notifications journey. It felt like it was getting better a while back, but recently the number of stuck notifications seems to have skyrocketed again and the behavior is very erratic, which makes it hard to file a proper bug report here. One comedy aspect of it: If you had “confetti” anywhere in an old thread, it tends to get re-triggered regularly now, resulting in a spontaneous on-screen confetti party.
I noticed that when “Mark as read” doesn’t work, reacting to a message clears the notification. So when this happens to me (and it still does happen very often) I click on an emoji reaction and remove it right away.
I’m impressed with this thorough engineering approach you’ve taken to solving the problem.
My issues I had are also resolved with this release, thanks a lot for all your effort!
I appreciate all your hard work to get this resolved, thank you so much! I do understand things take time (“change will appear in the release on 1st August”) but frankly still wish it was solved sooner in production (my entire startup team using element is complaining to me daily about it.)
Speaking of which, a change which merged yesterday morning from @t3chguy has made my experience much better today. If you try the latest nightly build or develop.element.io* you may see a decent improvement **.
*: nightly builds and develop are both bleeding-edge and not guaranteed to work at all! If you’d prefer to wait, this change will appear in the release on 1st August.
**: but I still see some more minor problems, so we are definitely not at the end of the road yet.
This change also sparked another spec change proposal: https://github.com/matrix-org/matrix-spec-proposals/pull/4037
For everybody subscribed to this ticket: the backend and frontend PRs have been merged now so I expect a rollout quite soon.
(I’m not an Element/Matrix employee, just an interested user.)
Even this one chat that I couldn’t mark as read for ~ 1 year doesn’t have a notification anymore.
Agree with this. Recent releases of iOS Element have “mark all as read” which I understand to mean mark all messages in all rooms as read, but that doesn’t work reliably either.
I have several rooms with threads that refuse to be marked as read. One thing I noticed with this is that the last message in each thread has been edited.
Just to point out that Element Desktop for Mac (and probably its other siblings) version 1.11.51 behaves way better with respect to notifications, so your effort was not in vain.
@andybalaam your recurrent reports on how things are moving are hugely appreciated. Will the team keep up those reports when you move on?
Correction - the redaction-related fixes are merged and included in today’s release candidate, which will be released next week.
I hope this will be fixed soon. Since this is the thread where all (new) issues are linked to, I will add some observations. I don’t know, if all these are covered by test cases yet.
Notifications are stuck if:
Sometimes notifications reappear in old threads.
“Mark room read” does not solve the problem: sometimes it does not work even for the moment, and when it does, the notifications return after reloading.
Recently it’s not only the “bold” state that appears, but also “red” with an apparently random number, often quite large.
I cannot currently tell whether only encrypted rooms are affected, but that may be the case.
(Synapse 1.92.3 with Element 1.11.42 via Docker, as well as Synapse 1.93.0 with Element 1.11.44 via Docker)
Here’s a glimpse on where we’re at with completing the test suite:
The topmost task list in the issue description has been updated with our latest plan of attack. Following on from the impactful fixes we’ve made in the past two weeks, we believe there are more code-level problems we can solve and, thus, are slightly deprioritizing pushing through on the MSCs.
I second this, the situation has much improved! There are occasional weird badges for sure, but that is manageable. Thanks to all the devs for the hard work on this!
Maybe this is a crazy idea, but given the gravity of the situation here (folks are switching away), how about reverting the fix for the issue that exposed all of this? It seemed to work well enough before.
Keep it in a branch, and then get all of the follow-up issues working in there before merging. What do you all think?
@ExplodingWaffle we can’t currently reliably mark threads as read due to the technical issues explained in the issue description. While I empathize with some of your thoughts, this is also not the right place to discuss thread-related product changes. We’re purely concerned with making notifications not stick after you’ve read them here.
Sigh, I thought the latest stable update’s https://github.com/matrix-org/matrix-react-sdk/pull/10730 would somewhat solve the stuck notification issue until this fix is released. It didn’t.
No matter how many times I click “Mark as Read”, a single reboot will just bring the stuck notification back.
Hopefully, this will be released soon. I got 8+ stuck notifications on 3 Spaces.
I agree with the other posters about it getting worse. I have switched to https://develop.element.io and there most (all?) issues appear to have gone away.
https://github.com/matrix-org/matrix-js-sdk/pull/4022 seems to have removed a large number of bugs. Well done, @florianduros .
Just to give a status update, the issues have worsened significantly over the last few versions. Way more unread messages stay unread than ever before, they do not get cleared automatically when opening a chat or a thread, they do not sync properly between devices (way worse than before), and they often are not able to be cleared manually (works IMHO better on mobile/iOS than on desktop/Mac).
That is great news. I know there are more big changes coming, and the bleeding-edge version is working really well for me today, so I am hopeful that we are turning a corner.
Update on this issue: the graph looks mostly unchanged this week, but we have started work on fixing some of the issues made clear by failing tests. This PR: matrix-js-sdk#3798 fixes several redaction tests, and will be merged after the release candidate branch has been made, to allow us maximum time to test in practice whether it makes things better or worse.
Contributions welcome, but I doubt there’ll be any bandwidth from the team to work on anything other than the final solution
Already in Settings > Notifications
I guess this explains why threads have become particularly noisy recently 😅. I appreciate that it’s important to get everything working as it should, but there are a few buttons/behaviours that I imagined would be there that would help immensely, and would probably be easier (quicker) to implement: in order from “expected feature” to “bandaid fix”:
Personally I’ve gone from checking Matrix whenever there is a notification to checking it whenever I get bored. While I imagine this looks good on my usage stats, it’s murdering my productivity in the same way Reddit and Twitter do 😁. I don’t like complaining, but would very much like to see this fixed so Matrix can be restored to its rightful status as an awesome and productive chat app 😃
We completed initial implementations for it last week and are now starting to prepare for testing it. Unfortunately, it’ll still take some more time to get this landed due to the nature of the spec process.
Hopefully we’re closer to the release that will fix this bug.
I’m now at 33 stuck notifications 😦
We’re focusing on the most prominent problems first which are in the interplay of threads and other relations. There may be further issues beyond that.
One more thing that I’ve noticed is that a lot of the time the ancient messages are things like “Alice invited Bob”, or “Alice changed the power level of Bob from Default to Admin”, or “Alice made the room invite only”. This happens far more often than in active chats, where people actually send messages.
@iyanmv https://github.com/element-hq/element-web/issues/25449 we closed the issue about the test cases. We assume that we fixed the existing tests and we are satisfied with the current tests. Of course, when we are fixing bugs around notifications we are adding new tests.
I’ve been using develop.element.io for quite a while now and I can also attest that it’s getting better and better. I can’t wait for this to be completely fixed so I can convince more people to join my little community of friends. 😃
Keep up the good work!
The fixes from the last few weeks are available on develop.element.io. They fix the last case of stuck unreads and a lot of zombie notifications.
Some cases of zombie notifications remain, clearly fewer than before but still annoying! We will investigate them 😃
I am experiencing a strange issue with thread notifications:
Shall I wait for this issue to be fixed? Give more details here? Create another issue?
We just merged a significant change for stuck unreads that should mean if you use develop.element.io or the Nightly build you see fewer rooms that are wrongly shown as unread from tomorrow. Please bear in mind that there may be a one-off increase in unreads, before the problem starts to improve. Do let us know if it doesn’t get better!
Yes indeed I do; sorry. New ticket is https://github.com/vector-im/element-web/issues/26475 .
Hmm. Maybe that chart is misleading - those failing tests are manually marked as failing. The reason the number went up is either:
So any correlation between the numbers rising and the number of actual issues is almost certainly coincidental I’m afraid.
Work continues this week on improving and fleshing out the comprehensive test suite. Next on the list after that is working on MSC4033.
I have a single notification from someone who @tagged me. The notification shows up in the browser tab icon as well as in the Notifications menu. I’ve viewed the message but the notification is still there. Is this a stuck notification, or is there some way to clear it from the Notifications menu?
Edit: Turns out I just had to right-click the room (under Rooms in the left side-panel) and click “Mark as read”. Leaving this here in case it’s helpful for someone else.
It’s definitely gotten a lot less common, but just caught it another time, this time it’s properly stuck (not even reacting and removing the reaction again helps): #25846
Hmm, not for me (using 1.11.35, electron version)
We’ve also had the problem with persistent or recurring notifications for a while. What I noticed is that it only affects edited messages with mentions. Maybe it helps.
yes - I’ve tried different things.
Then I wrote a message in the affected rooms (I think I did some of the steps above before). After that the rooms were fixed, even after a restart of the client.
Thanks @leonardehrenfried. We will most likely be enabling it on beta.matrix.org so as not to impact regular users, but you can sign up for a separate account there if you want to help with testing.
@8227846265 the threads-related stuck notification issues are blocking on https://github.com/matrix-org/matrix-spec-proposals/pull/3981.
I don’t think that is true. If you look at the comments in vector-im/element-web#24595, you will find some from rooms where threads were never used and IME clearing cache often fixes these issues.