evcc: If the mqtt broker is no longer reachable, evcc fails
Describe the bug
If evcc is configured to send mqtt data, it will also send errors, and if the error is that the mqtt server is unreachable, the errors will be duplicated on each attempt and it will stop working, and even if mqtt comes back later, it will remain broken and overflow the log.
If evcc is just sending data, it should just keep working when mqtt is down.
If it’s also receiving data for e.g. meters, I’m not sure what should happen. But of course it would be great if evcc would start working again when the mqtt server comes back, even if it takes an hour.
Steps to reproduce
Configure mqtt, start evcc, stop mqtt
Configuration details
mqtt:
broker: localhost:1883
topic: evcc # root topic for publishing, set empty to disable publishing
Log details
[mqtt ] ERROR 2023/09/27 21:48:50 send: evcc/site/error: timeout
[mqtt ] ERROR 2023/09/27 21:48:50 send: evcc/site/error: timeout
[mqtt ] ERROR 2023/09/27 21:48:50 send: evcc/site/error: timeout
later
[mqtt ] ERROR 2023/09/27 21:56:58 send: evcc/site/error: no message IDs available
[mqtt ] ERROR 2023/09/27 21:56:58 send: evcc/site/error: no message IDs available
[mqtt ] ERROR 2023/09/27 21:56:58 send: evcc/site/error: no message IDs available
...more than 100 log lines per second
What type of operating system are you running?
Windows
Version
0.120.2
About this issue
- Original URL
- State: closed
- Created 9 months ago
- Comments: 19 (4 by maintainers)
Another hour later the browser tab completely hung and refreshing it didn’t succeed. Evcc memory consumption still at 110mb and cpu at 30%, so it didn’t go higher. Then I started the mqtt server and after some minutes the error messages ended, UI worked again and it seemed to work. But the values from mqtt didn’t start to update, even after waiting a long time they still showed the values from before stopping mqtt. Browser refresh didn’t help. Then I restarted mqtt a second time and now it updates correctly, so without evcc restart. Memory and cpu also went down. Log file 220mb in 5hours (normal: 2mb).
Fair point. That said, I am not sure if any sending part of the system should see its own messages. We could solve this by adding a source to the messages on the bus, but given the corner case nature of the original issue, I am not sure it is even remotely worth the effort. Let’s merge your PR. 👍
I not convinced by the PR. I see that it gets the immediate problem solved, but it does so by just hiding the underlying conceptual problem.
I have debugged this locally, and I now understand what is happening. The logger that captures the errors for the UI via the multi writer is to blame. It sends any logged errors with key “error” to the uiChan. That however is also the channel that mqtt (and effectively everybody attached to the central Tee) is fed from.
So effectively if MQTT logs an error, it will see its own error message returned to it on the channel it reads from, and will try to send that error out to the topic “evcc/site/error”. This again fails, and so we start piling up messages on that channel. That is the feedback loop I observed.
Is the behavior that errors are published via mqtt even expected? Or is this an unintended side effect of the fact that all components share the same Tee under the hood? And if so, what could be a smart idea to filter these out?
@andig I can reproduce this locally. Looking.