addons: Mosquitto 5.1.1 is broken.
The problem
Environment
- Add-on with the issue: Mosquitto broker
- Add-on release with the issue: 5.1.1
- Last working add-on release (if known): 5.1
- Operating environment (OS/Supervised): hassio
Problem-relevant configuration
Traceback/Error logs
Additional information
About this issue
- State: closed
- Created 3 years ago
- Reactions: 23
- Comments: 158 (10 by maintainers)
Made quite a bit of progress today, but I’m far from happy with the result at this point. Will continue my journey in the next couple of days.
Reserved my Thursday for looking into this add-on. Assigned myself.
Status update time 🕐
I’ve created quite a few versions of the add-on the past week and ended up with the simplest of them all… (NOTE TO SELF: Nothing beats KISS 😓 )
I’m happy with the result; it seems to perform better. Currently I have Z-Wave JS 2 MQTT, Home Assistant, a local mosquitto_sub subscribed to all topics, and MQTT Explorer connected, plus around 25 ESP chips on ESPHome that all send a lot of status updates each second. Additionally, I started a script that tries to send in updates as fast as possible (disconnecting and reconnecting between each update).
This is how fast it is now, each line in the screenshot below is a client that publishes a time (with microseconds) as fast as possible while re-connecting between each update:
I’ll leave this all running overnight to see how it holds and do a final round of reviewing my own code in the morning.
Morning update: Everything still alive and kicking.
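The reconnect-per-update stress test described above could be sketched roughly as follows. This is not frenck's actual script, which the thread does not show; the names `timestamp_payload`, `publish_burst`, and `publish_once` are hypothetical, and the actual MQTT call is passed in as a function so the sketch stays independent of any particular client library.

```python
from datetime import datetime
from typing import Callable, Optional


def timestamp_payload(now: Optional[datetime] = None) -> str:
    """Format a time of day with microseconds, e.g. '07:18:29.000123'."""
    moment = now if now is not None else datetime.now()
    return moment.strftime("%H:%M:%S.%f")


def publish_burst(
    publish_once: Callable[[str, str], None], topic: str, count: int
) -> int:
    """Call publish_once(topic, payload) `count` times, back to back.

    Assumption: publish_once connects to the broker, publishes a single
    message, and disconnects, so every update pays the full reconnect cost.
    """
    sent = 0
    for _ in range(count):
        publish_once(topic, timestamp_payload())
        sent += 1
    return sent
```

With the paho-mqtt package installed, `paho.mqtt.publish.single(topic, payload, hostname=...)` is one candidate for `publish_once`, since it connects, publishes one message, and disconnects again. Running several such loops in parallel would approximate the "one client per line" picture in the status update.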
@frenck I’ve come up with a slightly less abrasive option for a similar comment. It’s a bit more verbose but feel free to copy and paste it. The goal is to avoid discouraging good faith participation while still adhering to your preferred workflow. While it’s not as efficient to add a buffer for the emotional state of the reader it can be considered a “social optimization” to add a bit more explanation.
"If this isn’t resolved or there’s a new problem please create a new issue for each individual thing so it can be tracked and looked at.
We have a workflow that involves opening a new issue instead of reviving old ones; this greatly helps with our tracking and organizing. Re-opening an issue or combining multiple subjects into a single issue adds complexity to the tracking."
I know OP has left zero information here and it will probably be closed out, but on a surface level I have to agree. Something has gone wrong with this point update.
My whole Home Assistant instance goes down overnight. When I try to access it over the network, I can’t connect.
After a week of troubleshooting I have narrowed it down to my MQTT server. What seems to be happening is that it loses its connection to HA and throws the whole thing off. HA is still running but inaccessible. I have to power cycle to get it up and running again, and when I do, I get errors in the HA logs about “timed out waiting for mid 2” for MQTT, and errors in Mosquitto about being unable to find HA.
I also find that my Zigbee2mqtt devices are found at startup and then immediately lost again (presumably when the timeout occurs), and I have to restart Zigbee2mqtt a couple of times to get it running again.
I wish I could prove that the HA hang is due to Mosquitto, but since it’s inaccessible (but still running) I can’t. What I can say is that in Recorder I can see that the only devices that are unavailable are the Zigbee2mqtt ones, and these drop because of Mosquitto (there are zero errors in the Zigbee2mqtt logs in debug mode).
I’ve restored an old backup of the previous version of Mosquitto to try and combat the issue. I’ll try to remember to report back.
Edit: I think the same issue is #1817 and #1814
Also this one https://github.com/home-assistant/core/issues/45036
I’m happy to report things are looking a lot more responsive now that I’ve upgraded to 6.0.1, the errors in the logs and laggy switch/light responses seem to be gone.
Many thanks @frenck
Confirming same. Very strange, sluggish system recently, with hundreds of lines similar to the following in home-assistant.log:
2021-02-27 07:18:29 ERROR (MainThread) [homeassistant.components.mqtt] Timed out waiting for mid 98
After rolling Mosquitto back to 5.1, no more errors in the log, performance back to normal.
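To check how often these messages occur before and after a rollback, the log can be scanned with a few lines of Python. This is a minimal sketch: the two patterns are copied from log lines quoted in this thread, while the function name `count_mqtt_errors` and the category labels are made up for illustration.

```python
import re
from collections import Counter
from typing import Iterable

# Message shapes taken from the log excerpts quoted in this thread.
PATTERNS = {
    "timeout": re.compile(r"Timed out waiting for mid (\d+)"),
    "no_ack": re.compile(r"No ACK from MQTT server in \d+ seconds \(mid: (\d+)\)"),
}


def count_mqtt_errors(lines: Iterable[str]) -> Counter:
    """Count MQTT timeout / missing-ACK messages in an iterable of log lines.

    findall() is used instead of search() so that several entries fused
    onto one physical line are each counted.
    """
    counts: Counter = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            counts[name] += len(pattern.findall(line))
    return counts
```

Assuming the log sits at its usual place in the config folder, usage might look like `with open("home-assistant.log") as fh: print(count_mqtt_errors(fh))`.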
For those who want to roll back to 5.1:
As one person mentioned, enough is enough! Agreed! I decided to stop using all the critical add-ons with “magic” from the HA dev team. Every update is a gamble on whether things will still work afterwards. I set up a cluster of three VerneMQ instances with HAProxy as a load balancer.
The issue is that socat (which serves as auth point) is being continuously executed again and again!
Yes working perfectly and fast! Thanks @frenck !
Thank you for the eloquent response! I did end up getting 5.1 installed (forked the addons and downgraded the version). Unfortunately, it didn’t fix the issue I was experiencing. So I’ve taken your advice: backup and reinstall. Thank you
@DeviousVon You say “download an old version and starting over.” I’m assuming you mean Home Assistant? Note that the Mosquitto broker, the component with the problem we’re discussing here, is an add-on to Home Assistant, not Home Assistant itself. Even if you can download an old version of HA, you’d still need to be able to download and install an older version of the add-on (5.1 or earlier).
Different topic, “starting over.” If you do somehow manage to fix the problem by “downloading an old version,” it’s not necessary to start over. Just make sure you back up everything first. You’ll need access to Home Assistant’s filesystem – I use SMB.
Sounds like a lot of trouble, but it’s really not. Takes 5-10 minutes.
So you’ll have the .tar file as well as individual .yaml (etc.) files once you’re done. After re-installing Home Assistant, first try copying the large .tar file to the same backup folder where you found it and doing a restore in Home Assistant: Supervisor | Snapshots. If anything fails or you otherwise have problems, you’ll have the manually-backed-up files to turn to. Just more or less reverse the steps you used to back up everything.
It’s not necessary to start over. You can easily go from brand-new install to fully restored config in well under an hour.
Enough is enough. Either take 5.1.1 down or fix it! At the very least provide a back out or workaround method that doesn’t require going back to a snapshot.
Same here; only a rollback from an old snapshot restored the MQTT function.
It would be a nice feature to get a version dropdown for add-ons so a rollback would be easier for everyone.
Thank you @frenck - great job!
Yes it is working again, well done!
After a couple hours on 6.0.1, I’m happy to report zero slow down issues. Very quick response.
I did some quick testing and my initial feeling is that it works as it should now. The MQTT client connects really fast and gets all its topics at the same time.
Motion and button responses are also way faster than before, the same as with the deprecated MQTT client from community add-ons.
There are also no reconnect issues in home assistant with MQTT.
Great work @frenck
I have noticed MQTT being slow on both a Raspberry Pi 3B+ and a VirtualBox virtual machine (same snapshot). Something must be wrong with MQTT, as between the flick of a switch in the dashboard and the actual toggle of the lights there are sometimes 2 to 5 seconds, but then, after the first time, it is instantaneous. I don’t know if the other errors in my log are relevant, but I’ll leave them in anyway.
2021-05-04 09:32:15 WARNING (MainThread) [homeassistant.components.mqtt] No ACK from MQTT server in 10 seconds (mid: 47)
2021-05-04 09:32:15 WARNING (MainThread) [homeassistant.components.mqtt] No ACK from MQTT server in 10 seconds (mid: 48)
2021-05-04 09:32:15 WARNING (MainThread) [homeassistant.components.mqtt] No ACK from MQTT server in 10 seconds (mid: 49)
2021-05-04 09:32:15 WARNING (MainThread) [homeassistant.components.mqtt] No ACK from MQTT server in 10 seconds (mid: 50)
2021-05-04 09:32:15 WARNING (MainThread) [homeassistant.components.mqtt] No ACK from MQTT server in 10 seconds (mid: 51)
2021-05-04 09:32:15 WARNING (MainThread) [homeassistant.components.mqtt] No ACK from MQTT server in 10 seconds (mid: 52)
2021-05-04 09:32:16 ERROR (MainThread) [aiohttp.server] Error handling request
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/aiohttp/web_protocol.py", line 422, in _handle_request
    resp = await self._request_handler(request)
  File "/usr/local/lib/python3.8/site-packages/aiohttp/web_app.py", line 499, in _handle
    resp = await handler(request)
  File "/usr/local/lib/python3.8/site-packages/aiohttp/web_middlewares.py", line 119, in impl
    return await handler(request)
  File "/usr/src/homeassistant/homeassistant/components/http/security_filter.py", line 56, in security_filter_middleware
    return await handler(request)
  File "/usr/src/homeassistant/homeassistant/components/http/request_context.py", line 18, in request_context_middleware
    return await handler(request)
  File "/usr/src/homeassistant/homeassistant/components/http/ban.py", line 74, in ban_middleware
    return await handler(request)
  File "/usr/src/homeassistant/homeassistant/components/http/auth.py", line 135, in auth_middleware
    return await handler(request)
  File "/usr/src/homeassistant/homeassistant/components/http/view.py", line 131, in handle
    result = await result
  File "/usr/src/homeassistant/homeassistant/components/websocket_api/http.py", line 43, in get
    return await WebSocketHandler(request.app["hass"], request).async_handle()
  File "/usr/src/homeassistant/homeassistant/components/websocket_api/http.py", line 142, in async_handle
    await wsock.prepare(request)
  File "/usr/local/lib/python3.8/site-packages/aiohttp/web_ws.py", line 135, in prepare
    payload_writer = await super().prepare(request)
  File "/usr/local/lib/python3.8/site-packages/aiohttp/web_response.py", line 378, in prepare
    return await self._start(request)
  File "/usr/local/lib/python3.8/site-packages/aiohttp/web_response.py", line 386, in _start
    await self._write_headers()
  File "/usr/local/lib/python3.8/site-packages/aiohttp/web_response.py", line 458, in _write_headers
    await writer.write_headers(status_line, self._headers)
  File "/usr/local/lib/python3.8/site-packages/aiohttp/http_writer.py", line 119, in write_headers
    self._write(buf)
  File "/usr/local/lib/python3.8/site-packages/aiohttp/http_writer.py", line 67, in _write
    raise ConnectionResetError("Cannot write to closing transport")
ConnectionResetError: Cannot write to closing transport
2021-05-04 09:35:54 WARNING (MainThread) [homeassistant.components.mqtt.mixins] JSON result was not a dictionary
2021-05-04 09:35:54 WARNING (MainThread) [homeassistant.components.mqtt.mixins] JSON result was not a dictionary
2021-05-04 09:40:47 WARNING (MainThread) [homeassistant.components.mqtt.mixins] JSON result was not a dictionary
2021-05-04 09:40:48 WARNING (MainThread) [homeassistant.components.mqtt.mixins] JSON result was not a dictionary
2021-05-04 09:50:51 WARNING (MainThread) [homeassistant.components.mqtt.mixins] JSON result was not a dictionary
2021-05-04 09:50:51 WARNING (MainThread) [homeassistant.components.mqtt.mixins] JSON result was not a dictionary
2021-05-04 10:29:39 WARNING (MainThread) [homeassistant.components.mqtt.mixins] JSON result was not a dictionary
2021-05-04 10:29:40 WARNING (MainThread) [homeassistant.components.mqtt.mixins] JSON result was not a dictionary
2021-05-04 10:55:18 WARNING (MainThread) [homeassistant.components.mqtt.mixins] JSON result was not a dictionary
2021-05-04 10:55:18 WARNING (MainThread) [homeassistant.components.mqtt.mixins] JSON result was not a dictionary
2021-05-04 11:12:32 WARNING (MainThread) [homeassistant.components.automation.new_code] New Code: Already running
2021-05-04 11:32:46 WARNING (MainThread) [homeassistant.components.mqtt.mixins] JSON result was not a dictionary
2021-05-04 11:32:46 WARNING (MainThread) [homeassistant.components.mqtt.mixins] JSON result was not a dictionary
2021-05-04 11:47:10 WARNING (MainThread) [homeassistant.components.http.ban] Login attempt or request with invalid authentication from vodafone.broadband (192.168.0.1). (Mozilla/5.0 (iPhone; CPU iPhone OS 14_5 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Home Assistant/2021.4.1 (io.robbie.HomeAssistant; build:2021.115; iOS 14.5.0) Mobile/HomeAssistant, like Safari)
You’re welcome. When you do a Snapshot restore in HA, you have the option to disable portions of what’s to be restored from the Snapshot. This is how a person could revert to 5.1. After installing the buggy 5.1.1, make a backup of everything (for safety), and then use an earlier Snapshot to restore only the 5.1 version of the Mosquitto broker. That process won’t work if you don’t already have a Snapshot with 5.1 on it, of course, but it’s good to know it’s there for the next time.
Unfortunately I don’t have the ability to roll back as all my snapshots were partial (that won’t happen again). I’m unable to add any new devices (Tasmota) even though it looks good in the console. Frustrating. Really not sure what to do about it. I’ve considered installing mosquitto on a different machine, but I’m using HA authentication. I’ve not been able to fix it with the uninstall reinstall thing others are saying works, it just goes back to not working. I’m willing to work with anyone that can address it, I’m unfortunately not a developer… At this point I’m considering downloading an old version and starting over (this is a LAST resort as I’ve put a lot of time into HA). It sucks having to stand up and walk ALL the way across the room to turn lights on and off.
I also had to revert to 5.1 to have a functional setup (otherwise I had crazy delays each time I switched on a light). This seems like an important issue, and I am afraid far more people are affected by the 5.1.1 update than you might think. Unfortunately, I don’t see this issue getting a lot of attention. Is there anything we can do to clarify or reproduce the issue? How can we help get this solved?
There are several options mentioned in this thread.
‘docker pull homeassistant/aarch64-addon-mosquitto:5.1’
@hung2kgithub has a solution using their own repository, somewhere in the middle of this thread.
Please just check the whole thread if you need a solution for your own situation. Also remember to back up, and maybe disable auto-update on production systems. Awareness of such points mostly has to be learned the hard way.
I’ve been struggling with this issue for days since upgrading to 5.1.1; all my Z-Wave devices are unavailable. I think I’ve tried every solution mentioned here without success. Unfortunately, I have no backup old enough to get me back to version 5.1. I’ve tried forking the repository, but adding it does not work at all. I’m stuck. Countless hours lost trying to solve this issue… There are too many automations in the house, so this is a huge problem. As the problem has been around for weeks, I suppose it will not be solved by an upgrade. If someone can point out how to get back to 5.1, that would be great. Thanks in advance.
I am noticing slightly higher latency with this new version as well. A typical work flow of mine is turning on a zigbee light from a zigbee remote. What is weird is if I haven’t turned a light on/off for some time, when I first press the on or off button on the remote, there is a noticeable (~1s) delay before the light responds. But, having just used the remote, if I then press a button a few seconds later, the response is almost instantaneous (as was always the case with 5.1).
Not sure if there is something with the new version that causes the MQTT server to go to sleep, resulting in that first action being slower. Or, if there is a different timeout on the connection between zigbee2mqtt and the broker, resulting in zigbee2mqtt having to reconnect when the first action is submitted in some time.
can confirm that, upon cogneato’s suggestion in Discord, deleting the 5.1 add-on, (copying the config) and re-installing (the now new 5.1.1) Add-on with the copied config, everything is running smoothly. NO errors in the log, and all topics are live.
So, don’t update, but re-install, which gets you the new version (and rewrites the mosquitto.db, which seems to be the issue). This is essentially what @christoph-luebbe said here https://github.com/home-assistant/addons/issues/1887#issuecomment-802234693
Sigh… Rolled back to 5.1, and still can’t get my Sonoff Zigbee bridge to work 😦
Probably the “only” is missing, as @chumbazoid is certainly not the only one: I’m also using HA Supervised on Debian 10, but have no time to experiment with removing-reinstalling MQTT (and also use a few topics to store information with retained messages), I’m fine with 5.1 for now.
@chumbazoid I believe you are the only person on debian+supervised to report being affected by this. (Although most people unhelpfully don’t say)
Guys, it’s been 2 weeks since the issue was identified and it was confirmed that v5.1 works properly. Why hasn’t v5.1.1 been rolled back yet? I assume more and more people are being affected by this issue.
No snapshot here. Is there any way to use command line to downgrade to 5.1?
ha addons install ???
thx.
I made a snapshot on my HASS Testsystem (was still running on Mosquitto 5.1) and did the partial rollback on my Prod System.