core: Z-wave Aeotec stick is not started up correctly.
Home Assistant release with the issue: 0.87.1
Last working Home Assistant release (if known): Been unstable since I started using HA a month ago on 0.86
Operating environment (Hass.io/Docker/Windows/etc.): Hass.io under docker in Ubuntu host on NUC
Component/platform: Zwave
Description of problem: When I restart HA I have a 50/50 chance/risk that Z-wave is not comming up. And it is often many boots in a row and seems to be worse when you do it often
The USB device is there on ttyACM0. I check everytime. It is there and visible in the container. I am using an Aeotec USB stick.
It is always at startup it fails. If it comes up it is stable. I have checked USB voltage. I have checked the stick on a Windows machine. There is nothing wrong with the stick. It is HA that cannot initialize it in a reliable way (or OZW).
Problem-relevant configuration.yaml entries and (fill out even if it seems unimportant):
Traceback (if applicable): When it fails the OZW log always contains something like this
2019-02-18 08:27:51.765 Warning, WARNING: 500ms passed without reading the rest of the frame...aborting frame read
2019-02-18 08:27:51.765 Warning, WARNING: Out of frame flow! (0x02). Sending NAK.
2019-02-18 08:27:53.267 Warning, WARNING: 500ms passed without reading the rest of the frame...aborting frame read
2019-02-18 08:27:53.267 Warning, WARNING: Out of frame flow! (0x02). Sending NAK.
2019-02-18 08:27:53.767 Warning, WARNING: 500ms passed without reading the rest of the frame...aborting frame read
2019-02-18 08:27:53.767 Detail, contrlr, Received: 0x01, 0x02, 0x08, 0x1d
2019-02-18 08:27:53.767 Warning, contrlr, WARNING: Checksum incorrect - sending NAK
2019-02-18 08:28:01.166 Error, contrlr, ERROR: Dropping command, expected response not received after 1 attempt(s)
or this
2019-02-18 06:55:16.690 Info, contrlr, Sending (Command) message (Callback ID=0x00, Expected Reply=0x15) - FUNC_ID_ZW_GET_VERSION: 0x01, 0x03, 0x00, 0x15, 0xe9
2019-02-18 06:55:17.682 Detail, Unsolicited message received while waiting for ACK.
2019-02-18 06:55:18.183 Warning, WARNING: 500ms passed without reading the rest of the frame...aborting frame read
2019-02-18 06:55:18.183 Warning, contrlr, WARNING: NAK received...triggering resend
2019-02-18 06:55:18.183 Error, contrlr, ERROR: Dropping command, expected response not received after 1 attempt(s)
2019-02-18 06:55:18.183 Detail, contrlr, Removing current message
2019-02-18 06:55:18.183 Warning, WARNING: Out of frame flow! (0x5a). Sending NAK.
Not always the same. It does not matter that the communication seems to continue without hickups. Once failed HA refuses to finish the Z-Wave setup and you end up with no devices or only the Z-wave stick as a single device. And then you can restart and reboot and suddenly it comes back. And then you may boot or restart HA 10 times without a problem. There is a timing issue or an issue with not closing the comm port. The device contains a battery so it may be left in limbo mode. I cannot be sure.
This makes HA so unreliable that I cannot trust it to guard my house. I have seen this issue raise several times on the bug tracker - like here https://github.com/home-assistant/home-assistant/issues/15828 (closed without a fix) - and a very old one https://github.com/home-assistant/home-assistant/issues/5501 (still open) with an error mode which is similar but not the same
I add 4 OZW logs. The ones with ERR are where HA would not start Z-wave and the OK ones are where everything works.
I have once been in a situation where 20 restarts did not recover and I had to reinstall a snapshot from a few days back which revived the Z-wave again. So it can fail in a pretty serious way.
OZW_Log1902180657ERR.txt OZW_Log1902131926OK.txt OZW_Log1902131922ERR.txt OZW_Log1902131913OK.txt
Additional information:
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 41 (10 by maintainers)
Commits related to this issue
- Check for Invalid HomeID or Controller Node ID's ModemManager on the Host is known to interfere with OZW, and cause problems when starting up. Users will often get messages about "Out of Frame Flow" ... — committed to Fishwaldo/open-zwave by Fishwaldo 5 years ago
- Check for Invalid HomeID or Controller Node ID's (#10) ModemManager on the Host is known to interfere with OZW, and cause problems when starting up. Users will often get messages about "Out of Frame ... — committed to home-assistant/open-zwave by Fishwaldo 5 years ago
- Fix comparison uint8 with -1, also test if controller node is 0. Ported from OZW 1.6 to OZW 1.4 HASS fork. lldb complained: Driver.cpp:2701:43: error: result of comparison of constant -1 with ... — committed to petergebruers/open-zwave by jeffsanicola 5 years ago
- Fix comparison uint8 with -1, also test if controller node is 0. Ported from OZW 1.6 to OZW 1.4 HASS fork. lldb complained: Driver.cpp:2701:43: error: result of comparison of constant -1 with ... — committed to petergebruers/open-zwave by petergebruers 5 years ago
- Fix comparison uint8 with -1, also test if controller node is 0. Ported from OZW 1.6 to OZW 1.4 HASS fork. lldb complained: Driver.cpp:2701:43: error: result of comparison of constant -1 with ... — committed to petergebruers/open-zwave by petergebruers 5 years ago
- Fix comparison uint8 with -1, also test if controller node is 0. (#16) Ported from OZW 1.6 to OZW 1.4 HASS fork. lldb complained: Driver.cpp:2701:43: error: result of comparison of constan... — committed to home-assistant/open-zwave by petergebruers 5 years ago
Hi. So I have now tried this: https://linux-tips.com/t/prevent-modem-manager-to-capture-usb-serial-devices/284
(https://askubuntu.com/questions/399263/udev-rules-seem-ignored-can-not-prevent-modem-manager-from-grabbing-device) I believe this is really the way to go…
But what I ended up doing was systemctl disable ModemManager.service
Have now restarted NUC 2 times, and many restarts of Hassio. Seem to work fine. Will continue testing it
We have the fix now! Finnaly!
I decided to put my zwave back on HA. I need to restart many many more times or a period of time to confirm the fix. So far it looks good.
The fix was from the host Ubuntu to diable modem manager with systemctl disable ModemManager and then just reboot the whole machine
I will report back. If this works out the short term is to get this in the Zwave documentation and hopefully later someone can improve HA so it retries after a minute if the startup of Zwave fails. It was @Adminiuga saying the word ModemManager that made this breakthrough
Hmm. That is a good idea. Unfortunately I no longer have a HA with an Aeotec stick but if someone still having the HW and the problem could try this. I assume the thing to do is to disable modem manager in Ubuntu host. I cannot test these but here is what I found on Google search
https://linux-tips.com/t/prevent-modem-manager-to-capture-usb-serial-devices/284
https://askubuntu.com/questions/399263/udev-rules-seem-ignored-can-not-prevent-modem-manager-from-grabbing-device
https://www.freedesktop.org/software/ModemManager/api/latest/ref-overview-modem-filter.html
https://stackoverflow.com/questions/24696527/modem-manager-and-ttyacm-in-use
There are plenty more. If this is the root cause then there is still a todo for HA because it should simply wait and try again if it fails like it does with so many other devices. The most primitive work around would be to wait one minute and try again.
@KennethLavrsen yes, it happens after restarts because hassio HA container is triggering udev, which in turn may trigger other applications like Ubuntu’s Modem manager.