core: Z-wave Aeotec stick is not started up correctly.

Home Assistant release with the issue: 0.87.1

Last working Home Assistant release (if known): Been unstable since I started using HA a month ago on 0.86

Operating environment (Hass.io/Docker/Windows/etc.): Hass.io under docker in Ubuntu host on NUC

Component/platform: Zwave

Description of problem: When I restart HA I have a 50/50 chance/risk that Z-wave is not comming up. And it is often many boots in a row and seems to be worse when you do it often

The USB device is there on ttyACM0. I check everytime. It is there and visible in the container. I am using an Aeotec USB stick.

It is always at startup it fails. If it comes up it is stable. I have checked USB voltage. I have checked the stick on a Windows machine. There is nothing wrong with the stick. It is HA that cannot initialize it in a reliable way (or OZW).

Problem-relevant configuration.yaml entries and (fill out even if it seems unimportant):


Traceback (if applicable): When it fails the OZW log always contains something like this

2019-02-18 08:27:51.765 Warning, WARNING: 500ms passed without reading the rest of the frame...aborting frame read
2019-02-18 08:27:51.765 Warning, WARNING: Out of frame flow! (0x02).  Sending NAK.
2019-02-18 08:27:53.267 Warning, WARNING: 500ms passed without reading the rest of the frame...aborting frame read
2019-02-18 08:27:53.267 Warning, WARNING: Out of frame flow! (0x02).  Sending NAK.
2019-02-18 08:27:53.767 Warning, WARNING: 500ms passed without reading the rest of the frame...aborting frame read
2019-02-18 08:27:53.767 Detail, contrlr,   Received: 0x01, 0x02, 0x08, 0x1d
2019-02-18 08:27:53.767 Warning, contrlr, WARNING: Checksum incorrect - sending NAK
2019-02-18 08:28:01.166 Error, contrlr, ERROR: Dropping command, expected response not received after 1 attempt(s)

or this

2019-02-18 06:55:16.690 Info, contrlr, Sending (Command) message (Callback ID=0x00, Expected Reply=0x15) - FUNC_ID_ZW_GET_VERSION: 0x01, 0x03, 0x00, 0x15, 0xe9
2019-02-18 06:55:17.682 Detail, Unsolicited message received while waiting for ACK.
2019-02-18 06:55:18.183 Warning, WARNING: 500ms passed without reading the rest of the frame...aborting frame read
2019-02-18 06:55:18.183 Warning, contrlr, WARNING: NAK received...triggering resend
2019-02-18 06:55:18.183 Error, contrlr, ERROR: Dropping command, expected response not received after 1 attempt(s)
2019-02-18 06:55:18.183 Detail, contrlr, Removing current message
2019-02-18 06:55:18.183 Warning, WARNING: Out of frame flow! (0x5a).  Sending NAK. 

Not always the same. It does not matter that the communication seems to continue without hickups. Once failed HA refuses to finish the Z-Wave setup and you end up with no devices or only the Z-wave stick as a single device. And then you can restart and reboot and suddenly it comes back. And then you may boot or restart HA 10 times without a problem. There is a timing issue or an issue with not closing the comm port. The device contains a battery so it may be left in limbo mode. I cannot be sure.

This makes HA so unreliable that I cannot trust it to guard my house. I have seen this issue raise several times on the bug tracker - like here https://github.com/home-assistant/home-assistant/issues/15828 (closed without a fix) - and a very old one https://github.com/home-assistant/home-assistant/issues/5501 (still open) with an error mode which is similar but not the same

I add 4 OZW logs. The ones with ERR are where HA would not start Z-wave and the OK ones are where everything works.

I have once been in a situation where 20 restarts did not recover and I had to reinstall a snapshot from a few days back which revived the Z-wave again. So it can fail in a pretty serious way.

OZW_Log1902180657ERR.txt OZW_Log1902131926OK.txt OZW_Log1902131922ERR.txt OZW_Log1902131913OK.txt

Additional information:

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 41 (10 by maintainers)

Commits related to this issue

Most upvoted comments

Hi. So I have now tried this: https://linux-tips.com/t/prevent-modem-manager-to-capture-usb-serial-devices/284

(https://askubuntu.com/questions/399263/udev-rules-seem-ignored-can-not-prevent-modem-manager-from-grabbing-device) I believe this is really the way to go…

But what I ended up doing was systemctl disable ModemManager.service

Have now restarted NUC 2 times, and many restarts of Hassio. Seem to work fine. Will continue testing it

We have the fix now! Finnaly!

I decided to put my zwave back on HA. I need to restart many many more times or a period of time to confirm the fix. So far it looks good.

The fix was from the host Ubuntu to diable modem manager with systemctl disable ModemManager and then just reboot the whole machine

I will report back. If this works out the short term is to get this in the Zwave documentation and hopefully later someone can improve HA so it retries after a minute if the startup of Zwave fails. It was @Adminiuga saying the word ModemManager that made this breakthrough

Hmm. That is a good idea. Unfortunately I no longer have a HA with an Aeotec stick but if someone still having the HW and the problem could try this. I assume the thing to do is to disable modem manager in Ubuntu host. I cannot test these but here is what I found on Google search

https://linux-tips.com/t/prevent-modem-manager-to-capture-usb-serial-devices/284

https://askubuntu.com/questions/399263/udev-rules-seem-ignored-can-not-prevent-modem-manager-from-grabbing-device

https://www.freedesktop.org/software/ModemManager/api/latest/ref-overview-modem-filter.html

https://stackoverflow.com/questions/24696527/modem-manager-and-ttyacm-in-use

There are plenty more. If this is the root cause then there is still a todo for HA because it should simply wait and try again if it fails like it does with so many other devices. The most primitive work around would be to wait one minute and try again.

@KennethLavrsen yes, it happens after restarts because hassio HA container is triggering udev, which in turn may trigger other applications like Ubuntu’s Modem manager.