core: SONOFF ZigBee 3.0 Dongle E going unresponsive after a while

The problem

I have a setup with a couple dozen Zigbee devices, coordinated by a SONOFF ZigBee 3.0 Dongle E. Every couple of days the dongle or ZHA crashes in an unrecoverable way, and the only fix I have found is physically unplugging and replugging the USB stick. I’m using firmware 7.3.1.0 build 176, flashed with the web-based tool.

I’ve now captured a couple of stack traces from when this happened; hopefully there’s something meaningful in them. Let me know what other details I can provide.

What version of Home Assistant Core has the issue?

core-2023.12.3

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant Container

Integration causing the issue

zha

Link to integration documentation on our website

https://www.home-assistant.io/integrations/zha/

Diagnostics information

Diagnostics, captured after I had already recovered from a failure:

zha-4a8c1b710746a49bb72205002a49a0b7-Zigbee Coordinator-8733fd8a1920e10f5420c786d6c31417.json.txt

Example YAML snippet

No response

Anything in the logs that might be useful for us?

Example 1, going unresponsive:

2023-12-20 21:13:34.973 DEBUG (MainThread) [homeassistant.components.zha] Failed to set up ZHA
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/bellows/uart.py", line 261, in reset
    return await self._reset_future
           ^^^^^^^^^^^^^^^^^^^^^^^^
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/zha/__init__.py", line 163, in async_setup_entry
    zha_gateway = await ZHAGateway.async_from_config(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 182, in async_from_config
    await instance.async_initialize()
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 200, in async_initialize
    await app.startup(auto_form=True)
  File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 223, in startup
    await self.connect()
  File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 139, in connect
    await ezsp.startup_reset()
  File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/__init__.py", line 119, in startup_reset
    await self.reset()
  File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/__init__.py", line 145, in reset
    await self._gw.reset()
  File "/usr/local/lib/python3.11/site-packages/bellows/uart.py", line 260, in reset
    async with asyncio_timeout(RESET_TIMEOUT):
  File "/usr/local/lib/python3.11/asyncio/timeouts.py", line 111, in __aexit__
    raise TimeoutError from exc_val
TimeoutError

-----------

Example 2, again going unresponsive:

2023-12-21 16:54:19.184 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout: TimeoutError()
2023-12-21 16:54:19.184 WARNING (MainThread) [zigpy.application] Watchdog failure
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/protocol.py", line 68, in command
    return await future
           ^^^^^^^^^^^^
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 661, in _watchdog_loop
    await self._watchdog_feed()
  File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 893, in _watchdog_feed
    (res,) = await self._ezsp.readCounters()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/protocol.py", line 67, in command
    async with asyncio_timeout(EZSP_CMD_TIMEOUT):
  File "/usr/local/lib/python3.11/asyncio/timeouts.py", line 111, in __aexit__
    raise TimeoutError from exc_val
TimeoutError
2023-12-21 16:54:19.264 DEBUG (MainThread) [zigpy.application] Connection to the radio has been lost: TimeoutError()
2023-12-21 16:54:19.266 DEBUG (MainThread) [homeassistant.components.zha.core.gateway] Connection to the radio was lost: TimeoutError()
2023-12-21 16:54:19.266 DEBUG (MainThread) [zigpy.application] Stopping watchdog loop
2023-12-21 16:54:19.268 DEBUG (MainThread) [homeassistant.components.zha.core.gateway] Shutting down ZHA ControllerApplication

2023-12-21 16:54:26.946 DEBUG (MainThread) [homeassistant.components.zha] Failed to set up ZHA
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/bellows/uart.py", line 261, in reset
    return await self._reset_future
           ^^^^^^^^^^^^^^^^^^^^^^^^
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/zha/__init__.py", line 163, in async_setup_entry
    zha_gateway = await ZHAGateway.async_from_config(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 182, in async_from_config
    await instance.async_initialize()
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 200, in async_initialize
    await app.startup(auto_form=True)
  File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 223, in startup
    await self.connect()
  File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 139, in connect
    await ezsp.startup_reset()
  File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/__init__.py", line 119, in startup_reset
    await self.reset()
  File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/__init__.py", line 145, in reset
    await self._gw.reset()
  File "/usr/local/lib/python3.11/site-packages/bellows/uart.py", line 260, in reset
    async with asyncio_timeout(RESET_TIMEOUT):
  File "/usr/local/lib/python3.11/asyncio/timeouts.py", line 111, in __aexit__
    raise TimeoutError from exc_val
TimeoutError

----

Example 2, continued: after remounting the USB device from the command line and restarting Home Assistant, setup still fails:

2023-12-21 17:02:45.033 DEBUG (MainThread) [homeassistant.components.zha] Failed to set up ZHA
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/bellows/uart.py", line 261, in reset
    return await self._reset_future
           ^^^^^^^^^^^^^^^^^^^^^^^^
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/zha/__init__.py", line 163, in async_setup_entry
    zha_gateway = await ZHAGateway.async_from_config(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 182, in async_from_config
    await instance.async_initialize()
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 200, in async_initialize
    await app.startup(auto_form=True)
  File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 223, in startup
    await self.connect()
  File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 139, in connect
    await ezsp.startup_reset()
  File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/__init__.py", line 119, in startup_reset
    await self.reset()
  File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/__init__.py", line 145, in reset
    await self._gw.reset()
  File "/usr/local/lib/python3.11/site-packages/bellows/uart.py", line 260, in reset
    async with asyncio_timeout(RESET_TIMEOUT):
  File "/usr/local/lib/python3.11/asyncio/timeouts.py", line 111, in __aexit__
    raise TimeoutError from exc_val
TimeoutError

Additional information

No response

About this issue

  • State: open
  • Created 6 months ago
  • Comments: 21 (5 by maintainers)

Most upvoted comments

Update: for me, the latest updates seem to have resolved whatever the issue was; Zigbee is working normally now. Running: Core 2024.1.2, Supervisor 2023.12.0, Operating System 11.3, Frontend 20240104.0.