core: Zigbee ZHA locks up and devices cannot be controlled

The problem

After the 2024.4 update, the Zigbee/ZHA/SkyConnect 1.0 network has “locked up” several times, leaving me unable to control anything. Only a restart of HA got the integration running again.

Tonight the whole network seems to have problems: commands take 10-20 seconds to be sent, or are not sent at all, and HA displays errors.

Several errors seem to be present in the logs.

What version of Home Assistant Core has the issue?

core-2024.3.0

What was the last working version of Home Assistant Core?

core-2024.2.4

What type of installation are you running?

Home Assistant OS

Integration causing the issue

ZHA

Link to integration documentation on our website

https://www.home-assistant.io/integrations/zha/

Diagnostics information

home-assistant_zha_2024-03-09T20-55-27.032Z.log

Example YAML snippet

No response

Anything in the logs that might be useful for us?

Logger: homeassistant.components.websocket_api.http.connection
Source: components/websocket_api/commands.py:239
integration: Home Assistant WebSocket API (documentation, issues)
First occurred: 21:43:39 (1 occurrences)
Last logged: 21:43:39

[140090215724224] Failed to send request: Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/__init__.py", line 63, in wrap_zigpy_exceptions
    yield
  File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/__init__.py", line 83, in wrapper
    return await RETRYABLE_REQUEST_DECORATOR(func)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/zigpy/util.py", line 131, in retry
    return await func()
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/zigpy/zcl/__init__.py", line 377, in request
    return await self._endpoint.request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/zigpy/endpoint.py", line 253, in request
    return await self.device.request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/zigpy/device.py", line 339, in request
    await send_request()
  File "/usr/local/lib/python3.12/site-packages/zigpy/application.py", line 841, in request
    await self.send_packet(
  File "/usr/local/lib/python3.12/site-packages/bellows/zigbee/application.py", line 931, in send_packet
    raise zigpy.exceptions.DeliveryError(
zigpy.exceptions.DeliveryError: Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/websocket_api/commands.py", line 239, in handle_call_service
    response = await hass.services.async_call(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2319, in async_call
    response_data = await coro
                    ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2356, in _execute_service
    return await target(service_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 905, in entity_service_call
    single_response = await _handle_entity_call(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 975, in _handle_entity_call
    result = await task
             ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 628, in async_handle_light_on_service
    await light.async_turn_on(**filter_turn_on_params(light, params))
  File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 360, in async_turn_on
    result = await self._on_off_cluster_handler.on()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/__init__.py", line 82, in wrapper
    with wrap_zigpy_exceptions():
  File "/usr/local/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/__init__.py", line 74, in wrap_zigpy_exceptions
    raise HomeAssistantError(message) from exc
homeassistant.exceptions.HomeAssistantError: Failed to send request: Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>
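
The two tracebacks above are joined by "The above exception was the direct cause of the following exception" because the lower-level error is re-raised with `raise ... from exc` inside a context manager. The following is a minimal sketch of that pattern only, not the actual ZHA code: the two exception classes are local stand-ins, and `wrap_errors` and `turn_on` are hypothetical names chosen for illustration.

```python
from contextlib import contextmanager


class DeliveryError(Exception):
    """Local stand-in for zigpy.exceptions.DeliveryError."""


class HomeAssistantError(Exception):
    """Local stand-in for homeassistant.exceptions.HomeAssistantError."""


@contextmanager
def wrap_errors():
    """Hypothetical wrapper: convert a radio-level error into a user-facing one."""
    try:
        yield
    except DeliveryError as exc:
        # "from exc" stores the original error in __cause__, which is what
        # produces the "direct cause" line between the two tracebacks.
        raise HomeAssistantError(f"Failed to send request: {exc}") from exc


def turn_on():
    """Hypothetical command that fails at the radio layer."""
    with wrap_errors():
        raise DeliveryError("Failed to deliver message: DELIVERY_FAILED")


try:
    turn_on()
except HomeAssistantError as err:
    caught = err  # keep a reference; "err" is cleared when the block exits

print(repr(caught))
print(repr(caught.__cause__))
```

Read this way, the first traceback is the radio failing to deliver the message, and the second is the same failure surfacing through the service call.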

Additional information

No response

About this issue

  • State: closed
  • Created 4 months ago
  • Reactions: 1
  • Comments: 17 (3 by maintainers)

Most upvoted comments

I put it off for several months as I read the warning it could be a breaking upgrade.

Which one are you referring to?

My belief (still reading through the commit history), is that the newer firmware enforces this failure due to noise and bubbles up the exception.

We actually disabled (as much as possible) this firmware feature in August. If you haven’t explicitly updated the firmware on your Yellow, you’re running the firmware that it came with. ZHA won’t flash new firmware.

make it clear to the user that the Firmware is throwing because of interference

Unfortunately, this isn’t easily possible. The firmware just tells you SUCCESS or FAILURE. Why something failed isn’t accessible to the application (ZHA) beyond the global counters I referenced above.

You are very right, however, that channel access failures should be reported better. It’s something that’s being worked on.

The statistics from the radio that were pulled from your diagnostics info don’t lie:

MAC_TX_UNICAST_SUCCESS = 943
MAC_TX_UNICAST_RETRY = 4294
MAC_TX_UNICAST_FAILED = 1634
PHY_CCA_FAIL_COUNT = 2784

Over 40% of your requests outright fail because the radio’s firmware refused to transmit due to noise. The integration can’t control that; it’s your environment.
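
For readers who want to check the counters themselves, here is a rough sketch of the arithmetic. The comment does not say how the figure was derived, so this rests on assumptions: that success, retry, and failed are disjoint counts of transmit attempts, and that the clear-channel (CCA) failures are measured against their sum. The variable names are illustrative.

```python
# Counters quoted from the diagnostics above.
counters = {
    "MAC_TX_UNICAST_SUCCESS": 943,
    "MAC_TX_UNICAST_RETRY": 4294,
    "MAC_TX_UNICAST_FAILED": 1634,
    "PHY_CCA_FAIL_COUNT": 2784,
}

success = counters["MAC_TX_UNICAST_SUCCESS"]
retry = counters["MAC_TX_UNICAST_RETRY"]
failed = counters["MAC_TX_UNICAST_FAILED"]
cca_fail = counters["PHY_CCA_FAIL_COUNT"]

# Assumption: every success, retry, and failure counts as one transmit attempt.
attempts = success + retry + failed

# Share of attempts where the radio found the channel busy and did not transmit.
cca_share = cca_fail / attempts

# Share of unicasts that ended in failure rather than success (retries excluded).
final_failure_share = failed / (success + failed)

print(f"attempts: {attempts}")
print(f"CCA failures per attempt: {cca_share:.1%}")
print(f"unicasts that ultimately failed: {final_failure_share:.1%}")
```

Under these assumptions, CCA failures come to about 40.5% of attempts, and roughly 63% of unicasts that reached a final outcome failed. Either reading supports the point that the environment is noisy.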

Unless your network was loaded with a ton of Tuya devices, the only change that Z2M performs is moving the Zigbee network to channel 11. ZHA generally avoids channel 11 when picking your network’s channel.