core: Tradfri: manual reload of the integration fails

The problem

follow-up of https://github.com/home-assistant/core/issues/85254#issuecomment-1838614615

which was an attempt to keep some oversight in the Ikea Integration issue of losing connection to the Gateway. that hasn’t changed. New is the fact that even though an automated integration reload seems to initially work alright (bringing back the entities in the dashboard), clicking any of those entities, or, controlling them via scripts etc leads to None type error.

When we try to manually reload the integration once more, there is an error leading to

Scherm­afbeelding 2023-12-04 om 11 34 50

and below error sequence is in the log IMG_3708

Requiring a HA restart, which makes the issue of HA losing connection even worse than before. Having to reload an integration has been a ok-ish working hack to keep things alive. Having to restart it seriously getting to be critical.

I did see this happening before the Ikea firmware update to dec 4th.

What version of Home Assistant Core has the issue?

2023.12.0b1

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

Tradfri

Link to integration documentation on our website

https://www.home-assistant.io/integrations/tradfri/

Diagnostics information

config_entry-tradfri-01c28e550efaa39e54aaf8180d8c10d8.json-2.txt

Example YAML snippet

automated reload that worked up to know.

script:

  reload_tradfri_integration:
    mode: restart
    sequence:
      service: homeassistant.reload_config_entry
      data:
        entry_id: 01c28e550efaa39e54aaf8180d8c10d8

automation:

  - id: reload_tradfri_integration
    mode: single
    trigger:
      - platform: event
        event_type: system_log_event
        event_data:
          level: WARNING
        id: observation
      - platform: event
        event_type: system_log_event
        event_data:
          level: ERROR
        id: failed
    condition:
      - >
        {{now() - this.attributes.last_triggered > timedelta(minutes=5) if
          this.attributes.last_triggered is not none else true}}
#       or:
#         - >
#           {{trigger.event.data.name == 'tradfri.base_class' and 'Observation failed for'
#             in trigger.event.data.message[0]}}
      - >
        {{trigger.event.data.name == 'homeassistant.components.tradfri' and
          'Keep-alive failed' in trigger.event.data.message[0]}}
#         - >
#           {{trigger.event.data.name == 'coap' and
#               'Error received in UDP connection under DTLS' in trigger.event.data.message[0]}}
    action:
      - service: counter.increment
        target:
          entity_id: counter.reload_tradfri_integration
      - service: system_log.write
        data:
          message: >
            Ikea Trådfri: '{{trigger.id}}' on {{trigger.event.data.name}} issue logged:
            {{trigger.event.data.message[0]}}.
            Reloading integration
          level: critical
          logger: homeassistant.components.tradfri
      - service: script.reload_tradfri_integration
      - service: system_log.write
        data:
          message: >
            Ikea Trådfri integration was reloaded because of {{trigger.id}}
            Full data:
            {{trigger.event}}
          level: critical
          logger: homeassistant.components.tradfri

Anything in the logs that might be useful for us?

Logger: root
Source: /usr/src/homeassistant/homeassistant/bootstrap.py:402
First occurred: 10:21:50 (76 occurrences)
Last logged: 11:30:16

Uncaught exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/aiocoap/transports/tinydtls.py", line 170, in wrapper
    return getattr(self, __method)(*args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiocoap/transports/tinydtls.py", line 280, in _read
    self.coaptransport.ctx.dispatch_message(message)
  File "/usr/local/lib/python3.11/site-packages/aiocoap/messagemanager.py", line 109, in dispatch_message
    success = self._process_response(message)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiocoap/messagemanager.py", line 335, in _process_response
    return self.token_manager.process_response(response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiocoap/tokenmanager.py", line 167, in process_response
    if key not in self.outgoing_requests:
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: argument of type 'NoneType' is not iterable

a subsequent manual reload of the integration now also errors with:

Logger: homeassistant.config_entries
Source: config_entries.py:557
First occurred: 11:30:46 (1 occurrences)
Last logged: 11:30:46

Error unloading entry Ikea Trådfri for tradfri
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/config_entries.py", line 557, in async_unload
    result = await component.async_unload_entry(hass, self)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/tradfri/__init__.py", line 149, in async_unload_entry
    await factory.shutdown()
  File "/usr/local/lib/python3.11/site-packages/pytradfri/api/aiocoap_api.py", line 99, in shutdown
    await self._reset_protocol(exc)
  File "/usr/local/lib/python3.11/site-packages/pytradfri/api/aiocoap_api.py", line 87, in _reset_protocol
    await protocol.shutdown()
  File "/usr/local/lib/python3.11/site-packages/aiocoap/protocol.py", line 326, in shutdown
    await item
  File "/usr/local/lib/python3.11/site-packages/aiocoap/tokenmanager.py", line 61, in shutdown
    await self.token_interface.shutdown()
  File "/usr/local/lib/python3.11/site-packages/aiocoap/messagemanager.py", line 65, in shutdown
    for messageerror_monitor, cancellable in self._active_exchanges.values():
                                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'values'

followed up with:

2023-12-04 20:38:13.545 WARNING (MainThread) [py.warnings] /usr/local/lib/python3.11/site-packages/aiocoap/transports/tinydtls.py:166: UserWarning: DTLS module did not shut down the DTLSSocket perfectly; it still tried to call _write in vain
  warnings.warn("DTLS module did not shut down the DTLSSocket "

Additional information

No response

About this issue

  • Original URL
  • State: open
  • Created 7 months ago
  • Reactions: 1
  • Comments: 25 (15 by maintainers)

Most upvoted comments

using a Ikea gateway 1st generation and also a Dirigera . the latter only on experimental base, as it isnt supported other than via HomeKit controller

at the time these loss of connection issues started (2023.1) the Dirigera wasnt there yet, and I dont believe it has to do with Ikea purposely doing that, as the App still works as expected, and even the Home app on my iPhone has no issues. Its really a HA thing (or any of its underlying libraries, like COAP)

2023.12 started with the unloading trouble. Though, I havent seen that for a few connection losses now, so it doesnt happen all the time (before it never did, and we could always restart the integration)

the frequency did go up and sometimes even very quickly too, as if the reload doesnt fully reconnect, or starts a new issue I itself…

Hope you find something, though in all honesty, Ive almost lost faith HA will ever bring this back to regular operation. Which is too bad really, as Ikea still develops its firmware.

Home Assistant should really get a code owner on this integration to recheck its workings and make it up to date again.

please do because the frequency of failures has gone up, and the result has become more fatal: only a Home assistant restart can fix it (where before the integration reload sufficed)

what exactly do you mean with

Add/Remove after firmware update

?

ok, so HA 2024.2 beta seems to report differerent loggings, which I hope can help pinpoint this a bit better.

Upon each restart we now see the Aiocoap message (this didnt happen before, as it was accidental)

Logger: py.warnings
Source: /usr/local/lib/python3.12/warnings.py:109
First occurred: 09:22:08 (1 occurrences)
Last logged: 09:22:08

/usr/local/lib/python3.12/site-packages/aiocoap/transports/tinydtls.py:166: UserWarning: DTLS module did not shut down the DTLSSocket perfectly; it still tried to call _write in vain warnings.warn("DTLS module did not shut down the DTLSSocket "

and next:

Logger: coap
Source: runner.py:188
First occurred: 09:22:08 (1 occurrences)
Last logged: 09:22:08

Connection loss was not expected.

I must mention that here, because it then leads to our automation trying to read the integration, as is the subject of this issue.

Logger: homeassistant.components.tradfri
Source: components/system_log/__init__.py:306
Integration: IKEA TRÅDFRI (documentation, issues)
First occurred: 09:22:09 (1 occurrences)
Last logged: 09:22:09

Ikea Trådfri: 'failed' on coap issue logged: Connection loss was not expected.. Reloading integration

and this is the latest, which I hope has more info for the devs to check what is happening:

Logger: homeassistant.components.script.reload_tradfri_integration
Source: helpers/script.py:1805
Integration: Script (documentation, issues)
First occurred: 09:22:09 (1 occurrences)
Last logged: 09:22:09

reload_tradfri_integration: Error executing script. Error for call_service at pos 1: The config entry Ikea Trådfri (tradfri) with entry_id 01c28e550efaa39e54aaf8180d8c10d8 cannot be unloaded because it is not in a recoverable state (ConfigEntryState.SETUP_IN_PROGRESS)

and finally this again:

Logger: py.warnings
Source: components/unifi/controller.py:384
First occurred: 09:23:47 (1 occurrences)
Last logged: 09:23:47

/usr/local/lib/python3.12/site-packages/aiocoap/transports/tinydtls.py:166: UserWarning: DTLS module did not shut down the DTLSSocket perfectly; it still tried to call _write in vain warnings.warn("DTLS module did not shut down the DTLSSocket "

so even though I try to keeps things separate, it must be seen in connection to the other issue I am afraid.

so, Integration failing more and more daily now. While at the same time, eg Google Assistant doesnt have an issue at all, or, ofc, the Ikea Tradfri app itself.

thing ism tha any script or automation using an Integration entity is suffering, and we can do but s much to guard all those entities.

just look at this…

Scherm­afbeelding 2024-01-31 om 10 41 50

its horror currently, and no one is bothered?

you’d better not ping like that, it is considered rude… for polls we have the community, or eg Discord

that known issues paragraph in fact says, users are on their own.

how are users supposed to:

As a first solution, try to limit the number of requests, and move the Gateway closer to the device running pytradfri on the nework

If the gateway is lan cabled to the HA system, and request, well, are out of control other than turning on/off a light now and then…

I wonder how they can state this project can do nothing, if it is not seriously investigated, and ‘anecdotal’ evidence proves the issue in increasing. only so by either updating HA, or the Tadfri gateway.

its been since januari this is getting closed and closed again https://github.com/home-assistant-libs/pytradfri/issues/617#issuecomment-1399367301 without as much a hint of help

failing more tha ever, and more vehemently than ever. My auto reload automation has been working since January, and didnt cause an issue.

currently I believe 1 out if 3, it fails to reload and the aforementioned error pops up…

so this has definitely gotten worse, up to the breaking level I would say.

Too bad there’s no response from the team, as if we’re writing in the void. Especially since this has been going on for almost a year now…