core: `aioesphomeapi` floods of errors+nodes slow
The problem
Since 2023.2, log is flooded with errors like below. ESPHome nodes go randomly offline, and became slow-responsive.
Things which are certain:
- Before HA 2023.2 everything was nice and snappy, no errors in the logs
- ESPHome nodes have up-to date firmware (at the time of writing this 2023.2.4)
- HA 2023.3 didn’t fix this.
What version of Home Assistant Core has the issue?
2023.3.0
What was the last working version of Home Assistant Core?
2023.1.x
What type of installation are you running?
Home Assistant OS
Integration causing the issue
aioesphomeapi
Link to integration documentation on our website
https://www.home-assistant.io/integrations/esphome/
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
2023-03-02 15:40:11.629 WARNING (MainThread) [aioesphomeapi.connection] lampa-wc-fent @ 192.168.20.11: Connection error occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/connection.py", line 537, in send_message_await_response_complex
await fut
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/connection.py", line 536, in send_message_await_response_complex
async with async_timeout.timeout(timeout):
File "/usr/local/lib/python3.10/site-packages/async_timeout/__init__.py", line 129, in __aexit__
self._do_exit(exc_type)
File "/usr/local/lib/python3.10/site-packages/async_timeout/__init__.py", line 212, in _do_exit
raise asyncio.TimeoutError
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/connection.py", line 320, in _keep_alive_loop
await self._ping()
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/connection.py", line 645, in _ping
await self.send_message_await_response(PingRequest(), PingResponse)
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/connection.py", line 554, in send_message_await_response
res = await self.send_message_await_response_complex(
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/connection.py", line 539, in send_message_await_response_complex
raise TimeoutAPIError(
aioesphomeapi.core.TimeoutAPIError: Timeout waiting for response for <class 'api_pb2.PingRequest'> after 10.0s
2023-03-02 15:40:12.462 WARNING (MainThread) [aioesphomeapi.connection] lampa-venti-zuhany @ 172.22.215.32: Connection error occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/connection.py", line 537, in send_message_await_response_complex
await fut
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/connection.py", line 536, in send_message_await_response_complex
async with async_timeout.timeout(timeout):
File "/usr/local/lib/python3.10/site-packages/async_timeout/__init__.py", line 129, in __aexit__
self._do_exit(exc_type)
File "/usr/local/lib/python3.10/site-packages/async_timeout/__init__.py", line 212, in _do_exit
raise asyncio.TimeoutError
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/connection.py", line 320, in _keep_alive_loop
await self._ping()
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/connection.py", line 645, in _ping
await self.send_message_await_response(PingRequest(), PingResponse)
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/connection.py", line 554, in send_message_await_response
res = await self.send_message_await_response_complex(
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/connection.py", line 539, in send_message_await_response_complex
raise TimeoutAPIError(
aioesphomeapi.core.TimeoutAPIError: Timeout waiting for response for <class 'api_pb2.PingRequest'> after 10.0s
Additional information
Accesing HA user interface through a reverse proxy became also painfully slow - not sure if it has to do with this, but the symptoms appeared in the same time.
https://github.com/home-assistant/core/issues/88535 https://github.com/home-assistant/core/issues/85432#issuecomment-1422633517 https://github.com/home-assistant/core/issues/87158#issuecomment-1436493301
https://discord.com/channels/330944238910963714/1012073425638015037/1061867739863597127
The aioesphomeapi lib has also been optimized reduce latency and improve performance as well on dev (will help all esphome devices not just ones with a Bluetooth proxy). At this point any future improvements are probably going to come from ESP – IDF 5 and the protobuf cpp switch (bdraco — 2023.01.09. 5:42)
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 126 (111 by maintainers)
Commits related to this issue
- Cache transient templates compiles provided via api partially fixes #89047 (there is more going on here) — committed to home-assistant/core by bdraco a year ago
- Cache transient templates compiles provided via api (#89065) * Cache transient templates compiles provided via api partially fixes #89047 (there is more going on here) * add a bit more coverage... — committed to home-assistant/core by bdraco a year ago
- Bump aioesphomeapi to 13.4.2 fixes #85062 fixes #88535 fixes #89047 — committed to home-assistant/core by bdraco a year ago
- Cache transient templates compiles provided via api (#89065) * Cache transient templates compiles provided via api partially fixes #89047 (there is more going on here) * add a bit more coverage... — committed to home-assistant/core by bdraco a year ago
I made some more tweaks to https://github.com/esphome/aioesphomeapi/pull/395
With the changes, even the device with the poor connection now stays connected to HA.
I’ve published the new version as https://github.com/bdraco/esphome_testing
You can install it manually by copying it into
/config/custom_components/esphomeIf you have HACS you can do it through HACS as a custom repo under HACS->Integrations->… menu->Custom Repo
After restarting check your log to make sure you are running the custom version
2023-03-06 05:09:01.898 WARNING (SyncWorker_2) [homeassistant.loader] We found a custom integration esphome which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home AssistantNo disconnects overnight with https://github.com/esphome/aioesphomeapi/pull/395 for the devices with poor wifi connections
awesome! I see that this will be merged for the next core release 2023.3.2. I’ll make sure to delete the HACS CC, and then update and then restart HA (if that is the right wat to do this).
I’ve restored
power_save_mode: LIGHTto some of the previously affected ESP8266 nodes after installing this CC, let’s see…I opened a PR to make these a bit easier to track as well https://github.com/home-assistant/core/pull/89171
I’m open to test it as a custom component for you.
Enabling
power_save_mode: LIGHTcauses my devices to disconnect as well.With https://github.com/esphome/aioesphomeapi/pull/395 I’m not seeing the disconnects anymore
Because I deliberately enabled the debugger.
I’ll let it run and closely observe for a few days, and let you know here. Thank you for your help.
U got mail at koston.org. About the second .svg I’m sure it cought it, the first one fifty/fifty.
Yeah that brought up the usual top-like screen…
Where did this put the record file?
Edit: it’s there, but not visible in the smb share…
cp 66-2023-03-02T20:24:26+01:00.svg 66.svgrevealed it