core: After update from 2023.2.5 to 2023.3.6 esphome bluetooth stack takes 25 to retry connecting after error (error not seen)
The problem
As a topic. As we know, since 2023.3 update we had problem with error index array out of range. We have update HA 2023.3.6 and this problem was fixed, but, new problem happends.
Lock FW is newest
Iam using BT-proxy - just say this to know that BT connections and signal strength should be very fine. Hardware with High performance bt dongle, 3m USB cable, away from USB 3.0 ports. Till 2023.x HA everything working very well.
Now problem is, that lock goes offline. Not unavialable, just offline - this says log. Cant connect to it, via HA. BT proxy log, says this, HA logs too. Connection takes over 50 seconds. I must say, that if i using Switchbot app - everything working ok, lock/unlock takes less than 1 sec.
And i have other devices from Switchbot and everyone working ok, without problem, without delay etc.
What version of Home Assistant Core has the issue?
2023.3.6
What was the last working version of Home Assistant Core?
2023.2.x
What type of installation are you running?
Home Assistant OS
Integration causing the issue
SwitchBot
Link to integration documentation on our website
https://www.home-assistant.io/integrations/switchbot
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
C1:89:44:4B:2B:A5
From bbt proxy log:
[13:14:00][I][bluetooth_proxy:250]: [0] [lockmacaddress] Connecting v3 with cache
[13:14:00][D][esp32_ble_tracker:214]: Pausing scan to make connection...
[13:14:00][I][esp32_ble_client:064]: [0] [lockmacaddress] 0x01 Attempting BLE connection
......
[13:14:01][W][esp32_ble_client:134]: [0] [] Connection failed, status=133
.....
.....
[13:14:51][D][esp-idf:000]: W (60434078) BT_HCI: hcif disc complete: hdl 0x0, rsn 0x3e
as You see, from unlock command to completed and lock unlock - 51sec
from HA logs:
Logger: homeassistant.core
Source: components/switchbot/lock.py:54
First occurred: 12:43:35 (1 occurrences)
Last logged: 12:43:35
Error executing service: <ServiceCall lock.unlock (c:01GWC96ENHARJM7KZGYPZCJM69): entity_id=['lock.zamek_switchbot']>
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/client.py", line 550, in bluetooth_device_connect
await event.wait()
File "/usr/local/lib/python3.10/asyncio/locks.py", line 214, in wait
await fut
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/client.py", line 549, in bluetooth_device_connect
async with async_timeout.timeout(timeout):
File "/usr/local/lib/python3.10/site-packages/async_timeout/__init__.py", line 129, in __aexit__
self._do_exit(exc_type)
File "/usr/local/lib/python3.10/site-packages/async_timeout/__init__.py", line 212, in _do_exit
raise asyncio.TimeoutError
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/components/esphome/bluetooth/client.py", line 100, in _async_wrap_bluetooth_operation
return await func(self, *args, **kwargs)
File "/usr/src/homeassistant/homeassistant/components/esphome/bluetooth/client.py", line 314, in connect
await self._client.bluetooth_device_connect(
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/client.py", line 575, in bluetooth_device_connect
raise TimeoutAPIError(
aioesphomeapi.core.TimeoutAPIError: Timeout waiting for connect response while connecting to lockmacaddress after 20.0s, disconnect timed out: True, after 5.0s
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/bleak_retry_connector/__init__.py", line 342, in establish_connection
await client.connect(
File "/usr/src/homeassistant/homeassistant/components/bluetooth/wrappers.py", line 249, in connect
connected = await super().connect(**kwargs)
File "/usr/local/lib/python3.10/site-packages/bleak/__init__.py", line 471, in connect
return await self._backend.connect(**kwargs)
File "/usr/src/homeassistant/homeassistant/components/esphome/bluetooth/client.py", line 102, in _async_wrap_bluetooth_operation
raise asyncio.TimeoutError(str(err)) from err
asyncio.exceptions.TimeoutError: Timeout waiting for connect response while connecting to lockmacaddress after 20.0s, disconnect timed out: True, after 5.0s
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/core.py", line 1826, in catch_exceptions
await coro_or_task
File "/usr/src/homeassistant/homeassistant/core.py", line 1845, in _execute_service
await cast(Callable[[ServiceCall], Awaitable[None]], handler.job.target)(
File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 213, in handle_service
await service.entity_service_call(
File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 686, in entity_service_call
future.result() # pop exception if have
File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 961, in async_request_call
await coro
File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 726, in _handle_entity_call
await result
File "/usr/src/homeassistant/homeassistant/components/lock/__init__.py", line 106, in _async_unlock
await entity.async_unlock(**remove_entity_service_fields(service_call))
File "/usr/src/homeassistant/homeassistant/components/switchbot/lock.py", line 54, in async_unlock
self._last_run_success = await self._device.unlock()
File "/usr/local/lib/python3.10/site-packages/switchbot/devices/lock.py", line 168, in unlock
return await self._lock_unlock(
File "/usr/local/lib/python3.10/site-packages/switchbot/devices/lock.py", line 189, in _lock_unlock
await self._enable_notifications()
File "/usr/local/lib/python3.10/site-packages/switchbot/devices/lock.py", line 255, in _enable_notifications
result = await self._send_command(COMMAND_ENABLE_NOTIFICATIONS)
File "/usr/local/lib/python3.10/site-packages/switchbot/devices/lock.py", line 299, in _send_command
result = await self._ensure_encryption_initialized()
File "/usr/local/lib/python3.10/site-packages/switchbot/devices/lock.py", line 314, in _ensure_encryption_initialized
result = await self._send_command(
File "/usr/local/lib/python3.10/site-packages/switchbot/devices/lock.py", line 297, in _send_command
return await super()._send_command(key[:2] + "000000" + key[2:], retry)
File "/usr/local/lib/python3.10/site-packages/switchbot/devices/device.py", line 180, in _send_command
return await self._send_command_locked(key, command)
File "/usr/local/lib/python3.10/site-packages/switchbot/devices/device.py", line 397, in _send_command_locked
await self._ensure_connected()
File "/usr/local/lib/python3.10/site-packages/switchbot/devices/device.py", line 264, in _ensure_connected
client: BleakClientWithServiceCache = await establish_connection(
File "/usr/local/lib/python3.10/site-packages/bleak_retry_connector/__init__.py", line 359, in establish_connection
_raise_if_needed(name, device.address, exc)
File "/usr/local/lib/python3.10/site-packages/bleak_retry_connector/__init__.py", line 308, in _raise_if_needed
raise BleakNotFoundError(msg) from exc
bleak_retry_connector.BleakNotFoundError: (lockmacaddress) - lockmacaddress: Failed to connect: Timeout waiting for connect response while connecting to lockmacaddress after 20.0s, disconnect timed out: True, after 5.0s
and
Logger: switchbot.devices.device
Source: /usr/local/lib/python3.10/site-packages/switchbot/devices/device.py:182
First occurred: 12:43:35 (1 occurrences)
Last logged: 12:43:35
(lockmacaddress): device not found, no longer in range, or poor RSSI: -73
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/client.py", line 550, in bluetooth_device_connect
await event.wait()
File "/usr/local/lib/python3.10/asyncio/locks.py", line 214, in wait
await fut
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/client.py", line 549, in bluetooth_device_connect
async with async_timeout.timeout(timeout):
File "/usr/local/lib/python3.10/site-packages/async_timeout/__init__.py", line 129, in __aexit__
self._do_exit(exc_type)
File "/usr/local/lib/python3.10/site-packages/async_timeout/__init__.py", line 212, in _do_exit
raise asyncio.TimeoutError
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/components/esphome/bluetooth/client.py", line 100, in _async_wrap_bluetooth_operation
return await func(self, *args, **kwargs)
File "/usr/src/homeassistant/homeassistant/components/esphome/bluetooth/client.py", line 314, in connect
await self._client.bluetooth_device_connect(
File "/usr/local/lib/python3.10/site-packages/aioesphomeapi/client.py", line 575, in bluetooth_device_connect
raise TimeoutAPIError(
aioesphomeapi.core.TimeoutAPIError: Timeout waiting for connect response while connecting to lockmacaddress after 20.0s, disconnect timed out: True, after 5.0s
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/bleak_retry_connector/__init__.py", line 342, in establish_connection
await client.connect(
File "/usr/src/homeassistant/homeassistant/components/bluetooth/wrappers.py", line 249, in connect
connected = await super().connect(**kwargs)
File "/usr/local/lib/python3.10/site-packages/bleak/__init__.py", line 471, in connect
return await self._backend.connect(**kwargs)
File "/usr/src/homeassistant/homeassistant/components/esphome/bluetooth/client.py", line 102, in _async_wrap_bluetooth_operation
raise asyncio.TimeoutError(str(err)) from err
asyncio.exceptions.TimeoutError: Timeout waiting for connect response while connecting to lockmacaddress after 20.0s, disconnect timed out: True, after 5.0s
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/switchbot/devices/device.py", line 180, in _send_command
return await self._send_command_locked(key, command)
File "/usr/local/lib/python3.10/site-packages/switchbot/devices/device.py", line 397, in _send_command_locked
await self._ensure_connected()
File "/usr/local/lib/python3.10/site-packages/switchbot/devices/device.py", line 264, in _ensure_connected
client: BleakClientWithServiceCache = await establish_connection(
File "/usr/local/lib/python3.10/site-packages/bleak_retry_connector/__init__.py", line 359, in establish_connection
_raise_if_needed(name, device.address, exc)
File "/usr/local/lib/python3.10/site-packages/bleak_retry_connector/__init__.py", line 308, in _raise_if_needed
raise BleakNotFoundError(msg) from exc
bleak_retry_connector.BleakNotFoundError: (lockmacaddress) - lockmacaddress: Failed to connect: Timeout waiting for connect response while connecting to lockmacaddress after 20.0s, disconnect timed out: True, after 5.0s
Additional information
No response
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 101 (39 by maintainers)
I also fixed the disconnection race while I reworked that esphome code so it should be a bit more reliable as well
ok, will open new issue. thanks
ok iam for 99% sure that problem solved. Even if first time connection failes, reconnect is right away and finally second time lock opened. Its takes about 2 sec and its very ok.
I didn’t realize you were trying on the old version. It will only work on 2023.5.x
Yes thats the goal.
It looks like it won’t work because of the conflict with the sub dep. Thats a shame as I would really have liked to get a good test but I’m 99% sure I’ve solved it so go ahead and remove the custom version and let’s see if 2023.5.3 does the trick
Helps to be looking in the right place.
Thanks @KitHubek for all the debugging and logs 👍
@KitHubek I’ve confident the linked PR will fix the issue, but it would be great to get a test
Verify that the version displayed above is “aioesphomeapi==13.7.4” and NOT 3.7.3
Restart HA
Test
Delete
/config/custom_components/esphomeand restart after testing is completed to restore the core versionhttps://github.com/home-assistant/core/pull/new/ble_retry_fix
The fix is in this branch.
Github is having problems at the moment so I can’t create a PR to fix it
https://github.com/esphome/aioesphomeapi/pull/391 is definitely the cause of the breakage
So I suspect what is happening is the connection is initially successful that than we get a follow message that the connection has failed and since https://github.com/esphome/aioesphomeapi/pull/391 now resolves the connect right away we never see the error and we have to wait the full timeout
related PRs during that timeframe
https://github.com/esphome/aioesphomeapi/pull/391
Hey there @ottowinter, @jesserockz, mind taking a look at this issue as it has been labeled with an integration (
esphome) you are listed as a code owner for? Thanks!Code owner commands
Code owners of
esphomecan trigger bot actions by commenting:@home-assistant closeCloses the issue.@home-assistant rename Awesome new titleRenames the issue.@home-assistant reopenReopen the issue.@home-assistant unassign esphomeRemoves the current integration label and assignees on the issue, add the integration domain after the command.(message by CodeOwnersMention)
esphome documentation esphome source (message by IssueLinks)
In the mean time can you turn off logs for bleak and turn on logs for aioesphomeapi and homeassistant.components.esphome and get it to fail to connect again?
It’s good to know that the problem isn’t in the SwitchBot integration and it’s deeper in the Bluetooth stack.
I’ve spent most of the available time digging through the SwitchBot code so I’ve put the effort in the wrong place.
My day job is really busy right now but I will try to find some time this week to dig through the Bluetooth stack (it’s 7 or 8 libraries) and see if I can at least get a better way to debug this
So the problem is likely that the error isn’t being propagated back. 25 seconds is the timeout to try again if we get no response
ok, when avialable 2023.3.7 will update and send logs here. For now stays on 2023.2.5 because its lastes ver. working ok