core: HA is not reconnecting to ESPHome devices
The problem
Over the past few months I have been experiencing an issue where HA would disconnect from an ESPHome device and not try to reconnect. I included below some of the logs I am seeing. This issue is very statistical, I saw it happen once a month and I saw it happen a couple of times a week (even once a day).
Even more strange is the fact that none of these logs are at the time of disconnection (in the example below - the ESPHome device is disconnected from HA since 1PM and these logs are from around 10AM and there are no logs regarding ESPHome from around 1PM)
The ESPHome device itself is connected to the WiFi network and is accessible via the web interface. I am also using an API password and encryption key, a static IP, power save mode NONE and other recommended features enabled that should have fixed this issue. (see YAML example below).
I tried to reset the ESPHome device and update its ESPHome version - but it did not solve the issue (HA is still disconnected). However, when I reload the ESPHome integration - it all works (until it disconnects again). This is what led me to believe that the issue is probably in HA and not in ESPHome.
If there’s anything else I can provide to help debug this issue, please let me know. Thank you in advance!
What version of Home Assistant Core has the issue?
core-2022.2.0
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
ESPHome
Link to integration documentation on our website
https://www.home-assistant.io/integrations/esphome/
Example YAML snippet
esphome:
name: "esp2"
platform: "ESP8266"
board: "esp01_1m"
logger:
api:
password: !secret api_password
encryption:
key: !secret api_key
ota:
password: !secret ota_password
wifi:
ssid: !secret wifi_ssid
password: !secret wifi_password
power_save_mode: NONE
fast_connect: on
reboot_timeout: 5min
manual_ip:
static_ip: "****"
gateway: !secret wifi_gateway
subnet: !secret wifi_subnet
dns1: !secret wifi_dns1
dns2: !secret wifi_dns2
ap:
ssid: !secret fallback_wifi_ssid
password: !secret fallback_wifi_password
ap_timeout: 2min
captive_portal:
web_server:
port: 80
include_internal: true
sensor:
- platform: wifi_signal
name: "WiFi Signal Sensor"
internal: true
update_interval: 60s
output:
- platform: esp8266_pwm
pin: 12
frequency: 1000 Hz
id: pwm_b
- platform: esp8266_pwm
pin: 13
frequency: 1000 Hz
id: pwm_g
- platform: esp8266_pwm
pin: 15
frequency: 1000 Hz
id: pwm_r
light:
- platform: rgb
name: "Light"
id: esp2_light
red: pwm_r
green: pwm_g
blue: pwm_b
Anything in the logs that might be useful for us?
2022-02-03 10:18:25 WARNING (MainThread) [aioesphomeapi.reconnect_logic] Can't connect to ESPHome API for esp1 @ ***: Error connecting to ('***', 6053): [Errno 113] Connect call failed ('***', 6053)
2022-02-03 10:19:09 WARNING (MainThread) [aioesphomeapi.reconnect_logic] Can't connect to ESPHome API for esp2 @ ***: Timeout waiting for response for <class 'api_pb2.ConnectRequest'>
2022-02-03 10:24:12 WARNING (MainThread) [homeassistant.components.websocket_api.http.connection] [548034806976] Disconnected: Did not receive auth message within 10 seconds
2022-02-03 10:24:32 WARNING (MainThread) [homeassistant.components.websocket_api.http.connection] [548271912416] Disconnected: Did not receive auth message within 10 seconds
Additional information
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 16 (8 by maintainers)
Thanks for the logs guys/gals!
Using the logs I was able to identify one real problem, when it’s merged perhaps you can check if https://github.com/home-assistant/core/pull/66189 fixes the issue for you.
Specifically (from the logs from @RobinMarsollier) here’s an example.
First the device disconnects, and a reconnect is immediately started
That socket is opened successfully and a TCP connection established, as we can see by this later message
https://github.com/esphome/aioesphomeapi/blob/v10.8.1/aioesphomeapi/connection.py#L167
Next is initializing the frame helper and performing the handshake, and indeed we can see an (empty) ClientHello handshake frame being transmitted in the next log line
https://github.com/esphome/aioesphomeapi/blob/v10.8.1/aioesphomeapi/_frame_helper.py#L148
Now the logging here isn’t great (unable to see which
Received framecorresponds to which connection), but I think what’s happening is the connection is silently closed (for example by the device dropping off the network at just the right time), and so the read_frame to get the server’s response just hangs forever.That is a bug and is fixed by https://github.com/home-assistant/core/pull/66189, but I can’t say with certainty if it’s the issue you’re seeing.