core: Shelly 2.5 switches unavailable in HA when upgrading to 0.118.4 with Shelly vanilla firmware ⩾ 1.9.0

The problem

When upgrading from 0.117(.?) to 0.118.4, my Shelly 2.5 devices (FW 1.9.2) became unavailable. To be more precise:

  • my fleet of Shelly Dimmers and Shelly 1/1PM still interfaces with HA like a charm;
  • my Shelly 2.5 switches previously added in HA no longer work (unavailable), while being up and working perfectly in direct Shelly mode (web UI, GET to /status…);
  • a new Shelly 2.5 switch I just added is discovered in the Integrations screen and its IP address is proposed, but adding it fails with a Failed to connect error.

Environment

Problem-relevant configuration.yaml

No specifics in configuration.yaml at this stage.

Traceback/Error logs

2020-11-29 11:01:45 DEBUG (MainThread) [homeassistant.components.shelly] Manually updated <hidden> data
2020-11-29 11:01:56 DEBUG (MainThread) [homeassistant.components.shelly] Manually updated <hidden> data
2020-11-29 11:01:56 WARNING (MainThread) [homeassistant.config_entries] Config entry for shelly not ready yet. Retrying in 80 seconds
2020-11-29 11:01:56 WARNING (MainThread) [homeassistant.config_entries] Config entry for shelly not ready yet. Retrying in 80 seconds
2020-11-29 11:01:56 WARNING (MainThread) [homeassistant.config_entries] Config entry for shelly not ready yet. Retrying in 80 seconds
2020-11-29 11:02:00 DEBUG (MainThread) [homeassistant.components.shelly] Manually updated <hidden> data
2020-11-29 11:02:00 WARNING (MainThread) [homeassistant.config_entries] Config entry for shelly not ready yet. Retrying in 80 seconds
2020-11-29 11:02:00 DEBUG (MainThread) [homeassistant.components.shelly] Manually updated <hidden> data
2020-11-29 11:02:00 DEBUG (MainThread) [homeassistant.components.shelly] Manually updated <hidden> data

The Manually updated lines refer to Shelly devices that still work with HA (I just hid their names), while, I guess, the four Config entry for shelly not ready yet lines refer to the failing 2.5 switches. This log pattern repeats periodically within the core container.

Additional information

Shelly 2.5 upgraded to the latest firmware, 20201128-102046/v1.9.2.
Working Shelly Dimmers & 1 upgraded to the latest firmware, 20201124-092706/v1.9.0.
I cannot say whether 0.118.4 worked with previous versions of the Shelly firmware, since I updated everything at once (my bad).

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 32 (16 by maintainers)

Most upvoted comments

@thecode, I confirm this now works with FW 1.9.3-rc2, and will definitely investigate why these UDP fragmented packets are dropped (couldn’t find anything relevant at this stage linked to some Synology OS specifics).

Thank you all for your help and time!

And some UDP traces (I dropped advertising frames):

  • 1st one antoine-radiateur is w/ FW 1.8.5 when clicking on Submit;
  • 2nd one parents-radiateur is w/ FW 1.9.3-rc1 when clicking on Submit.
ash-4.3# tcpdump 'ip host 192.168.185.47 and udp port 5683' -vv
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes

22:09:43.962994 IP (tos 0x0, ttl 64, id 28322, offset 0, flags [DF], proto UDP (17), length 39)
    patator.5683 > antoine-radiateur.5683: [bad udp cksum 0xf3dd -> 0x7260!] UDP, length 11
22:09:44.046624 IP (tos 0x0, ttl 255, id 1683, offset 0, flags [none], proto UDP (17), length 1177)
    antoine-radiateur.5683 > patator.5683: [udp sum ok] UDP, length 1149
22:09:44.047416 IP (tos 0x0, ttl 64, id 28405, offset 0, flags [DF], proto UDP (17), length 39)
    patator.5683 > antoine-radiateur.5683: [bad udp cksum 0xf3dd -> 0x7251!] UDP, length 11
22:09:44.056917 IP (tos 0x0, ttl 255, id 1685, offset 0, flags [none], proto UDP (17), length 299)
    antoine-radiateur.5683 > patator.5683: [udp sum ok] UDP, length 271
22:09:45.864727 IP (tos 0x0, ttl 64, id 28766, offset 0, flags [DF], proto UDP (17), length 39)
    patator.5683 > antoine-radiateur.5683: [bad udp cksum 0xf3dd -> 0x7260!] UDP, length 11
22:09:45.894776 IP (tos 0x0, ttl 255, id 1705, offset 0, flags [none], proto UDP (17), length 1177)
    antoine-radiateur.5683 > patator.5683: [udp sum ok] UDP, length 1149
22:09:45.895435 IP (tos 0x0, ttl 64, id 28792, offset 0, flags [DF], proto UDP (17), length 39)
    patator.5683 > antoine-radiateur.5683: [bad udp cksum 0xf3dd -> 0x7251!] UDP, length 11
22:10:00.860918 IP (tos 0x0, ttl 64, id 33787, offset 0, flags [DF], proto UDP (17), length 39)
    patator.5683 > antoine-radiateur.5683: [bad udp cksum 0xf3dd -> 0x7260!] UDP, length 11
22:10:00.879612 IP (tos 0x0, ttl 255, id 1728, offset 0, flags [none], proto UDP (17), length 1177)
    antoine-radiateur.5683 > patator.5683: [udp sum ok] UDP, length 1149
22:10:00.880193 IP (tos 0x0, ttl 64, id 33797, offset 0, flags [DF], proto UDP (17), length 39)
    patator.5683 > antoine-radiateur.5683: [bad udp cksum 0xf3dd -> 0x7251!] UDP, length 11
22:10:00.929138 IP (tos 0x0, ttl 255, id 1729, offset 0, flags [none], proto UDP (17), length 299)
    antoine-radiateur.5683 > patator.5683: [udp sum ok] UDP, length 271
^C
14 packets captured
15 packets received by filter
0 packets dropped by kernel


ash-4.3# tcpdump 'ip host 192.168.185.198 and udp port 5683' -vv
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes

22:10:22.581315 IP (tos 0x0, ttl 64, id 43347, offset 0, flags [DF], proto UDP (17), length 39)
    patator.5683 > parents-radiateur.5683: [bad udp cksum 0xf474 -> 0x71c9!] UDP, length 11
22:10:22.641568 IP (tos 0x0, ttl 255, id 10739, offset 0, flags [+], proto UDP (17), length 1500)
    parents-radiateur.5683 > patator.5683: UDP, bad length 1485 > 1472 <===== !
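
Longer captures can be scanned for fragmented replies automatically. The following is a minimal sketch, assuming the text layout that tcpdump -vv prints in the traces above; the names HEADER, SAMPLE and find_fragments are illustrative and not part of tcpdump or Home Assistant. It flags any IP packet whose flags contain + (more fragments) or whose fragment offset is non-zero.

```python
import re

# Matches the IP header line that `tcpdump -vv` prints for each UDP packet,
# as seen in the traces above (layout assumed from those traces).
HEADER = re.compile(
    r"id (?P<id>\d+), offset (?P<offset>\d+), flags \[(?P<flags>[^\]]*)\], "
    r"proto UDP \(17\), length (?P<length>\d+)\)"
)

# Two header lines copied from the traces above: one whole packet, one fragment.
SAMPLE = """\
22:09:44.046624 IP (tos 0x0, ttl 255, id 1683, offset 0, flags [none], proto UDP (17), length 1177)
22:10:22.641568 IP (tos 0x0, ttl 255, id 10739, offset 0, flags [+], proto UDP (17), length 1500)
"""


def find_fragments(dump: str) -> list[dict]:
    """Return the packets in a tcpdump -vv text dump that are IP fragments."""
    fragments = []
    for line in dump.splitlines():
        match = HEADER.search(line)
        if match is None:
            continue  # payload lines, banners, counters
        more_fragments = "+" in match["flags"]  # tcpdump shows MF as "+"
        offset = int(match["offset"])
        if more_fragments or offset > 0:
            fragments.append(
                {
                    "id": int(match["id"]),
                    "offset": offset,
                    "length": int(match["length"]),
                    "more_fragments": more_fragments,
                }
            )
    return fragments
```

Calling find_fragments(SAMPLE) reports only the packet with id 10739, the 1500-byte first fragment sent by parents-radiateur. Note that the capture filter used above (udp port 5683) may hide the trailing fragments, since they carry no UDP header.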

Could you downgrade firmware to 1.9.0 on one of the problematic devices?

You can update your Shelly 2.5 with the same URL as before; this should solve the problem for now. I suggest checking why fragmented UDP packets are dropped on your system.

I’ve found the root cause of the problem you are experiencing with firmware versions >1.8.5 on the Shelly 2.5. Events from the device are sent as a UDP CoAP message, and the size of this message was increased in version 1.9.0 to allow for future device functionality. Because of this increase, the UDP packet is fragmented into two packets that must be reassembled, and some systems do not reassemble it correctly. Please record the traffic from the device IP and port 5683. In addition, try to check whether this is a known issue for your system and what the options are to fix it. I’m discussing with the Shelly devs what can be done to overcome this problem in another way. If you have a Shelly 3EM, it already uses a large CoAP packet and should not work for you either.
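
The fragmentation arithmetic can be checked with a short sketch. It assumes a standard 1500-byte Ethernet MTU, a 20-byte IPv4 header without options, and that the 1485 in the tcpdump message "bad length 1485 > 1472" is the UDP payload size; the function names are illustrative only.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes


def max_unfragmented_payload(mtu: int = 1500) -> int:
    """Largest UDP payload that fits in a single IPv4 packet."""
    return mtu - IPV4_HEADER - UDP_HEADER


def fragment_count(udp_payload: int, mtu: int = 1500) -> int:
    """Number of IPv4 packets needed to carry one UDP datagram."""
    ip_payload = udp_payload + UDP_HEADER  # the UDP header travels in fragment 1
    room = mtu - IPV4_HEADER               # IP payload bytes per packet
    if ip_payload <= room:
        return 1
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = room // 8 * 8
    return 1 + math.ceil((ip_payload - room) / per_fragment)
```

Under these assumptions, the 1149-byte reply from the FW 1.8.5 device fits in one packet, while a 1485-byte payload exceeds the 1472-byte limit by 13 bytes and is split in two, which matches the "fragmented into two packets" description.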

I’ve prepared a modified integration for you. You can use it as a custom component or replace config_flow.py and __init__.py in your HA installation. Repo: https://github.com/bieniu/ha-shelly-test-1.9.2

You can downgrade using this URL, replacing __ip__ with your device IP address: http://__ip__/ota?url=http://archive.shelly-faq.de/version/v1.9.0/SHSW-25.zip
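
For several devices, the downgrade URL can be built in a script instead of by hand. This is a minimal sketch using only the Python standard library; build_ota_url and trigger_downgrade are hypothetical helper names, and the only device endpoint used is the /ota?url=… form quoted above.

```python
import ipaddress
import urllib.request

# Firmware archive URL taken verbatim from the comment above.
FIRMWARE_URL = "http://archive.shelly-faq.de/version/v1.9.0/SHSW-25.zip"


def build_ota_url(device_ip: str, firmware_url: str = FIRMWARE_URL) -> str:
    """Fill the device IP into the OTA URL pattern quoted above."""
    ipaddress.ip_address(device_ip)  # raises ValueError for a malformed address
    return f"http://{device_ip}/ota?url={firmware_url}"


def trigger_downgrade(device_ip: str, timeout: float = 10.0) -> int:
    """Send the OTA request to the device and return the HTTP status code."""
    with urllib.request.urlopen(build_ota_url(device_ip), timeout=timeout) as resp:
        return resp.status
```

Only trigger_downgrade touches the network, so build_ota_url can be used first to print and review the URLs before anything is sent to a device.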