core: Permanent error: Client exceeded max pending messages [2]: 512
Home Assistant release with the issue:
HA 0.98.5
Last working Home Assistant release (if known): HA 0.91 (0.92.0?)
Operating environment (Hass.io/Docker/Windows/etc.):
HASS.io/Docker/Raspbian Buster
Component/platform:
homeassistant.components.websocket_api.http.connection
Description of problem: I have to reopen the issue (https://github.com/home-assistant/home-assistant/issues/23938) since it was unreasonably closed (some people have the same problem but in different environments). To be more exact, there is no use of Node-RED and no HA automation restarts, but the issue is still present.
Problem-relevant configuration.yaml entries (fill out even if it seems unimportant):
Tons of errors in the error log with no clear reason. I can have more than 15,000 such errors a day.
Traceback (if applicable):
ERROR (MainThread) [homeassistant.components.websocket_api.http.connection.xxxx] Client exceeded max pending messages [2]: 512
Additional information:
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 14
- Comments: 76 (16 by maintainers)
Found what was causing these errors for me and can reproduce 100%
Standard browser refresh clears the error loop.
So I guess I just have to remember to refresh my browser after editing a card.
Possibly a polymer/frontend issue with how HA attempts to refresh Firefox.
I may have a suggestion; I can't prove it will work for anybody, or whether my idea still needs a tweak. I was getting "Client exceeded max pending messages" because I have too many history graphs. Graphs have to query the database for entity_id, state, and last_updated. I'm using MariaDB, so I used DBeaver to connect to HA and took a peek at the indexes on the "states" table. The closest index I found that could serve history graphs is named "ix_states_entity_id_last_updated", with columns (entity_id, last_updated) only. So an SQL query retrieving the data to draw a history graph (column state on the y axis, column last_updated on the x axis, for a specific entity_id) would have to use the index and then read the table by row ID to retrieve the state data. The index is efficient and the table access is efficient, but there is extra table disk I/O to get all the data needed for one data point in a history graph. With enough history graphs there will be concurrent SQL queries contending on the table disk I/O.
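For illustration, a simplified sketch of the kind of query a history graph would issue (the entity_id and time window are made up, and the actual recorder query is more involved):

```sql
-- Hypothetical example; the real recorder query has more columns and joins.
SELECT entity_id, state, last_updated
  FROM states
 WHERE entity_id = 'sensor.example_temperature'
   AND last_updated >= NOW() - INTERVAL 1 DAY
 ORDER BY last_updated;
```

With only (entity_id, last_updated) indexed, the `state` column forces a table lookup per row; that per-row lookup is the extra disk I/O described above.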
My hunch is that, for my case, this is where the bottleneck lies on my hardware (Raspberry Pi 3B+ with a solid state drive). As an experiment I added a new index, "ix_states_entity_id_last_updated2", to optimize the query slightly so it wouldn't have to access the table at all to get the needed data. The index storage size goes up a little, but everything the query needs is then in the index, with less disk I/O.
Here's the DDL to add the index for MariaDB; if you don't think the index can be unique for the 3-column combination, remove the UNIQUE keyword.
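The DDL itself appears to have been lost from the thread; a sketch reconstructed from the description above (the index name, table, and column list are as described, but verify against your own schema first) might be:

```sql
-- Reconstructed sketch, not the original poster's exact statement.
-- Drop the UNIQUE keyword if (entity_id, last_updated, state) is not unique.
CREATE UNIQUE INDEX ix_states_entity_id_last_updated2
    ON states (entity_id, last_updated, state);
```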
Here's the DDL to drop the index for MariaDB, after you're done with the experiment.
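Again the statement was lost from the thread; a sketch consistent with the index name used above would be:

```sql
-- Reconstructed sketch: remove the experimental index once done.
DROP INDEX ix_states_entity_id_last_updated2 ON states;
```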
I'm still watching my logs for the max pending messages error, and I haven't seen it lately. That could be because of my experiment, or because HA 2021.11.5 helped on the solid-state-drive hardware I use; it's also very likely my new index is ignored completely. I'm still keeping an eye on my logs. In an Oracle database, with enough concurrent SQL queries, such an index would have made the optimizer use it and reduced contention on the table; in Oracle, index data is kept in cache better than table data. I'm not sure whether MariaDB behaves similarly to Oracle for this index tweak.
If the experiment is successful, then the index "ix_states_entity_id_last_updated" should be replaced with the new definition rather than adding another index; we shouldn't add an index unless there is a need, since the Raspberry Pi 3B+ has only so much RAM to cache indexes.
The above DDL would then change accordingly, assuming MariaDB allows CREATE OR REPLACE from an index that is not unique to one that is unique; if it doesn't, you would have to drop the index first and then add it.
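A sketch of the replacement statement, following the same reconstruction as above (MariaDB's CREATE INDEX syntax accepts OR REPLACE in recent versions, but check your server version):

```sql
-- Reconstructed sketch: swap the stock index for the covering version.
-- If your MariaDB version rejects CREATE OR REPLACE INDEX, drop the
-- old index first, then create the new one.
CREATE OR REPLACE UNIQUE INDEX ix_states_entity_id_last_updated
    ON states (entity_id, last_updated, state);
```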
Hi @pierre2113, I use ZWave USB in HA integration, however, it wasn’t my ZWave integration after all. I think I have managed to track my issue down to a script which I use to control one of my wall tablets running Fully Kiosk. I am now in the process of working out why it is causing this especially since it is identical to three others doing the same thing.
https://github.com/aio-libs/aiohttp/pull/4993 should help this
I think it is related to Chrome tabs that remain open for a long time. More details below:
Hass and the frontend probably don't handle the socket disconnection/reconnection correctly. Hass should stop sending data through the websocket once messages start pending, and meanwhile the frontend should attempt a reconnection (like when Hass is restarted but the UI is kept open).
Hope it helps