core: MemoryError and crash of python and homeassistant docker container when using kef integration

The problem

The python3 process running homeassistant died. The “homeassistant” container crashed.

Looking at the home-assistant log (/mnt/data/supervisor/homeassistant/home-assistant.log from HassOS) the final entry is at midnight local time (15:00 UTC) and this is reflected in the last-modified timestamp of the file.

2020-06-03 00:00:18 WARNING (zeroconf-Engine-1848242208) [zeroconf] Exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/zeroconf/__init__.py", line 1292, in handle_read
MemoryError

Even though the MemoryError is written here, and the process doesn’t fully die until an hour later, the high memory consumption and CPU usage spikes about an hour prior.

At around 13:55:11 UTC is where memory and CPU utilization begins to climb.

With some debug logging enabled, I also see that even “DEBUG” log entries from homeassistant.core cease. Final DEBUG entry is at 22:55:01 local time (13:55:01 UTC) just prior to . The debug entries are only regular sensor update information, “Bus:Handling <Event state_changed[L]> …”.

Over the following hour, for “python3 -m homeassistant --config /config” the CPU is pegged at 100% and memory utilization climbs from a typical mere 335 MB VIRT (200 MB RES) to 1169 MB VIRT (~700 MB RES) in just 5 minutes (13:59:52 UTC). CPU utilization hangs back a bit (possibly because the SWAP isn’t being hammered.)

It kicks up again at 14:05:27 UTC, with 100% CPU and gradually climbing memory usage. It plateaus again at 14:06:33 UTC with 1384 MB VIRT (740 MB RES).

This remains stable until 14:19:12 UTC, going up further to 1652 MB VIRT by 14:20:34 UTC. Rinse and repeat at 14:37:15 UTC, climbing to 1987 MB VIRT by 14:39:02 UTC. 15:00:02 UTC we climb again, to 2029 MB VIRT by 15:00:20 UTC. This is around the time that MemoryError happens. The python3 process lives for another hour at the same memory utilization and approximately 15% CPU utilization on average. Then it dies and goes away.

https://community.home-assistant.io/t/memory-exhausted-by-python3-process-process-container-crash/201011/2

Environment

Home Assistant Core release with the issue: I’ve had similar memory exhaustion issues with home assistant since first installing. 0.109, and several 0.110.x. Lack of information in logs makes it impossible to determine the cause. Most recently 0.110.4, with HassOS 3.13.
Last working Home Assistant Core release (if known): N/A
Operating environment (Home Assistant/Supervised/Docker/venv): Raspberry Pi 3, HassOS based installation from recommended 32-bit image
Integration causing this issue: N/A
Link to integration documentation on our website: N/A

Problem-relevant `configuration.yaml`

First happened with default configuration.yaml, only loaded my Google Home units (4x Home Mini, 1x Nest Mini, 1x Chromecast, 1x Chromecast Audio, 1x JBL speaker) and Philips Hue bridge with one light. Since then I’ve added more, but with the lack of logs I cannot say if it’s related to any specific configuration or the base Home Assistant core system.

Traceback/Error logs

DEBUG was active on homeassistant.core and it stops producing any logs about an hour prior to MemoryError being output into the home-assistant.log. See the detailed flow of events above. The memory utilization of python starts to grow at the same time as the homeassistant.core DEBUG logging stops.

MemoryError

Additional information

I’ve tried to collect as many logs as possible but still cannot see what is triggering this runaway memory usage which results in a crash.

About this issue

Original URL
State: closed
Created 4 years ago
Reactions: 1
Comments: 30 (27 by maintainers)

Most upvoted comments

That was the old implementation by the way. I just don’t understand how this could lead to the errors.

The rationale behind it code is that every time after communicating with the device, that the connection is broken off after two seconds. I do this because when there is a connection, one cannot use the KEF app for example (also reconnecting is cheap).

basnijholt on Jun 9, 2020

I am not sure why the memory error or high CPU usage happens.

Before aiokef did a very cheap check 2 times per second to see if the connection with the speaker should be closed. When it was actually closed, the check continued. Now it schedules a task to close the connection at a specific time and extends that time if a new operation is performed.

Again, I am not sure why this would actually be a problem, but since this resulted in many calls I decided to fix it (which seems to work).

basnijholt on Jun 9, 2020

I’ve had py-spy recording while everything froze up. This is the output, where I see that aiokef (used in the kef integration and my own creation 😅) mostly occupied the event-loop. I’ve fixed this in https://github.com/home-assistant/core/pull/36574.

Since I’ve made that change, I have not seen the problem anymore. I’ve also updated to core==0.111.0b4.

@logan893, do you use the kef integration by any chance?

basnijholt on Jun 9, 2020

core: MemoryError and crash of python and homeassistant docker container when using kef integration

The problem

Environment

Problem-relevant configuration.yaml

Traceback/Error logs

Additional information

About this issue

Most upvoted comments

Problem-relevant `configuration.yaml`