core: Memory Leak in ONVIF

The problem

I believe the ONVIF integration has a memory leak. Not sure if it’s in the core code or module dependencies.

I was experiencing frequent (1-3 times/day) forced restarts of Core by Supervisor, which I was able to trace back to OOM calls on the host vm killing the python process. Tried to narrow down the source as best as I could with help from @bdraco and it looks like we have it narrowed down to the ONVIF integration.

See memory sensor snapshot before and after disabling ONVIF on my main instance: Prod VM Memory Sensor

To further narrow it down, I started a new Home Assistant OS VM (Proxmox) with only the original ONVIF config entries, profiler, and system resources integrations enabled on top of the default_config. It looks like it is showing the same memory leak here.

Dev VM Memory Sensor

180-sec Py-Spy, profiler.memory and profiler.start outputs from test VM attached under Additional Info below. I can attach from main instance too if needed.

What version of Home Assistant Core has the issue?

2022.5.5

What was the last working version of Home Assistant Core?

~2022.2.x

What type of installation are you running?

Home Assistant OS

Integration causing the issue

ONVIF

Link to integration documentation on our website

https://www.home-assistant.io/integrations/onvif

Diagnostics information

config_entry-onvif-cd09f9939eda6d950f4b80d7b1bac047.json.txt config_entry-onvif-ef7f4c3b80516522e1dadc5448f7c8a5.json.txt config_entry-onvif-97da2d4278d518eaa41f437b414cd8d1.json.txt config_entry-onvif-295e1346abdabb24753cd7425e492c04.json.txt

Example YAML snippet

No response

Anything in the logs that might be useful for us?

## System Health

version | core-2022.5.5
-- | --
installation_type | Home Assistant OS
dev | false
hassio | true
docker | true
user | root
virtualenv | false
python_version | 3.9.9
os_name | Linux
os_version | 5.15.41
arch | x86_64
timezone | America/Chicago

<details><summary>Home Assistant Cloud</summary>

logged_in | false
-- | --
can_reach_cert_server | ok
can_reach_cloud_auth | ok
can_reach_cloud | ok

</details>

<details><summary>Home Assistant Supervisor</summary>

host_os | Home Assistant OS 8.1
-- | --
update_channel | stable
supervisor_version | supervisor-2022.05.3
docker_version | 20.10.14
disk_total | 30.8 GB
disk_used | 3.2 GB
healthy | true
supported | true
board | ova
supervisor_api | ok
version_api | ok
installed_addons | Studio Code Server (5.0.4), Samba share (9.6.1)

</details>

<details><summary>Dashboards</summary>

dashboards | 1
-- | --
resources | 0
mode | auto-gen

</details>

Additional information

Py-Spy and Profiler - Test Instance.zip Py-Spy and Profiler - Main Instance.zip

Log File from Test VM prior to crash, profiler.start_log_objects running and logging as follows:

logger:
  default: info
  logs:
    homeassistant.components.onvif: debug
    onvif: debug
    wsdiscovery: debug
    zeep: debug

home-assistant.log.1.txt

Potential historical issues related:

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 28 (25 by maintainers)

Most upvoted comments

LGTM:

image

@shbatm - That makes sense to me. Sorry for the delay, but it’s been a long weekend. v1.2.1 of onvif-zeep-async is now up on PyPi with the verify=False change. Would you mind re-doing your Firewall rules and testing with that version installed? Assuming that fixes the leak (although not the underlying problem) I’ll try to get a patch in for the 2022.06 release.