operating-system: Device permission update broken for some devices

RaspberryMatic users report that Homematic devices cannot be passed through to the RaspberryMatic add-on (see https://github.com/jens-maus/RaspberryMatic/issues/1942).

The supervisor reports the following error:

22-09-15 13:34:43 ERROR (MainThread) [supervisor.docker.addon] Can't set cgroup permission on the host for addon_de838cd8_raspberrymatic
22-09-15 13:34:43 ERROR (MainThread) [asyncio] Task exception was never retrieved
future: <Task finished name='Task-14879' coro=<DockerAddon._hardware_events() done, defined at /usr/src/supervisor/supervisor/jobs/decorator.py:85> exception=DockerError("Can't set cgroup permission on the host for addon_de838cd8_raspberrymatic")>
Traceback (most recent call last):
  File "/usr/src/supervisor/supervisor/docker/addon.py", line 720, in _hardware_events
    await self.sys_dbus.agent.cgroup.add_devices_allowed(
  File "/usr/src/supervisor/supervisor/dbus/agent/cgroup.py", line 19, in add_devices_allowed
    await self.dbus.CGroup.AddDevicesAllowed(container_id, permission)
  File "/usr/src/supervisor/supervisor/utils/dbus.py", line 174, in call_dbus
    raise DBusFatalError(reply.body[0])
supervisor.exceptions.DBusFatalError: Error calling runc for '87542225ef424bd38c4d49ff3db5a8c46b277fdee1c3efd1a6aee5edd57d9d80': exit status 1, output time="2022-09-15T11:34:43Z" level=warning msg="could not find device group for '99/204' in /proc/devices -- temporarily ignoring rule: {99 204 -1 rwm true}"
time="2022-09-15T11:34:43Z" level=info msg="found more than one filter (2) attached to a cgroup -- removing extra filters!"
time="2022-09-15T11:34:43Z" level=info msg="removing old filter 0 from cgroup" id=139 name= run_count=0 runtime=0s tag=f231b56d360e591c type=CGroupDevice
time="2022-09-15T11:34:43Z" level=info msg="removing old filter 1 from cgroup" id=140 name= run_count=0 runtime=0s tag=be0b8d7ca6afd4d8 type=CGroupDevice
time="2022-09-15T11:34:43Z" level=warning msg="could not find device group for '99/204' in /proc/devices -- temporarily ignoring rule: {99 204 -1 rwm true}"
time="2022-09-15T11:34:43Z" level=info msg="found more than one filter (2) attached to a cgroup -- removing extra filters!"
time="2022-09-15T11:34:43Z" level=info msg="removing old filter 0 from cgroup" id=144 name= run_count=0 runtime=0s tag=839fa1b30f0f67c3 type=CGroupDevice
time="2022-09-15T11:34:43Z" level=info msg="removing old filter 1 from cgroup" id=147 name= run_count=0 runtime=0s tag=be0b8d7ca6afd4d8 type=CGroupDevice
time="2022-09-15T11:34:43Z" level=error msg="failed to call BPF_PROG_DETACH (BPF_CGROUP_DEVICE) on old filter program: can't detach program: no such file or directory"
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/src/supervisor/supervisor/jobs/decorator.py", line 141, in wrapper
    raise err
  File "/usr/src/supervisor/supervisor/jobs/decorator.py", line 139, in wrapper
    return await self._method(*args, **kwargs)
  File "/usr/src/supervisor/supervisor/docker/addon.py", line 727, in _hardware_events
    raise DockerError(
supervisor.exceptions.DockerError: Can't set cgroup permission on the host for addon_de838cd8_raspberrymatic
22-09-15 13:34:43 ERROR (MainThread) [supervisor.docker.addon] Can't set cgroup permission on the host for addon_de838cd8_raspberrymatic

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 23 (20 by maintainers)

Most upvoted comments

Sorry, I really don’t know. Usually I restart after updating, but I’m new to HA and restarting is really not that automated for me yet 😦 So it might be the case that I forgot to restart.

Further investigation actually uncovered that the root problem is this error:

time="2022-09-15T11:34:43Z" level=error msg="failed to call BPF_PROG_DETACH (BPF_CGROUP_DEVICE) on old filter program: can't detach program: no such file or directory"

This causes runc to fail, which probably means the device rules are not applied correctly. The likely cause is parallel calls to OS Agent/runc when multiple devices are added at once. The following fix prevents Supervisor from making concurrent calls to OS Agent/runc: https://github.com/home-assistant/supervisor/pull/3868.
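
To illustrate the idea of not calling OS Agent/runc twice at once, here is a minimal sketch that serializes concurrent calls with an asyncio.Lock. It is not the code from the linked pull request. FakeCGroupAgent, SerializedCGroup, and the permission strings are hypothetical names and values made up for this example; only the method name add_devices_allowed is taken from the traceback above.

```python
import asyncio


class FakeCGroupAgent:
    """Hypothetical stand-in for the OS Agent call; records overlapping calls."""

    def __init__(self):
        self.active = 0       # calls currently in flight
        self.max_active = 0   # highest number of simultaneous calls seen
        self.calls = []

    async def add_devices_allowed(self, container_id, permission):
        self.active += 1
        self.max_active = max(self.max_active, self.active)
        await asyncio.sleep(0.01)  # simulate runc doing its work
        self.calls.append((container_id, permission))
        self.active -= 1


class SerializedCGroup:
    """Wrapper (assumed design) that lets only one call run at a time."""

    def __init__(self, agent):
        self._agent = agent
        self._lock = asyncio.Lock()

    async def add_devices_allowed(self, container_id, permission):
        # A second caller waits here until the first call has finished.
        async with self._lock:
            await self._agent.add_devices_allowed(container_id, permission)


async def demo():
    agent = FakeCGroupAgent()
    cgroup = SerializedCGroup(agent)
    # Two hardware events arriving at the same moment (illustrative values).
    await asyncio.gather(
        cgroup.add_devices_allowed("container", "c 99:204 rwm"),
        cgroup.add_devices_allowed("container", "c 188:0 rwm"),
    )
    return agent


if __name__ == "__main__":
    result = asyncio.run(demo())
    print(result.max_active, len(result.calls))
```

Without the lock, both calls in demo() would be in flight at the same time, which is the situation suspected of triggering the BPF_PROG_DETACH failure.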