core: Automations referencing non-existing devices do not work after installing 2023.11.0b1

The problem

After upgrading to 2023.11.0b1 from b0, multiple device IDs changed. This caused multiple automations to break. Devices that had their IDs changed were from both Zigbee2MQTT and ZwaveJS-UI. There may be devices from other integrations as well, but those are the only 2 that impacted me via automations.

What version of Home Assistant Core has the issue?

2023.11.0b1

What was the last working version of Home Assistant Core?

2023.11.0b0

What type of installation are you running?

Home Assistant OS

Integration causing the issue

Zigbee2MQTT, ZwaveJS-UI

Link to integration documentation on our website

No response

Diagnostics information

No response

Example YAML snippet

No response

Anything in the logs that might be useful for us?

2023-10-27 18:02:34.118 ERROR (MainThread) [homeassistant.components.automation] Automation with alias 'Worker - Bedroom Low Light' failed to setup actions and has been disabled: Unknown device 'f559286722b67dfb86349bb409f3806e'
2023-10-27 18:02:34.134 ERROR (MainThread) [homeassistant.components.automation] Automation with alias 'Presence - Arrived Home' failed to setup actions and has been disabled: Unknown device '491487c43244c1c27f2e48a961bad702'
2023-10-27 18:02:34.135 ERROR (MainThread) [homeassistant.components.automation] Automation with alias 'Mode Trigger - Night' failed to setup actions and has been disabled: Unknown device '491487c43244c1c27f2e48a961bad702'
2023-10-27 18:02:34.182 ERROR (MainThread) [homeassistant.components.automation] Automation with alias 'Worker - Master Closet Light v2' failed to setup actions and has been disabled: Unknown device 'f61ed5009889ffbb96884c8d9f4a60f9'
2023-10-27 18:02:34.827 ERROR (MainThread) [homeassistant.components.automation] Automation with alias 'Lock - Do stuff when code used' failed to setup actions and has been disabled: Unknown device '02510f7e01be61614d6e50dcf497ce4c'
2023-10-27 18:02:35.737 ERROR (MainThread) [homeassistant.components.automation] Automation with alias 'Lighting - Office Button' failed to setup actions and has been disabled: Unknown device '3271e90d5dc11f6f439840dd4ac8af39'
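
With many affected automations, it can help to group the error lines above by device ID, so that each missing device is looked up once. The following is a minimal sketch, assuming the log wording matches the excerpt above exactly; `group_by_device` and `SAMPLE` are hypothetical names used only for illustration.

```python
import re
from collections import defaultdict

# Pattern built from the wording of the error lines quoted in this report.
LOG_PATTERN = re.compile(
    r"Automation with alias '(?P<alias>.+?)' failed to setup \w+ "
    r"and has been disabled: Unknown device '(?P<device_id>[0-9a-f]+)'"
)


def group_by_device(log_text):
    """Map each unknown device ID to the automation aliases that reference it."""
    affected = defaultdict(list)
    for line in log_text.splitlines():
        match = LOG_PATTERN.search(line)
        if match:
            affected[match["device_id"]].append(match["alias"])
    return dict(affected)


# Two lines copied from the log excerpt above.
SAMPLE = """\
2023-10-27 18:02:34.134 ERROR (MainThread) [homeassistant.components.automation] Automation with alias 'Presence - Arrived Home' failed to setup actions and has been disabled: Unknown device '491487c43244c1c27f2e48a961bad702'
2023-10-27 18:02:34.135 ERROR (MainThread) [homeassistant.components.automation] Automation with alias 'Mode Trigger - Night' failed to setup actions and has been disabled: Unknown device '491487c43244c1c27f2e48a961bad702'
"""

for device_id, aliases in group_by_device(SAMPLE).items():
    print(device_id, "->", ", ".join(aliases))
```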

Additional information

No response

About this issue

  • State: open
  • Created 8 months ago
  • Reactions: 25
  • Comments: 73 (21 by maintainers)

Most upvoted comments

I have over a hundred automations affected. I am slowly rebuilding everything, but keep shaking my head over the fact that even if you disable the problem device or entity, the automation remains “unavailable”. This just doesn’t make any sense to me. For example, if you have a device that dies, it kills your automation, which could have a dozen or so other entries in it that do work. You should just be able to disable the problem entry and the automation should continue to run…

I think underlying this issue is a fundamental problem: programmers want these scrambled IDs because they are more logical for them, while users want human-readable, identifiable IDs that make sense to them and let them see at a glance what’s going on. Now the question is, who is Home Assistant for? I for one am not a machine or a programming scientist… that’s why I have still not updated to anything 2022.11, as it broke 60% of my automations. As I understand it, this issue came to light due to a change from device ID to entity ID, but the problem for me is that any broken automation in 2022.11 only gives me that unreadable machine-code ID, so how the heck am I to know which entity I’m supposed to enter there? I won’t go and manually write down the device IDs for 50 automations in 2022.10 to be able to find them in 2022.11; I might as well start all over. I simply don’t get why these scrambled IDs have to be there at all, when it’s enough to make sure the human-readable IDs are unique. You tell me which of the pictures below makes sense to a human. I for one can’t make any sense of those YAML IDs at all; why do they need to be there?

[Two screenshots attached]

I truly hope someone is looking into this. I have too many automations to fix. I’m still bugged by the fact that even if you disable the problem device or entity, the automation remains “unavailable”. This just doesn’t make any sense to me. For example, if you have a device that dies, it kills your automation, which could have a dozen or so other entries in it that do work. You should just be able to disable the problem entry and the automation should continue to run…

For example if you have a device that dies, then it kills your automation, which could have a dozen or so other entries in it that do work. You should just be able to disable the problem entry and the automation should continue to run…

I agree with this as well. I’m all for being alerted to a problem with my server (something HA already doesn’t do nearly enough). But, automations (everything in Home Assistant, really) should continue to run as best as possible while any fault condition exists to avoid disrupting the users.

Failing silently is bad. Shutting down everything over faults with part of the system is also bad.

The devs do a great job on this project and I’m thankful for it every day. I’d like to point out a scenario or two that suggests more critical thinking when making a change like this.

In most cases this change caused people to lose automation on light switches. No big deal. Some of us use HA for a few more critical things like monitoring for water leaks or maintaining other equipment, or locking doors at night.

In my case I upgraded yesterday (I use docker-compose) to grab the latest. Like always, I go through after the upgrade and verify my switches work in the UI on the app. Everything functioned. I went to bed, with zero indication that 2/3 of my automations didn’t work. No, I didn’t go to the automations tab or I would have seen them all in red, but there was no notification and since the UI functioned fine and controlled all my devices, including the pool pump, I didn’t worry about it. Upgrade success!

One of my automations triggers the pool pump to come on if it drops below 34F to keep it from freezing. Last night we were below freezing.

No, I didn’t burst a pipe, and shame on me for trusting an OSS project as if it were commercial grade anyway.

All but one or two of my configs were created using the UI, set once, and left alone for months or years now. To suddenly have 2/3 of my automations fail because device ID’s set with the UI suddenly change is disappointing at best. The lack of any feedback that there was a problem in the notifications screen was a total failure in QA and UI design. I’m sure one or more devs realized this was going to happen to some degree and brushed it off as unimportant and a one-time glitch.

Considering the system disables the automations as “unavailable” when that happens or at startup, adding an alert notification when it does something like that could save a lot of people major headaches, and at least let others realize there are problems right away instead of finding out hours or days later when something fails to occur.

I’m sure that, like me, there are people who don’t upgrade immediately following a new rollout. Adding that notification alone when an automation becomes unavailable would save a lot of people headaches in the future as they finally upgrade past this release, and would probably come in handy for some future changes or problems as well.

So ultimately, at least for my case, this doesn’t look like it was caused by this beta - but this beta is bringing the issue to the surface. Which is a good thing, but is probably worth calling out as there will likely be more people that discover they have broken automations after this version.

For me, the device IDs remained unchanged which makes sense, as 2023.11 does not change device IDs, just adds some extra checks for device IDs.

@nohn thanks for this, this is exactly what is intended by the change mentioned in #102937 (comment). Those automations should already have been broken before 2023.11, but with 2023.11 you get notified about them.

These automations for me were not broken and worked just fine right up until the update.

able to go back to 2023.10.3 and the issue has gone away

@Campagne8758 further, could you please verify whether the automations that were supposedly broken in 2023.11 are really working again after the rollback?

Rolling back to 2023.10.5 made the automations work again for me without making any changes to them whatsoever.

Maybe deleting the boolean helper and creating a new one with the same entity name will solve it 🤔

That could indeed work, but I have tons of devices with this problem. I really do not want to recreate them all 😉 It’s a pity that some HA changes in a past version led to this user-unfriendly behavior 😦 I still hope for a “fix”.

Yes, well, it is not ideal. No idea if a fix will come, because a fix would just mean going back to ignoring “faults” in automations. My view is still that it is better to fix the automations now one way or another, because even if they fix it by ignoring the faulty automations, you never know what happens in the future… I would say better safe than sorry 😉

What “faults” though? I had automations working fine that are suddenly “broken” in 2023.11. The update did something to the device IDs that causes them to fail; there were no faults in the automations at all.

So enhanced validations in #102766 seems to be the culprit, but it just raises errors and logs. Is this something that would be a good candidate for the ‘Repairs’ system? As in, ‘we found x won’t work because y no longer exists: click here to delete y and fix the issue’. I’m interested in digging into this as a possible solution.

Better feedback is absolutely required, as all I could see in the UI is “automation disabled” without explanation. The triggers that were the cause also appeared to be fine until expanded. The header of the triggers maintained the text explanation and only upon expanding them was the device field revealed to be blank.

[Two images attached]

Better feedback to the user on the cause is absolutely required. But, ideally, the update just wouldn’t break the automations at all.

So I created automations via the GUI, via device. Now it’s wrong, it’s my fault, and I have to rebuild all my automations??

I also created my automations with the GUI. If using a device trigger is “incorrect”, the GUI should not let us do it.

Maybe deleting the boolean helper and creating a new one with the same entity name will solve it 🤔

That could indeed work, but I have tons of devices with this problem. I really do not want to recreate them all 😉 It’s a pity that some HA changes in a past version led to this user-unfriendly behavior 😦 I still hope for a “fix”.

Yes, well, it is not ideal. No idea if a fix will come, because a fix would just mean going back to ignoring “faults” in automations. My view is still that it is better to fix the automations now one way or another, because even if they fix it by ignoring the faulty automations, you never know what happens in the future… I would say better safe than sorry 😉

Or maybe developers should listen to users and return human-readable names? Then, I think, there would be fewer problems? And in its current form, it feels like we are being deliberately forced into the UI, so that we do all the manipulations through the interface and don’t go into .yaml

For those of you asking if someone is looking into this: yes, I am still looking into it, but this would be my first contribution and, like most developers here, I am working on it entirely in my free time. Since it is my first time even looking at the codebase, there are extra steps I have to go through. I would love to collaborate with someone more experienced, but if I’m indeed the only one, then here’s my progress:

My goal is to turn this into an actionable repair issue, where you can just click a button and the repair system would remove the reference to the non-existent device so your automations/scripts no longer fail. (BTW this affects scripts as well, maybe scenes too)

Progress:

  • Create a developer environment: Completed
  • Learn the basic structure of the code base: Completed
  • Find out where the issue in Automations is: Completed
  • Find a good point in the Automations code to create a repair issue: Completed
  • Learn how the repair system actually works in depth: In Progress
  • Use the repair system to remove the non-existent entities: Not Started Yet
  • Implement a rough fix, and test: Not Started Yet
  • Find out where else this is an issue (confirmed in Scripts): In Progress
  • Implement the fix in the other areas too: Not Started Yet
  • Learn the project code-style and review contribution guidelines: Not Started Yet
  • Bring my code up to the standard of the project: Not Started Yet
  • Submit a PR: Not Started Yet
  • Implement suggested changes in the PR process: Not Started Yet
  • …then it’s in the hands of the maintainers
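
The intended repair action (dropping references to devices that no longer exist so the rest of the automation can load) can be sketched in plain Python. This is only an illustration under stated assumptions: it does not use Home Assistant's actual Repairs API, the automation is a simplified dict, and `prune_unknown_devices` is a hypothetical helper name.

```python
def prune_unknown_devices(automation, known_device_ids):
    """Return (cleaned_copy, removed) with unknown-device entries dropped.

    Only top-level entries in trigger/condition/action are inspected;
    nested structures are left untouched in this sketch.
    """
    cleaned = dict(automation)
    removed = []
    for section in ("trigger", "condition", "action"):
        if section not in automation:
            continue
        kept = []
        for entry in automation[section]:
            device_id = entry.get("device_id")
            if device_id is not None and device_id not in known_device_ids:
                removed.append((section, device_id))
            else:
                kept.append(entry)
        cleaned[section] = kept
    return cleaned, removed


# Hypothetical automation: one action points at a device that is gone.
AUTOMATION = {
    "alias": "Worker - Bedroom Low Light",
    "trigger": [{"platform": "state", "entity_id": "binary_sensor.bedroom_motion"}],
    "action": [
        {"device_id": "f559286722b67dfb86349bb409f3806e", "type": "turn_on"},
        {"service": "light.turn_on", "target": {"entity_id": "light.bedroom"}},
    ],
}

cleaned, removed = prune_unknown_devices(AUTOMATION, known_device_ids={"aaaa1111"})
print(removed)
print(len(cleaned["action"]), "action(s) kept")
```

A real repair flow would presumably ask for confirmation before removing anything, since silently dropping an action changes what the automation does.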

For me, all the automations that had “Unknown Device” errors were using the “Device” trigger/action. When editing in GUI, it was just missing the device, as if I just added a new trigger/action/condition for a device. I simply just selected the device I needed and saved. I had about 10 to fix manually.

I do agree with you, but, the parsing mechanism could add a dedicated way to manage the “#” commented lines, as this special character is used by, and thus known by, HA parser.

This is more of a library limitation than a HA issue. HA uses PyYAML (ex: import yaml). There’s an issue that’s been open since 2017 requesting comment handling. https://github.com/yaml/pyyaml/issues/90

Just to clarify: is the new way the automations UI is meant to work to write these super-cryptic, unreadable IDs instead of the previous device ID that was human readable? I feel this would be a major setback and kind of the opposite of what people have been asking for, which is to make things easier… I’m still running my restored 10.5 until, I guess, there is a definite conclusion or a repair tool. I really wouldn’t like to have all these cryptic IDs in my automations, making them pretty much unintelligible.

I have downgraded to 2023.10 and everything is fine. Now I’m looking forward to a fix in order to upgrade. Any news about this issue in 2023.11.1? I don’t see any mention of a solution in this new release.

Apparently downgrading to 2023.10 is possible and doesn’t have unwanted side effects. How can I identify what will be broken by upgrading to 2023.11 before upgrading, what exactly is the problem, and how do I fix it?

If I understand the problem correctly, replacing all device IDs with the corresponding entity IDs will fix it. Can someone please confirm?

Same issue with my Zigbee devices. I have to modify my automations.yaml and replace all the human-readable device IDs with bloody unreadable UIDs. A real nightmare; this also makes the script totally unreadable for later support.

Even more complex is retrieving which UID was hidden behind the human-readable device ID 😢

Dom

Could you give me a hint where to find or set a human-readable ID on a device?

Other topic : Unfortunately, using the Web Interface to add (recreate) automations, removes all the comments and layout (extra empty lines to separate the different ids …)

This is by design and in the nature of how it works in the background. It parses the YAML into a data structure and builds the UI representation from this data structure. This data structure does not contain any comments. After you have made any changes in the UI, this data structure is converted back into valid YAML and stored. Be aware that this is not a limitation of HA, but a common mechanism: parsing any kind of “configuration” file (e.g. in YAML, INI or whatever style) never keeps the comments, because they are not relevant for the resulting data structure.
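
The parse, edit, write-back round trip described here can be reproduced with Python's standard-library `configparser` (INI rather than YAML, chosen only so that the sketch needs no third-party package). The file content and section name are made up for illustration; Home Assistant's own YAML handling is not shown.

```python
import configparser
import io

# Hypothetical hand-written config with a comment the author wants to keep.
SOURCE = """\
# Hallway settings, tuned by hand
[hallway]
brightness = 40
"""

# Step 1: parse the text into a data structure (comments are skipped here).
parser = configparser.ConfigParser()
parser.read_string(SOURCE)

# Step 2: change a value, as a UI editor would.
parser["hallway"]["brightness"] = "60"

# Step 3: serialize the data structure back to text.
buffer = io.StringIO()
parser.write(buffer)
round_tripped = buffer.getvalue()

print(round_tripped)  # the "# Hallway settings" line is no longer present
```

Keeping comments needs a parser that stores them alongside the data, which is a deliberate design choice rather than something a plain load and dump provides.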

I do agree with you, but, the parsing mechanism could add a dedicated way to manage the “#” commented lines, as this special character is used by, and thus known by, HA parser.

I have to always use a backup of my yaml file, and, if I have to use the GUI (as it is unfortunately required now), I first take a backup of my automations.yaml (to keep all my comments), modify the automation using the GUI, analyze what has been added, restore my automations.yaml and manually add the new lines.

Lot of work, but, this is the only way I found to keep a readable and commented yaml script 😃

Comments are the basics for any developer, aren’t they 😉

Kind regards, Dom

Could you give me a hint where to find or set a human-readable ID on a device?

Other topic : Unfortunately, using the Web Interface to add (recreate) automations, removes all the comments and layout (extra empty lines to separate the different ids …)

This is by design and in the nature of how it works in the background. It parses the YAML into a data structure and builds the UI representation from this data structure. This data structure does not contain any comments. After you have made any changes in the UI, this data structure is converted back into valid YAML and stored. Be aware that this is not a limitation of HA, but a common mechanism: parsing any kind of “configuration” file (e.g. in YAML, INI or whatever style) never keeps the comments, because they are not relevant for the resulting data structure.

AFAIK there was never a human readable device id, but the entity id’s are human readable and not touched here.

Maybe I configured my HA wrongly, but I used human-readable IDs for device IDs as well as for entity IDs 😕

I upgraded to 2023.11.2, and still have to use Human Unreadable UUIDs.

Other topic : Unfortunately, using the Web Interface to add (recreate) automations, removes all the comments and layout (extra empty lines to separate the different ids …)

Enjoy a nice armistice day. Dom

So enhanced validations in #102766 seems to be the culprit, but it just raises errors and logs. Is this something that would be a good candidate for the ‘Repairs’ system? As in, ‘we found x won’t work because y no longer exists: click here to delete y and fix the issue’. I’m interested in digging into this as a possible solution.

Hi. Personally, it took me a very long time to find the device_id and the entity_id (both are required). I edited my automations.yaml with that information, and it works. Finding the UUID is a real nightmare and makes the scripts simply unreadable, but… it works. The only thing for which I have not been able to discover both UUIDs is a group of lights. Dom

In reply to nicx (Monday, 6 November 2023, 09:28):

I tried to fix my automations, but without success. Even if I create a completely new automation with the same “defect” device (in my case a boolean helper), I get the error Message malformed: Unknown device ‘9be0a67d8c258d0681570bfd575a41ed’. How could I fix it? Do I have to delete all affected devices? That is not an option 😦

I replaced all device triggers with state triggers. Which so far works fine.

In my case it’s not the trigger but the action… I enable/disable a boolean helper. And this boolean helper is the problematic device.

For me, adding the device again in the broken automations using the UI (faulty devices leave empty fields) and saving again (sometimes you need to add a bogus trigger/action to get the option to save) did the trick. But I had just a handful of broken automations, because most of my automations use entity states. Only some older automations that I set up in the beginning use device triggers or states. In future I will change these to entity states as well.

I tried to fix my automations, but without success. Even if I create a completely new automation with the same “defect” device (in my case a boolean helper), I get the error Message malformed: Unknown device '9be0a67d8c258d0681570bfd575a41ed'

How could I fix it? Do I have to delete all affected devices? That is not an option 😦

I replaced all device triggers with state triggers. Which so far works fine.

From my perspective, replacing #102766 with a warning in the UI and later reapplying #102766 in an upcoming release would be the way to go. It took me a few hours to fix all my automations and this is something you’d like to plan.

@smarthomefamilyverrips

Because from 2023.11 on, automations are checked for faulty device IDs, and if any error is found the automation is automatically disabled until you repair the error. That is why it stops working… but you already had the faults before 2023.11; HA just did not take any action at that point 😉

I got a little confused here. Do you mean that this behavior is by design, and so will not be fixed? (Then we can close this issue as well, right?)

I did some more digging. I pulled my oldest backup (unfortunately I only keep them for 7 days) and it looks like the “new” device IDs were already present back in core 2023.10.3 based on what I’m seeing in the device registry.

So the change of device IDs happened at some point in the past, but the automations are only now failing due to the PR @mib1185 listed.

So ultimately, at least for my case, this doesn’t look like it was caused by this beta - but this beta is bringing the issue to the surface. Which is a good thing, but is probably worth calling out as there will likely be more people that discover they have broken automations after this version.
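
The device-registry comparison described in this comment can be sketched as a small check: collect the IDs the registry knows about and subtract them from the IDs the automations reference. This sketch assumes the registry is JSON with a `data.devices` list whose items carry an `id` field (the layout commonly seen in `.storage/core.device_registry`, but treat that as an assumption to verify on your own install); the sample registry and the short IDs in it are invented.

```python
import json


def known_device_ids(registry_json):
    """Extract device IDs from registry JSON (assumed layout: data.devices[].id)."""
    data = json.loads(registry_json)
    return {device["id"] for device in data["data"]["devices"]}


def missing_devices(referenced_ids, registry_json):
    """Return referenced device IDs that the registry does not contain, sorted."""
    return sorted(set(referenced_ids) - known_device_ids(registry_json))


# Invented registry content for illustration only.
SAMPLE_REGISTRY = json.dumps(
    {
        "data": {
            "devices": [
                {"id": "aaaa1111", "name": "Office Button"},
                {"id": "bbbb2222", "name": "Front Door Lock"},
            ]
        }
    }
)

# One ID exists in the sample registry, one (taken from the log above) does not.
REFERENCED = ["aaaa1111", "3271e90d5dc11f6f439840dd4ac8af39"]

print(missing_devices(REFERENCED, SAMPLE_REGISTRY))
```

Running such a check against a backup taken before upgrading would show which references were already dangling, as this comment suggests they were.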

It is exactly this. Because of the PR, we are now actually able to see that an automation is not correct, so personally I think we should see this as a plus. Better to just fix the broken automations instead of waiting for a “fix” that still would not make your automations as they should be.

Also, as already mentioned, use entity state triggers instead of device triggers if you want to avoid having to make repairs when device IDs change in the future. 😀

@Campagne8758 could you please verify whether the devices reported as “unknown device” are really back (with this exact ID) after the rollback?

For me, the device IDs remained unchanged which makes sense, as 2023.11 does not change device IDs, just adds some extra checks for device IDs.

I have downgraded to 2023.10 and everything is fine. Now I’m looking forward to a fix in order to upgrade. Any news about this issue in 2023.11.1? I don’t see any mention of a solution in this new release.

I updated this morning so I assume I went from 2023.10.3 to 2023.11.1 since it was released yesterday. I experienced the same issue with automations not working: fortunately I always make a full backup and was able to go back to 2023.10.3 and the issue has gone away. I would say this isn’t fixed in 2023.11.1 then.

Same problem as many others with automations becoming unavailable. Has anybody tried just deleting the broken automations and the recreating them?

We lost the link between “human-readable IDs” (entity_id and device_id). You don’t have a full list of all the unreachable devices, nor a full description of all the automation scripts you made and optimized.

For example, I am not able to create a group of lights and use that group to turn_on and/or turn_off the “lights_group”.

Thank God I was just recreating my home automation; I don’t know how I would have reacted if everything had been under HA control.

One thing is certain: before the next update, I will do a FULL BACKUP, and not only the backup proposed by the update UI.

Where, and by whom, are all the “non-regression tests” done? If those tests had been passed, this version would never have corrupted so many installations.

Same problem as many others with automations becoming unavailable. Has anybody tried just deleting the broken automations and the recreating them?

I also don’t understand why they did this, if I write automation by hand, why are unreadable uids not tedious and where can I get them? Okay, you would have made them for those automations that are created through the ui, but you also ruined everything for me with .yaml! Thank you, this is user friendly, keep it up! (ง’̀-'́)ง

Tuya lightbulbs automations are broken as well

Found this via Google after updating to 11.0b2 (from 10.5). I agree with iridris about adding the callout just so others are aware, but my biggest issue is not so much updating automations as knowing what has changed and everywhere it exists. For example, an automation that uses the device condition or trigger will fail the validation and let me know, so that’s easy enough to find and replace. But automations that are triggered by zwave events also use those device IDs and are not called out. So I’m not entirely sure what my best options are for updating that stuff, short of going through my house and pressing all of my zwave switches to trigger various automations to see if they still work.

Do these missing device ids show up in a log somewhere from the zwave event trigger? Or any other suggestions for the best way to make sure I have everything updated and can sleep easy knowing things will work the next time I try to use them?
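
One way to find device IDs that validation does not call out, such as those buried in an event trigger's `event_data`, is to walk the whole parsed automation and report every `device_id` key at any depth. The sketch below assumes the automation has already been loaded into plain dicts and lists; the event type and the second device ID are illustrative values, and `find_device_ids` is a hypothetical helper.

```python
def find_device_ids(node, path=""):
    """Yield (path, device_id) for every 'device_id' key found at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}" if path else key
            if key == "device_id" and isinstance(value, str):
                yield child_path, value
            else:
                yield from find_device_ids(value, child_path)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_device_ids(item, f"{path}[{index}]")


# Illustrative automation: the trigger references a device only inside event_data.
AUTOMATION = {
    "alias": "Lock - Do stuff when code used",
    "trigger": [
        {
            "platform": "event",
            "event_type": "zwave_js_value_notification",
            "event_data": {"device_id": "02510f7e01be61614d6e50dcf497ce4c"},
        }
    ],
    "action": [{"device_id": "cccc3333", "type": "turn_on"}],
}

for location, device_id in find_device_ids(AUTOMATION):
    print(location, "=", device_id)
```

The resulting list can then be compared against the known device IDs, which avoids having to press every switch in the house to see what still works.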

This happened also for a device under the Hue Integration.