core: Crashing every day

Home Assistant crashing every day

The problem

Hass.io keeps crashing every day; I can see it on my router. I have to disconnect the power to boot it back up. It has only started doing this since the last few updates.

How can I get the logs to find the issue?

Environment

  • Home Assistant Core release with the issue:
  • Last working Home Assistant Core release (if known):
  • Operating environment (OS/Container/Supervised/Core):
  • Integration causing this issue:
  • Link to integration documentation on our website:

Problem-relevant configuration.yaml


Traceback/Error logs


Additional information

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 143 (15 by maintainers)

Most upvoted comments

@pataar I switched to MariaDB early on in my testing to see if it made any difference, but it didn’t help. It still crashed several times a day.

There’s another thread of people experiencing the same issue here: https://github.com/home-assistant/operating-system/issues/1232. It seems many people are experiencing the same or a similar issue on RPIs. There are many disjointed records across all these different threads. Even this thread has been closed without any real solution provided, and yet people are continuing to contribute.

I think HA is an awesome product and the result of a lot of hard work, by a lot of people, over a lot of years. It is a shame to see its reputation brought down by this issue, especially as an RPI is a cheap and easy base hardware for home automation.

I feel that if this issue is ever going to be resolved adequately, there needs to be a coordinated approach to defining the problem and capturing the problem details in a central location. There is a lot of information scattered amongst these various threads, but it is hard to see the big picture when it is so disjointed. Some advice from others as to the best way to collate this data would be great. The spreadsheet @HumanSkunk has linked is a start. So far the issue seems to be:

  • with RPIs 3 and 4
  • running OS 5.5 and above
  • SD or SSD doesn’t seem to matter
  • It does not appear to be a power problem
  • 64 & 32 Bit are both affected …

There also needs to be someone, or a group, on the development team who is willing to work with us users to try to resolve this issue and provide possible solutions or fixes to try. Without this developer buy-in we can’t make any progress.

I am more than willing to contribute in any way I can, and have already done some of this earlier in this thread.

@muzzak123 following your idea, I reinstalled 5.5 from scratch using https://github.com/home-assistant/operating-system/releases/tag/5.5

So far, running for 12 hours without crashes…

Hi @CarlosGS, I agree. From everything I have seen on my system, this particular bug is not making it to the disk logs because of the crash. I have rolled my system back to HASSOS 5.4 and it has been running for over 34 hrs without missing a beat. From reading the comments here https://github.com/home-assistant/operating-system/issues/1119 and from my own experience, I’m convinced the problem I’m having was introduced by changes made to HASSOS since 5.4. Instead of trying to “catch the problem” in phantom logs that may or may not exist, may I suggest a full review of the changes made since HASSOS 5.4, focussing in particular on those that may affect an RPI?

Oh, and @Toukite, you can access an RPI over SSH using the PuTTY program, without an HDMI screen, but obviously with another computer on the network. PuTTY can also log all activity, so the full logs are captured. Although I used this process many times, I never had any joy capturing anything meaningful at the time of the crash.

Same here. RPi 4. Operating System: Home Assistant OS 5.9, Home Assistant 2020.12.1. Power supply is a genuine RPi 4 plug pack. Ever since the last upgrade it has been unstable and keeps crashing. I thought it might be a faulty SD card so I moved to an SSD, but that didn’t fix the issue. When it crashes I can’t get to it from the network. It won’t even ping. The only fix seems to be to pull the plug and restart, and then it all works for a while.

OK - in case anyone wants to try this, you just need to roll back HASSOS to 5.4. Everything else remains the same. You can do this by logging on as root and issuing the command: ha os update --version 5.4

Hi @CarlosGS. Thanks for your reply and I appreciate your time and help with this. Re Addons, I tried uninstalling many of them to see if it stopped the crashing issue, but it didn’t seem to make any difference, so I put them back.

Addons screen: (screenshot)
Note: Frigate NVR, Google Assistant Webserver, SSH & Web Terminal and Terminal & SSH are installed but not started.

Integrations screen: (screenshot)

Supervisor screen: (screenshot)

System screen: (screenshot of CPU graph)
I included this screen to show what is happening with the system at crash time. I had a crash today at about 8:30, and you can see the flat period where there are no values for about an hour. There doesn’t seem to be any spike in activity in the system prior to this crash.

The setup

Same problem here. Pi3b+ 64 bit HAS latest version

I’ve the same problem

I think no one currently understands the issue. There are many open GitHub issues about Pi lockups and none is close to being resolved. A spreadsheet was added to issue 1119 on the freezing, and many users added their configurations. Nothing seems out of the ordinary. Most users are not seeing any memory or CPU issues, just freezing after anywhere from several hours to a couple of days. I can run my identical setup on OS 5.4 and it has never failed.

Google Cast was the issue for me. Disabled it, and I’ve been stable for 2 days, when I was crashing every hour or so.

@muzzak123 thanks for summarizing the situation, I do agree with your assessment. The problem with this kind of issue is that if none of the core devs hit it, it usually takes longer and needs more coordination to get it fixed 😦

“There also needs to be someone or group on the development team who is willing to work with us users to try to resolve this issue and provide possible solutions or fixes to try.”

I am tracking the problem off and on, and I am trying to help isolate the problem in various issues.

Please let’s move the discussion to the appropriate issues in the OS repository, such as https://github.com/home-assistant/operating-system/issues/1119.

@CarlosGS fully agreed. I was just reacting to frenck’s previous answers, which were more like “your need is bs I don’t wanna hear it and it will never be considered” than “okay maybe there is something to do here, we’ll see if it’s feasible and how to plan it. In the meantime, I can help differently etc…” Anyway, that point being made, no need to escalate 😃

Wow - it is somewhat reassuring to know I’m not alone. It does seem quite widespread, and the timeline on https://github.com/home-assistant/operating-system/issues/1119 seems to fit with my issue occurring around the 2020.12.0 update. I’m wondering if there is some way to consolidate all these threads so others can contribute to resolving it as well? It seems they are recommending a downgrade to HassOS 5.4 as a temporary fix. I might try that.

The ESPHome part is interesting. I only have one physical ESP32-WROOM device. In HA I disabled 2 GPIOs that I had configured in ESPHome but am not using, and it seems the integration has split it into 2 devices (screenshot: gpio). The top one contains the disabled entities and the bottom one contains the enabled entities. The YAML is here: gpio.txt

@frenck yes, I meant add-on, sorry for mixing up the terminology, I am new to the project. It is now clear to me that I was in a separate container after logging in. For now, I do not understand why SSH runs in a separate container, which renders it essentially useless until parts of the filesystem are explicitly shared with the host (logs would be one of the first things to share, otherwise why use SSH at all?). But I will read the docs first rather than asking stupid things here in the comments. I just did not think I would run into such trouble right out of the box with my default installation. I will try to get the logs from the host filesystem and keep you posted. Thank you for your help.

Prior to a HASS.OS patch late last year, I used to see the console on the screen. After that, all I’d see would be the login prompt. I have no idea why that changed. The console just stopped displaying.

It’s kind of all academic now anyway. It seems to be rock solid on the NUC. No crashes. I have saved that link for the future should I ever need it.

@frenck thanks for the suggestion. It is a pity that the systemd logs are not available to users via the HTTP frontend and the SSH plugin (why? I set up root access via the plugin but did not check whoami though). I was sure that no logs were retained, and had thought about setting up remote logging with syslogd. I will set up true SSH access, or mount and check my SD card, when I get back home.

Logs are there, and multiple ways have already been given on how to get them. We need logs for this issue report; without them, there is nothing we can do.

See my previous comment on HDMI.

At this point, I dunno what crashed. The Home Assistant logs, for example, are in the configuration folder. For OS issues, you could check the debug console to see if anything is visible (or maybe even displayed on the HDMI port).
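For anyone who does pull the Home Assistant log out of the configuration folder, one rough way to narrow down when a freeze happened is to look for long silent periods between timestamped lines. The following is only a minimal sketch, not an official tool: the helper name find_gaps and the 10-minute threshold are made up for illustration, and it assumes each log entry starts with a timestamp like "2020-12-15 08:30:01". As noted earlier in the thread, the crash itself may never reach the log, and a quiet system can also produce gaps, so a gap only hints at when logging stopped.

```python
import re
from datetime import datetime, timedelta

# Assumption: log entries begin with "YYYY-MM-DD HH:MM:SS"; anything after
# the seconds (e.g. milliseconds, level, logger name) is ignored.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def find_gaps(lines, min_gap=timedelta(minutes=10)):
    """Return (last_line_before_gap, gap_length) for each silent period.

    `find_gaps` and `min_gap` are hypothetical names for this sketch.
    """
    gaps = []
    prev_time = None
    prev_line = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            # Continuation lines such as tracebacks carry no timestamp.
            continue
        current = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if prev_time is not None and current - prev_time > min_gap:
            gaps.append((prev_line.rstrip("\n"), current - prev_time))
        prev_time, prev_line = current, line
    return gaps
```

To use it, open the log file (for example home-assistant.log in the configuration folder) and pass the open file object to find_gaps; each result is the last line written before the silence and how long the silence lasted.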