kairos: k3s fails to start on Raspberry Pi
Kairos version:
NAME="openSUSE Leap"
VERSION="15.5"
ID="opensuse-leap"
ID_LIKE="suse opensuse"
VERSION_ID="15.5"
PRETTY_NAME="openSUSE Leap 15.5"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:leap:15.5"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://www.opensuse.org/"
DOCUMENTATION_URL="https://en.opensuse.org/Portal:Leap"
LOGO="distributor-logo-Leap"
KAIROS_NAME="kairos-opensuse-leap-arm-rpi"
KAIROS_VERSION="v2.3.0-k3sv1.27.3+k3s1"
KAIROS_ID="kairos"
KAIROS_ID_LIKE="kairos-opensuse-leap-arm-rpi"
KAIROS_VERSION_ID="v2.3.0-k3sv1.27.3+k3s1"
KAIROS_PRETTY_NAME="kairos-opensuse-leap-arm-rpi v2.3.0-k3sv1.27.3+k3s1"
KAIROS_BUG_REPORT_URL="https://github.com/kairos-io/kairos/issues/new/choose"
KAIROS_HOME_URL="https://github.com/kairos-io/provider-kairos"
KAIROS_IMAGE_REPO="quay.io/kairos/kairos-opensuse-leap-arm-rpi"
KAIROS_IMAGE_LABEL="latest"
KAIROS_GITHUB_REPO="kairos-io/provider-kairos"
KAIROS_VARIANT="kairos"
CPU architecture, OS, and Version: Linux yak-001 5.14.21-150500.53-default #1 SMP PREEMPT_DYNAMIC Wed May 10 07:56:26 UTC 2023 (b630043) aarch64 aarch64 aarch64 GNU/Linux
Describe the bug
K3s fails to start on Raspberry Pi (I have tried both Ubuntu-based and openSUSE-based images) due to an error writing to /var/lib/rancher/k3s/data.
To Reproduce
- Install an extracted or custom-built Raspberry Pi image onto an SD card
- Enable k3s in a cloud-init file
- Power on the Raspberry Pi
Expected behavior
K3s should start successfully.
Logs
sudo systemctl status k3s shows the service is stuck in the activating state, continually in an auto-restart loop:
● k3s.service - Lightweight Kubernetes
Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/k3s.service.d
└─override.conf
Active: activating (auto-restart) (Result: exit-code) since Tue 2023-07-11 02:25:08 UTC; 776ms ago
Docs: https://k3s.io
Process: 2096 ExecStartPre=/bin/sh -xc ! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service (code=exited, status=0/SUCCESS)
Process: 2098 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
Process: 2099 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
Process: 2100 ExecStart=/usr/bin/k3s server (code=exited, status=1/FAILURE)
Main PID: 2100 (code=exited, status=1/FAILURE)
sudo journalctl -u k3s shows a failure loop caused by k3s being unable to extract data into /var/lib/rancher/k3s/data:
systemd[1]: Starting Lightweight Kubernetes...
sh[3096]: + /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service
sh[3097]: Failed to get unit file state for nm-cloud-setup.service: No such file or directory
k3s[3100]: time="2023-07-11T02:34:23Z" level=info msg="Acquiring lock file /var/lib/rancher/k3s/data/.lock"
k3s[3100]: time="2023-07-11T02:34:23Z" level=info msg="Preparing data dir /var/lib/rancher/k3s/data/8147fdcc81517672a3573345f56cc1fc8eb>
k3s[3100]: time="2023-07-11T02:34:23Z" level=info msg="error extracting tarball into /var/lib/rancher/k3s/data/8147fdcc81517672a3573345>
k3s[3100]: time="2023-07-11T02:34:24Z" level=fatal msg="extracting data: error writing to /var/lib/rancher/k3s/data/8147fdcc81517672a35>
systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: k3s.service: Failed with result 'exit-code'.
systemd[1]: Failed to start Lightweight Kubernetes.
systemd[1]: k3s.service: Scheduled restart job, restart counter is at 139.
systemd[1]: Stopped Lightweight Kubernetes.
systemd[1]: Starting Lightweight Kubernetes...
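The journal lines above end in ">", which is where the pager cut them off, so the actual reason for the write failure is not visible. As a minimal sketch (the helper name last_fatal is hypothetical, not part of k3s or Kairos), the following filter keeps only the most recent fatal line from whatever log text it is given on stdin:

```shell
#!/bin/sh
# Hypothetical helper: read k3s log lines on stdin and print only the
# most recent one logged at level=fatal.
last_fatal() {
    grep 'level=fatal' | tail -n 1
}
```

It could be fed with: sudo journalctl -u k3s --no-pager -o cat | last_fatal. When journalctl writes to a pipe it does not truncate lines, so the full error text should appear.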
Note: I tried this previously in Kairos 2.2.1 images for both openSUSE and Ubuntu. I wanted to try it on the latest release today before filing to make sure it wasn’t already addressed.
About this issue
- State: closed
- Created a year ago
- Comments: 41 (29 by maintainers)
This is expected as you are checking the version of kairos-agent.
For the Kairos version you need to check /etc/os-release
Kairos-agent and Kairos have different release cadences. Their version numbers currently line up more or less by coincidence, but they can diverge entirely, so the agent could report version 2.2.12 on Kairos version 2.1, for example 😃
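To make the distinction concrete, here is a small sketch of reading the Kairos version from /etc/os-release rather than from the agent. The function name kairos_version and its optional path argument are assumptions added for illustration; only the KAIROS_VERSION field itself comes from the os-release output in the report.

```shell
#!/bin/sh
# Hypothetical helper: print the KAIROS_VERSION field of an os-release
# style file. The path argument is an assumption for testing and
# defaults to /etc/os-release.
kairos_version() {
    file="${1:-/etc/os-release}"
    # os-release files are shell-compatible assignments, so source the
    # file in a subshell to keep its variables out of the caller.
    ( . "$file" && printf '%s\n' "$KAIROS_VERSION" )
}
```

On the image from the report, calling kairos_version with no argument should print the same value shown in the KAIROS_VERSION line of the os-release dump.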
It could be that Alpine is missing a binary in the initramfs that is needed to expand the partition.
@mauromorales this should now be fixed on master with model rpi4, if the reason was not enough space under persistent.
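One way to test the "not enough space" hypothesis on an affected board is to look at the free space of the filesystem backing the k3s data directory. This is only a sketch: the helper names avail_kb and has_space are hypothetical, and the 1 GiB threshold in the usage note is an arbitrary assumption, not a documented k3s requirement.

```shell
#!/bin/sh
# Hypothetical helper: print the free space, in KiB, of the filesystem
# that holds the given path.
avail_kb() {
    # -P forces the POSIX one-line-per-filesystem format;
    # -k reports sizes in 1024-byte blocks. Column 4 is "Available".
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: succeed only if the filesystem holding path $1
# has at least $2 KiB free.
has_space() {
    [ "$(avail_kb "$1")" -ge "$2" ]
}
```

For example, has_space /var/lib/rancher/k3s 1048576 || echo "less than 1 GiB free" would flag a nearly full partition before k3s tries to extract its data.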
Until this is fixed, you can run the command manually (annoying but only needs to be done once)
rd.break=initqueue
sgdisk -g -d=4 -n=4:21635072:+0 -c=4:Linux\ filesystem -t=4:8300 /dev/mmcblk0
Use -P to dry run before running it without.
On next boot you should see something like
I think the issue might be related to the size of the persistent partition: