rancher: Rancher crashes with the error [FATAL] k3s exited with: exit status 255
What kind of request is this (question/bug/enhancement/feature request): bug
Steps to reproduce (least amount of steps as possible):
- run rancher:master-head ec04f78, docker install (a typical install command is sketched after this list)
- add a cluster using EC2, k8s v1.19.2-rancher1-1
- do some operations in the cluster, like installing/uninstalling monitoring v2
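The exact install command is not quoted in the report; assuming the standard single-node docker install pattern from the Rancher docs (my assumption, not stated above), it would look roughly like this for a master-head build:

docker run -d --restart=unless-stopped -p 80:80 -p 443:443 --privileged rancher/rancher:master-head
# --privileged is what the embedded k3s needs in the 2.5+ single-docker install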
Result:
- Rancher crashes with the following error in its logs
WARNING: 2020/09/28 22:35:50 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2020-09-28 22:35:50.392320 W | etcdserver: read-only range request "key:\"/registry/apiextensions.k8s.io/customresourcedefinitions\" range_end:\"/registry/apiextensions.k8s.io/customresourcedefinitiont\" count_only:true " with result "error:context canceled" took too long (1.438297145s) to execute
WARNING: 2020/09/28 22:35:50 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
WARNING: 2020/09/28 22:35:50 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
WARNING: 2020/09/28 22:35:50 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
WARNING: 2020/09/28 22:35:50 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2020-09-28 22:35:50.392448 W | etcdserver: read-only range request "key:\"/registry/configmaps/fleet-system/gitjob\" " with result "error:context canceled" took too long (1.214651514s) to execute
WARNING: 2020/09/28 22:35:50 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2020-09-28 22:35:50.392494 W | etcdserver: read-only range request "key:\"/registry/namespaces/kube-system\" " with result "error:context canceled" took too long (1.248120392s) to execute
WARNING: 2020/09/28 22:35:50 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2020-09-28 22:35:50.392539 W | etcdserver: read-only range request "key:\"/registry/controllers\" range_end:\"/registry/controllert\" count_only:true " with result "error:context canceled" took too long (1.26574346s) to execute
WARNING: 2020/09/28 22:35:50 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
W0928 22:35:50.397496 6 reflector.go:425] pkg/mod/github.com/rancher/client-go@v1.19.0-rancher.1/tools/cache/reflector.go:157: watch of *summary.SummarizedObject ended with: very short watch: pkg/mod/github.com/rancher/client-go@v1.19.0-rancher.1/tools/cache/reflector.go:157: Unexpected watch close - watch lasted less than a second and no items received
...
...
W0928 22:35:50.404731 6 reflector.go:425] pkg/mod/github.com/rancher/client-go@v1.19.0-rancher.1/tools/cache/reflector.go:157: watch of *summary.SummarizedObject ended with: very short watch: pkg/mod/github.com/rancher/client-go@v1.19.0-rancher.1/tools/cache/reflector.go:157: Unexpected watch close - watch lasted less than a second and no items received
2020/09/28 22:35:50 [FATAL] k3s exited with: exit status 255
2020/09/28 22:35:53 [INFO] Rancher version ec04f7878 (ec04f7878) is starting
Here are the full logs of the setup: logs.txt
About this issue
- State: open
- Created 4 years ago
- Reactions: 3
- Comments: 29 (4 by maintainers)
I had the same issue on Debian Bullseye with the 5.10.70 kernel. Stranger, if you are still looking for how to resolve that issue, try the next steps (it works for me, with the latest rancher/rancher):
Edit the GRUB config:
sudo nano /etc/default/grub
Then set/append the following to the GRUB_CMDLINE_LINUX variable: cgroup_memory=1 cgroup_enable=memory swapaccount=1 systemd.unified_cgroup_hierarchy=0
GRUB_CMDLINE_LINUX="cgroup_memory=1 cgroup_enable=memory swapaccount=1 systemd.unified_cgroup_hierarchy=0"
^ (I didn't test with any single option, so maybe some options are not necessary.)
Save the file. Then update GRUB:
sudo update-grub
And reboot:
sudo reboot
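Not part of the original comment: a generic way to confirm the parameters took effect after the reboot is to check the running kernel command line and what Docker reports for cgroups:

cat /proc/cmdline                          # should now include cgroup_enable=memory and the other options
docker info 2>/dev/null | grep -i cgroup   # shows the cgroup driver/version Docker is using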
It's now June and NO ONE at Rancher has a clue what to do about their product crashing on restart? All I did was restart the VM; there is NO reason why this should happen. 😦
I have the same problem
stable does not work for me either
Just updated to Pop!_OS 21.10 and now Rancher does not work anymore. I was getting a lot of k3s exits and cannot connect to http://127.0.0.1:6443 errors. Reinstalling Rancher failed too. Seemed related …
Pop!_OS uses kernelstub, not grub, so thanks to that guy I fixed it with:
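The commenter's actual command was not captured above. Purely as an illustration (my reconstruction, not the original fix), kernelstub's --add-options flag can append the same kernel parameters the GRUB-based fix above uses:

sudo kernelstub -a "cgroup_memory=1 cgroup_enable=memory swapaccount=1 systemd.unified_cgroup_hierarchy=0"   # hypothetical reconstruction, not the commenter's exact command
sudo reboot
cat /proc/cmdline   # verify the parameters are present after the reboot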
I am unable to run rancher/rancher:latest on Ubuntu 20.04 LTS on DigitalOcean.
I was able to run rancher/server:stable (the v1.6 version), but half of the DigitalOcean integrations don't work. output.log
Switching to the “stable” version worked for me.
I hit the fatal error in another rancher:master-head ec04f78 docker install setup; the only thing I did in the setup was to provision and delete an RKE cluster.
one crash
and another one
Hi friend, the year is 2022 and I'm going through the same.
I'm getting the same error using (2.6) rancher:latest, rancher:stable, and rancher:v2.5.16. I do NOT get this error using rancher:v2.4.18. I'm using a clean Ubuntu Server 22.04 VM on Proxmox, with Docker 20.10.17 installed using a modified version number in the Rancher docker install script. 8 GB RAM, 4 CPUs, 100 GB disk.
Same here, using this command on a vagrant ubuntu/focal64 box:
docker run -d --privileged --name rancher-server --restart=unless-stopped -p 8080:80 -p 8443:443 -e CATTLE_BOOTSTRAP_PASSWORD=XXX -v /opt/rancher:/var/lib/rancher rancher/rancher:v2.6.0
Last lines of log:
Had the same issue and v2.6.0-rc10 worked fine
I'm receiving this error with all versions of Rancher v2.5+ using the stock startup script (fresh install).
I've tried:
sudo docker run --privileged -d --restart=unless-stopped -p 80:80 -p 443:443 rancher/rancher
sudo docker run --privileged -d --restart=unless-stopped -p 80:80 -p 443:443 rancher/rancher:stable
sudo docker run --privileged -d --restart=unless-stopped -p 80:80 -p 443:443 rancher/rancher:latest
sudo docker run --privileged -d --restart=unless-stopped -p 80:80 -p 443:443 rancher/rancher:v2.5.0
sudo docker run --privileged -d --restart=unless-stopped -p 80:80 -p 443:443 rancher/rancher:v2.5.1
sudo docker run --privileged -d --restart=unless-stopped -p 80:80 -p 443:443 rancher/rancher:v2.5.9
…etc. All of them fail out of the box. v2.4.17 seems to start up fine.
EDIT: v2.6-head works for me as well.
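Not from the original comments: when cycling through tags like this, the crash signature can be pulled straight from the container output. A minimal check (substitute your own container ID or name; rancher-server is just the example name from an earlier comment) would be:

docker logs rancher-server 2>&1 | grep -E "FATAL|k3s exited"   # surfaces the "k3s exited with: exit status 255" line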
[exit status 255] How do I adjust the etcd parameters (heartbeat-interval, election-timeout)?
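For reference only, and not confirmed for the etcd embedded in Rancher's k3s: on a standalone etcd these two parameters are ordinary flags, with matching ETCD_* environment variables. How (or whether) they can be passed through to the etcd running inside the Rancher container is not answered in this thread.

# standalone etcd example; the values are placeholders, not tuning advice
etcd --heartbeat-interval=500 --election-timeout=5000
# equivalent environment variables read by etcd:
#   ETCD_HEARTBEAT_INTERVAL=500
#   ETCD_ELECTION_TIMEOUT=5000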