kubernetes: kubeadm init --kubernetes-version=v1.8.0 fails with connection refused for Get http://localhost:10255/healthz
Is this a BUG REPORT or FEATURE REQUEST?:
BUG REPORT
What happened: kubeadm init for an initial cluster on version 1.8 fails with this error:
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.
What you expected to happen: kubeadm should initialize the cluster normally.
How to reproduce it (as minimally and precisely as possible):
Lab description:

| VM Name | Role | IP Address |
| --- | --- | --- |
| kubernetes-ms | Master | 10.38.14.200 |
| kubernetes-1 | Worker | 10.38.14.201 |
| kubernetes-2 | Worker | 10.38.14.202 |
- Set up docker/kubectl/kubeadm/kubelet version 1.8.0 on all nodes
- On "kubernetes-ms", run:
  kubeadm init --kubernetes-version=v1.8.0 --pod-network-cidr=10.244.0.0/16 --token 8c2350.f55343444a6ffc46
Result:

[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.8.0
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks
[preflight] WARNING: docker version is greater than the most recently validated version. Docker version: 17.09.0-ce. Max validated version: 17.03
[preflight] WARNING: Running with swap on is not supported. Please disable swap or set kubelet's --fail-swap-on flag to false.
[kubeadm] WARNING: starting in 1.8, tokens expire after 24 hours by default (if you require a non-expiring token use --token-ttl 0)
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [kubernetes-ms kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.38.14.200]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests"
[init] This often takes around a minute; or longer if the control plane images have to be pulled.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp [::1]:10255: getsockopt: connection refused.
(the checks above repeat several times, alternating between /healthz and /healthz/syncloop)
...

Unfortunately, an error has occurred: timed out waiting for the condition

This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
- There is no internet connection, so the kubelet can't pull the following control plane images:
  - gcr.io/google_containers/kube-apiserver-amd64:v1.8.0
  - gcr.io/google_containers/kube-controller-manager-amd64:v1.8.0
  - gcr.io/google_containers/kube-scheduler-amd64:v1.8.0
You can troubleshoot this for example with the following commands if you're on a systemd-powered system:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
couldn't initialize a Kubernetes cluster
Anything else we need to know?: When we install Kubernetes version 1.7.0 instead, this does not happen and we can initialize the cluster normally. Trying Kubernetes on Google Cloud also works smoothly.
Environment:
- Kubernetes version (use kubectl version): 1.8.0
- Cloud provider or hardware configuration: On-premise with VMware ESXi 5.5 (VM guest OS: Ubuntu 16.04)
- OS (e.g. from /etc/os-release): Ubuntu 16.04
- Kernel (e.g. uname -a): Linux kubernetes-ms 4.4.0-96-generic #119-Ubuntu SMP Tue Sep 12 14:59:54 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
- Install tools:
- Others:
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Comments: 69 (25 by maintainers)
@praparn As of Kubernetes 1.8.0, the kubelet will not run with swap enabled.
You have two choices: either disable swap, or add a flag to the kubelet so it keeps working with swap enabled. Something like this:
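A minimal sketch of such a systemd drop-in. The flag itself, --fail-swap-on=false, also appears later in this thread; the exact drop-in file path is the one the kubeadm package of that era installed, and may differ on other distributions:

```ini
# /etc/systemd/system/kubelet.service.d/10-kubeadm.conf (excerpt, sketch)
# Only the Environment line below is added; run `systemctl daemon-reload`
# and `systemctl restart kubelet` afterwards.
[Service]
Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false"
```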
@kad you are correct. After configuring the CNI plugin, my cluster works fine now.
Log:

root@kubenode:~# ku get no
NAME        STATUS   ROLES    AGE   VERSION
kubenode    Ready    master   1h    v1.8.0
kubenode2   Ready    <none>   1h    v1.8.0
======== To summarise:

1. Delete the nodes if you have any.
2. kubeadm reset
3. Add Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false" to /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
4. systemctl daemon-reload
5. systemctl restart kubelet
6. kubeadm init

This works for me. If the node status is NotReady, check the kubelet log to see whether it is a CNI plugin issue.

==== I realized that adding "--skip-preflight-checks" may be a bad idea; the preflight checks are a useful way to locate issues.
Had the same issue; it worked for me after disabling swap completely.
We have successfully prevented this bug by turning off swap on Ubuntu 16.04 amd64.
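For reference, turning swap off on Ubuntu involves two steps: disabling it for the running system (`swapoff -a`, as root) and commenting it out of /etc/fstab so it stays off after a reboot. A minimal sketch of the fstab edit, rehearsed here on a sample file; the sed pattern assumes a typical fstab where swap entries contain the word "swap" surrounded by whitespace, so check the file before and after:

```shell
# Sample fstab standing in for /etc/fstab (on the real node you would run
# `swapoff -a` first, then apply the same sed to /etc/fstab as root).
printf '/dev/sda1 / ext4 defaults 0 1\n/dev/sda2 none swap sw 0 0\n' > /tmp/fstab.demo

# Comment out swap entries so swap stays disabled after reboot;
# -i.bak keeps a backup copy of the original file.
sed -i.bak '/\sswap\s/s/^/#/' /tmp/fstab.demo

grep swap /tmp/fstab.demo   # prints: #/dev/sda2 none swap sw 0 0
```

With swap off, the kubelet no longer trips over --fail-swap-on and the :10255 health check should stop refusing connections.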
@kad
kubeadm reset
Add Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false" to /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
systemctl daemon-reload
systemctl restart kubelet
kubeadm init --skip-preflight-checks
I am having this same issue with swap disabled beforehand. It appears that the API server is never started, so all the calls to it fail.
@sbidaye82 OK, now the reason is clear.
The error we were looking for is here:
Please edit your /etc/systemd/system/kubelet.service.d/90-local-extras.conf file to have in it: then do
systemctl daemon-reload
to reload the service files, and then try to start the init process again.

This has been bothering me as well. Can someone point to a URL or document that explains the reasoning or background behind the decision to make swap-off a requirement? I've seen this question asked in Slack as well, with no answers.
Disable the swap, then follow the sequence:
kubeadm reset
systemctl restart kubelet
kubeadm init --skip-preflight-checks

root@kubenode:~# ku get no
NAME       STATUS     ROLES    AGE   VERSION
kubenode   NotReady   master   6m    v1.8.0

Got a NotReady cluster.
New issue: failed on "Container runtime network not ready: NetworkReady=false"
@kad hope is eternal 😃
@kad I tried every method you suggested, but I get the same error.
It's a chicken-and-egg situation: kubelet can't start before kubeadm init generates a bunch of stuff into the /etc/kubernetes folder (e.g. PKI certs), yet the health checks in kubeadm init require kubelet to be running. However, if you try starting kubelet before kubeadm init, it will initially fail and then go into a crash loop, and if you are lucky timing-wise, kubelet may have started up (right after kubeadm init generates its files/certs) before the health checks run. Pretty messy. I would think the solution would be to have kubeadm init start kubelet before the health checks run. This issue should not be closed.
I also hit this problem and tried many methods, but could not solve it.
@kad
It worked. Appreciate your pointers. Apparently KUBELET_CGROUP_ARGS was not part of ExecStart, hence it was not taking effect on init. After adding it to ExecStart it worked fine.
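For readers hitting the same symptom: the fix above amounts to making sure the environment variable is actually referenced on the kubelet's ExecStart line, since systemd does not pass Environment= values to flags by itself. A sketch of what the drop-in might look like; the --cgroup-driver value and the other $KUBELET_* variables here are assumptions and must match your own drop-in and what `docker info` reports for the cgroup driver:

```ini
# /etc/systemd/system/kubelet.service.d/10-kubeadm.conf (excerpt, sketch)
[Service]
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs"
# $KUBELET_CGROUP_ARGS has no effect unless it appears in ExecStart.
# The empty ExecStart= clears the previous definition before overriding it.
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CGROUP_ARGS $KUBELET_EXTRA_ARGS
```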
@MengZn @MarkDeckert
You may want to read this PR. https://github.com/kubernetes/kubernetes/pull/47181
@praparn Have you noticed the warning "[preflight] WARNING: Running with swap on is not supported. Please disable swap or set kubelet's --fail-swap-on flag to false."?
Did you disable swap, or did you modify the kubelet parameters?