kubernetes: kubeadm init --kubernetes-version=v1.8.0 fails with connection refused for Get http://localhost:10255/healthz

Is this a BUG REPORT or FEATURE REQUEST?:

BUG REPORT

What happened: kubeadm init for an initial cluster on version 1.8 fails with the error:

[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.

What you expected to happen: kubeadm should initialize the cluster normally

How to reproduce it (as minimally and precisely as possible):

Lab description:

VM Name          Role      IP Address
kubernetes-ms    Master    10.38.14.200
kubernetes-1     Worker    10.38.14.201
kubernetes-2     Worker    10.38.14.202

  1. Set up docker/kubectl/kubeadm/kubelet version 1.8.0 on all nodes
  2. On "kubernetes-ms", run:

kubeadm init --kubernetes-version=v1.8.0 --pod-network-cidr=10.244.0.0/16 --token 8c2350.f55343444a6ffc46

Result:

[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.8.0
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks
[preflight] WARNING: docker version is greater than the most recently validated version. Docker version: 17.09.0-ce. Max validated version: 17.03
[preflight] WARNING: Running with swap on is not supported. Please disable swap or set kubelet's --fail-swap-on flag to false.
[kubeadm] WARNING: starting in 1.8, tokens expire after 24 hours by default (if you require a non-expiring token use --token-ttl 0)
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [kubernetes-ms kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.38.14.200]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests"
[init] This often takes around a minute; or longer if the control plane images have to be pulled.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp [::1]:10255: getsockopt: connection refused.
(the two [kubelet-check] lines above repeat against /healthz and /healthz/syncloop until the timeout)
…

Unfortunately, an error has occurred: timed out waiting for the condition

This error is likely caused by that:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
- There is no internet connection; so the kubelet can't pull the following control plane images:
  - gcr.io/google_containers/kube-apiserver-amd64:v1.8.0
  - gcr.io/google_containers/kube-controller-manager-amd64:v1.8.0
  - gcr.io/google_containers/kube-scheduler-amd64:v1.8.0

   You can troubleshoot this for example with the following commands if you're on a systemd-powered system:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'

couldn’t initialize a Kubernetes cluster
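For reference, a minimal way to dig out the underlying kubelet failure on a systemd host, using the commands suggested in the output above (the grep pattern is only an illustration of strings worth searching for, not output from this report):

$ systemctl status kubelet --no-pager
$ journalctl -u kubelet --no-pager | tail -n 50
$ journalctl -u kubelet --no-pager | grep -E 'failed to run Kubelet|cgroup driver|swap'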

Anything else we need to know?: When we install Kubernetes version 1.7.0 instead, this does not happen and we can initialize the cluster normally. Trying Kubernetes on Google Cloud also works smoothly.

Environment:

  • Kubernetes version (use kubectl version): 1.8.0

  • Cloud provider or hardware configuration: On-premise with VMware ESXi 5.5 (VM guest OS: Ubuntu 16.04)

  • OS (e.g. from /etc/os-release): Ubuntu 16.04

  • Kernel (e.g. uname -a): Linux kubernetes-ms 4.4.0-96-generic #119-Ubuntu SMP Tue Sep 12 14:59:54 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

  • Install tools:

  • Others:

@k8s-sig-onprem-bugs

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Comments: 69 (25 by maintainers)

Most upvoted comments

@praparn As of Kubernetes release 1.8.0, the kubelet will not run with swap enabled.

You have two choices: either disable swap, or add a flag to the kubelet so it keeps working with swap enabled. Something like this:

# echo 'Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false"' > /etc/systemd/system/kubelet.service.d/90-local-extras.conf
# systemctl daemon-reload
# systemctl restart kubelet
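The other option, disabling swap, can also be made persistent across reboots. A minimal sketch, assuming swap is declared in /etc/fstab (the sed pattern is illustrative; check the file before editing it):

# swapoff -a
# sed -i '/ swap / s/^/#/' /etc/fstab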

@kad you are correct; after configuring the CNI plugin, my cluster works fine now.

Log:

root@kubenode:# ku get no
NAME        STATUS   ROLES    AGE   VERSION
kubenode    Ready    master   1h    v1.8.0
kubenode2   Ready    <none>   1h    v1.8.0

======== To summarise:

Delete the nodes if you have any, then:

  • kubeadm reset

  • add Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false" to /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

  • systemctl daemon-reload

  • systemctl restart kubelet

  • kubeadm init

This works for me. If the status is NotReady, check the kubelet log to see whether it is a CNI plugin issue.

==== I realised that adding "--skip-preflight-checks" may be a bad idea; the preflight checks are a useful way to locate the issue.

Had the same issue; worked for me after disabling the swap completely.

  1. turned off the swap (swapoff -a)
  2. kubeadm reset
  3. kubeadm init (success!!)

We have successfully prevented this bug by turning off the swap on Ubuntu 16.04 amd64

$ iptables -F
$ swapoff -a
$ free -m
$ kubeadm reset
$ kubeadm init --kubernetes-version v1.8.0 --pod-network-cidr=10.244.0.0/16

@kad kubeadm reset

add Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false" to /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

systemctl daemon-reload

systemctl restart kubelet

kubeadm init --skip-preflight-checks

I am having this same issue with swap disabled beforehand. It appears that the API server is never started, so all the calls to it fail.

@sbidaye82 OK, now the reason is clear.

Error which we were looking for is here:

Oct 26 07:27:30 salil-master kubelet[28086]: error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"

Please edit your /etc/systemd/system/kubelet.service.d/90-local-extras.conf file so that it contains:

[Service]
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false"

then run systemctl daemon-reload to reload the service files, and try the init process again.
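If you are not sure which side of the mismatch to change, you can check Docker's driver directly; an alternative fix is to switch Docker to the systemd driver so it matches the kubelet. This is a sketch under the assumption that /etc/docker/daemon.json does not already exist (merge by hand if it does):

# docker info | grep -i 'cgroup driver'
# cat > /etc/docker/daemon.json <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
# systemctl restart docker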

This has been bothering me as well. Can someone point to a URL or document that explains the reasoning or background behind the decision to make disabling swap a requirement? I've seen this question asked in Slack as well, with no answers.

Disable swap, then follow this sequence:

kubeadm reset
systemctl restart kubelet
kubeadm init --skip-preflight-checks

root@kubenode:~# ku get no
NAME       STATUS     ROLES    AGE   VERSION
kubenode   NotReady   master   6m    v1.8.0

I get a NotReady cluster.

New issue: it fails with "Container runtime network not ready: NetworkReady=false"
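That NotReady / NetworkReady=false state usually clears once a pod network add-on is installed. For example, with the 10.244.0.0/16 CIDR used above, Flannel is a common choice; the manifest URL below is the one commonly referenced at the time and may have moved since:

$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
$ kubectl get nodes    # should move to Ready once the flannel pods are up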

@kad hope is eternal 😃

@kad I tried all the methods you suggested, but I get the same error.

It's a chicken-and-egg situation: the kubelet can't start before kubeadm init generates a bunch of stuff into the /etc/kubernetes folder (e.g. PKI certs), yet the health checks in kubeadm init require the kubelet to be running. However, if you start the kubelet before kubeadm init, it will initially fail and then go into a crash loop, and if you are lucky timing-wise, the kubelet may come up right after kubeadm init generates its files/certs and before the health checks run. Pretty messy. I would think the solution would be to have kubeadm init start the kubelet before the health checks run. This issue should not be closed.
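One way to watch this race in practice is to follow the kubelet journal in a second terminal while kubeadm init runs; both the crash-loop restarts and the moment the kubelet finally comes up are visible there:

# terminal 1
$ kubeadm init --kubernetes-version=v1.8.0 --pod-network-cidr=10.244.0.0/16

# terminal 2: follow the kubelet as it crash-loops and (hopefully) recovers
$ journalctl -fu kubelet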

I also hit this issue and tried many methods, but could not solve it.

@kad

It worked, appreciate your pointers. Apparently KUBELET_CGROUP_ARGS was not part of ExecStart, hence it was not taking effect on init. After adding it to ExecStart it worked fine.
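For context, the kubeadm drop-in of that era wired these environment variables into ExecStart roughly like this (an abridged sketch, not the exact shipped file); if a variable such as $KUBELET_CGROUP_ARGS is missing from the ExecStart line, setting it in an Environment= entry has no effect:

[Service]
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_CGROUP_ARGS $KUBELET_EXTRA_ARGS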

@praparn Have you noticed the warning "[preflight] WARNING: Running with swap on is not supported. Please disable swap or set kubelet's --fail-swap-on flag to false."?

Did you disable swap, or did you modify the kubelet parameters?