kubernetes: kubeadm 1.6.0 (only 1.6.0!!) is broken due to unconfigured CNI making kubelet NotReady
Initial report in https://github.com/kubernetes/kubeadm/issues/212.
I suspect that this was introduced in https://github.com/kubernetes/kubernetes/pull/43474.
What is going on (all on single master):
- kubeadm configures and starts a kubelet, then uses static pods to bring up a control plane
- kubeadm creates node object and waits for kubelet to join and be ready
- kubelet is never ready and so kubeadm waits forever
In the conditions list for the node:
Ready False Wed, 29 Mar 2017 15:54:04 +0000 Wed, 29 Mar 2017 15:32:33 +0000 KubeletNotReady runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
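For anyone reproducing this, the condition above comes from standard node inspection (the node name below is a placeholder):

```bash
kubectl get nodes
kubectl describe node <master-node-name>   # see the Conditions section
```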
Previous behavior was for the kubelet to join the cluster even with unconfigured CNI. The user will then typically run a DaemonSet with host networking to bootstrap CNI on all nodes. The fact that the node never joins means that, fundamentally, DaemonSets cannot be used to bootstrap CNI.
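As a sketch of that bootstrap pattern (every name and the image below are placeholders, not from this issue; the API group matches the 1.6 era):

```bash
kubectl apply -f - <<'EOF'
apiVersion: extensions/v1beta1   # DaemonSet API group in the 1.6 era
kind: DaemonSet
metadata:
  name: cni-bootstrap            # placeholder name
  namespace: kube-system
spec:
  template:
    metadata:
      labels:
        app: cni-bootstrap
    spec:
      hostNetwork: true          # must run before pod networking exists
      tolerations:
      - key: node-role.kubernetes.io/master   # also land on the master
        effect: NoSchedule
      containers:
      - name: install-cni
        image: example.org/cni-installer:0.1  # placeholder image
EOF
```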
Edit by @mikedanese: please test patched debian amd64 kubeadm https://github.com/kubernetes/kubernetes/issues/43815#issuecomment-290616036 with fix
About this issue
- State: closed
- Created 7 years ago
- Reactions: 5
- Comments: 211 (116 by maintainers)
Commits related to this issue
- WIP: Initial support for 1.6 1.6 final is out, but there's still issues with kubeadm (https://github.com/kubernetes/kubernetes/issues/43815), this has a patched version just for testing. Still to do... — committed to kensimon/aws-quickstart by deleted user 7 years ago
- Merge pull request #43835 from mikedanese/kubeadm-fix Automatic merge from submit-queue don't wait for first kubelet to be ready and drop dummy deploy Per https://github.com/kubernetes/kubernetes/i... — committed to kubernetes/kubernetes by deleted user 7 years ago
- Merge pull request #43837 from mikedanese/automated-cherry-pick-of-#43835-release-1.6 Automatic merge from submit-queue Automated cherry pick of #43835 release 1.6 Automated cherry pick of #43835 r... — committed to kubernetes/kubernetes by deleted user 7 years ago
- vagrant: update to patched version of kubeadm 1.6.0 Fix for: https://github.com/kubernetes/kubernetes/issues/43815 from: https://github.com/kensimon/aws-quickstart/commit/9ae07f8d9de29c6cbca4624a61e7... — committed to obnoxxx/gluster-kubernetes by obnoxxx 7 years ago
- ATTEMPT: vagrant: update to patched version of kubeadm 1.6.0 Contains fix for: https://github.com/kubernetes/kubernetes/issues/43815 Signed-off-by: Michael Adam <obnox@redhat.com> — committed to obnoxxx/gluster-kubernetes by obnoxxx 7 years ago
- Merge pull request #43837 from mikedanese/automated-cherry-pick-of-#43835-release-1.6 Automatic merge from submit-queue Automated cherry pick of #43835 release 1.6 Automated cherry pick of #43835 r... — committed to mintzhao/kubernetes by deleted user 7 years ago
I’m trying to install kubernetes with kubeadm on Ubuntu 16.04. Is there a quick fix for this?
This is what I did:
kubeadm reset
Remove the ENV entries from:
/etc/systemd/system/kubelet.service.d/10-kubeadm.conf
Reload systemd and the kube services:
systemctl daemon-reload
systemctl restart kubelet.service
Re-run init:
kubeadm init
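A scripted version of the same dance, as a sketch (this removes only $KUBELET_NETWORK_ARGS, the entry identified later in this thread; adjust if you need to drop others):

```bash
kubeadm reset
# strip the network args from the kubelet drop-in
sed -i 's/\$KUBELET_NETWORK_ARGS //' \
    /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
systemctl daemon-reload
systemctl restart kubelet.service
kubeadm init
```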
“broken out of the box”, words learned today.
I am really surprised that the Kubernetes development community has not provided any ETA for an official fix. This is a horrible bug that should have been easily caught during testing. Since it wasn't, at the very least 1.6.1 should be pushed ASAP with the fix so people can stop hacking their clusters and get back to doing productive things 😉. Am I wrong here?
Any chance of building the same fix for CentOS as well? Our gating system mostly uses CentOS as the Kubernetes cluster base. If I have a CentOS version, I can guarantee roughly 100 runs of kubeadm init a day as testing.
@apsinha Are you aware of this thread? It might be good to have some product folks following, as I think there will be some important takeaways for the future.
Off the top of my head:
No disrespect intended to all the great people that make Kubernetes what it is. I’m just hoping there are some “teachable moments” here moving forward, as this looks bad in terms of the public perception of Kubernetes being reliable/stable. (Granted kubeadm is alpha/beta, but it’s still got lots of visibility.)
@overip you need to edit /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_EXTRA_ARGS
remove $KUBELET_NETWORK_ARGS
and then restart the kubelet. After that, kubeadm init should work.
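A sketch of the result plus the restart sequence (the ExecStart line is the one quoted above, minus the network args):

```bash
# Edited ExecStart line in 10-kubeadm.conf, with $KUBELET_NETWORK_ARGS removed:
#   ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS \
#       $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_EXTRA_ARGS
systemctl daemon-reload
systemctl restart kubelet.service
kubeadm init
```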
I successfully setup my Kubernetes cluster on centos-release-7-3.1611.el7.centos.x86_64 by taking the following steps (I assume Docker is already installed):
All the above steps are a result of combining suggestions from various issues around Kubernetes-1.6.0, especially kubeadm.
Hope this will save your time.
Is there any timeline for when this fix will be ported to the CentOS repository?
+1 to resolve it today as lots of efforts are wasted on dealing with collateral from the workaround.
@srzjulio you need to update RBAC rules, we used these to get us going:
apiVersion: rbac.authorization.k8s.io/v1alpha1
kind: ClusterRoleBinding
metadata:
  name: cluster-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
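For illustration, an equivalent one-liner (the system:authenticated subject is an assumption, not from this thread; note the warning below about how permissive such a binding is):

```bash
# Grants cluster-admin to every authenticated user; effectively disables RBAC.
kubectl create clusterrolebinding permissive-binding \
  --clusterrole=cluster-admin \
  --group=system:authenticated
```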
This doesn’t seem to be resolved (Ubuntu LTS, kubeadm 1.6.1).
First, I also experienced kubeadm hanging on "Created API client, waiting for the control plane to become ready" when using the --apiserver-advertise-address flag (the journal logs showed errors). If I don't provide this flag, kubeadm passes, but even then the kubelet fails on startup: it refuses to start properly, and I cannot connect to the cluster with kubectl in any way.
Guys, what is the status quo for the fix? Is it going to move to the stable repository anytime soon?
1.6.1 is out.
v1.6.1 is in the process of being released. It will be done by EOD.
Not sure if it is generally useful, but I have an ansible playbook that does all the steps for CentOS 7
https://github.com/sjenning/kubeadm-playbook
YMMV, but it at least documents the process. I also do a few things like switch docker to use json-file logging and overlay storage.
Might be useful as a reference even if you don’t actually run the playbook.
I imagine your weave isn’t deploying properly because you are using the pre-1.6 yaml file.
Try “kubectl apply -f https://git.io/weave-kube-1.6”
On a side note, you can put back $KUBELET_NETWORK_ARGS after the init on the master passes. I actually did not remove it on the machine I joined, only the cgroup-driver setting; otherwise kubelet and Docker won't work together.
But you don’t have to kubeadm reset, just change /etc/systemd/system/kubelet.service.d/10-kubeadm.conf and do the systemctl dance:
systemctl daemon-reload
systemctl restart kubelet.service
We worked around it by removing $KUBELET_NETWORK_ARGS from the kubelet command line. After that, kubeadm init worked fine and we were able to install the Canal CNI plugin.
@jbeda if you have a patched version happy to test it…
Be careful – The binding that @sbezverk has there is essentially turning off RBAC. You will have a super insecure cluster if you do that.
Here's what is seemingly working for me with the unstable repo (only tested the master itself). It does spit out error: taint "dedicated:" not found at one point, but it seems to carry on regardless.
For anyone still trying the temporary fix of removing the kubelet KUBELET_NETWORK_ARGS config line, @jc1arke found a simpler workaround: open two sessions to the new master and, while waiting for the first node to become ready, apply a node-network config in the second session. First session: run kubeadm init. Second session: apply the network config (using Calico here; your choice may of course vary). Then back to the first session to watch init complete.
I suggest that we drop both the node ready and the dummy deployment check altogether for 1.6 and move them to a validation phase for 1.7.
All correct, and while we're at it:
If you see this: kubelet: error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"
you have to edit your /etc/systemd/system/kubelet.service.d/10-kubeadm.conf, add the flag --cgroup-driver="systemd", and do as above:
kubeadm reset
systemctl daemon-reload
systemctl restart kubelet.service
kubeadm init
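One way to wire that flag in, as a sketch that follows the drop-in layout quoted earlier (the KUBELET_CGROUP_ARGS variable name is introduced here for illustration; it is not in the stock file):

```bash
# In /etc/systemd/system/kubelet.service.d/10-kubeadm.conf, add:
#   Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
# and append $KUBELET_CGROUP_ARGS to the ExecStart line, then:
systemctl daemon-reload
systemctl restart kubelet.service
```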
TL;DR:
The "cni config uninitialized" error message is NOT necessarily bad.
That error message tells you that you have to plug in a third-party implementation of the CNI spec.
What is CNI and how does it integrate with Kubernetes?
CNI stands for Container Network Interface and defines a specification that the kubelet uses for creating a network for the cluster. See this page for more information on how Kubernetes uses the CNI spec to create a network for the cluster.
Kubernetes doesn't care how the network is created as long as it satisfies the CNI spec. The kubelet is in charge of connecting new Pods to the network (which can be an overlay network, for instance). The kubelet reads a configuration directory (often /etc/cni/net.d) for CNI networks to use. When a new Pod is created, the kubelet reads the files in the configuration directory and exec's out to the CNI binary specified in the config file (the binary is often in /opt/cni/bin). The binary that will be executed belongs to, and is installed by, a third party (like Weave, Flannel, Calico, etc.).
kubeadm is a generic tool to spin up Kubernetes clusters; it does not know what networking solution you want and doesn't favor any specific provider. After kubeadm init is run, there is no such CNI binary or configuration, so kubeadm init alone IS NOT ENOUGH to get a fully working cluster up and running. This means that after kubeadm init, the kubelet logs will say that the CNI config is uninitialized, and that is very much expected. If this weren't the case, we would have favored a specific network provider.
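For a concrete picture of what a provider drops into that directory, here is a minimal hand-rolled example using the reference bridge plugin (the file name, bridge name, subnet, and CNI version are made up for illustration; real providers ship their own):

```bash
cat <<'EOF' > /etc/cni/net.d/10-mynet.conf
{
  "cniVersion": "0.3.0",
  "name": "mynet",
  "type": "bridge",
  "bridge": "cni0",
  "isGateway": true,
  "ipMasq": true,
  "ipam": {
    "type": "host-local",
    "subnet": "10.244.0.0/16"
  }
}
EOF
```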
So how do I "fix" this error? The next step in the kubeadm getting-started guide is "Installing a Pod network": kubectl apply a manifest from your preferred CNI network provider. Its DaemonSet will copy the needed CNI binaries out to /opt/cni/bin and the needed configuration to /etc/cni/net.d/. It will also run the actual daemon that sets up the network between the Nodes (by writing iptables rules, for instance).
After the CNI provider is installed, the kubelet will notice that "oh, I have some information about how to set up the network" and will use the third-party configuration and binaries. And when the network has been set up by the third-party provider (invoked by the kubelet), the Node will mark itself Ready.
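Concretely, with the Weave Net manifest mentioned elsewhere in this thread, that whole step is:

```bash
kubectl apply -f https://git.io/weave-kube-1.6
# Watch the node flip to Ready once the CNI provider is up:
kubectl get nodes -w
```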
How is this issue related to kubeadm?
Late in the v1.6 cycle, a PR was merged that changed the way the kubelet reported its Ready/NotReady status. In earlier releases, the kubelet had always reported Ready, regardless of whether the CNI network was set up or not. This was actually kind of wrong, and it was changed to respect the CNI network status: NotReady when CNI was uninitialized, Ready when initialized.
kubeadm in v1.6.0 wrongly waited for the master node to be in the Ready state before proceeding with the rest of the kubeadm init tasks. When the kubelet behavior changed to report NotReady while CNI was uninitialized, kubeadm would wait forever for the Node to become Ready.
THAT WAIT MISCONCEPTION ON THE KUBEADM SIDE IS WHAT THIS ISSUE IS ABOUT
However, we quickly fixed the regression in v1.6.1 and released it some days after v1.6.0.
Please read the retrospective for more information about this, and why v1.6.0 could be released with this flaw.
So, what do you do if you think you see this issue in kubeadm v1.6.1+?
Well, I really think you don't. This issue is about kubeadm init deadlocking, and no users or maintainers have seen that in v1.6.1+. What you WILL see, though, is the "runtime network not ready / cni config uninitialized" message after every kubeadm init in all versions above v1.6, but that IS NOT BAD.
Anyway, please open a new issue if you see something unexpected with kubeadm.
Please do not comment more on this issue. Instead open a new one.
@billmilligan So you only have to kubectl apply a CNI provider's manifest to get your cluster up and running, I think.
I'm pretty much summarizing what has been said above, but hopefully in a clearer and more detailed way. If you have questions about how CNI works, please refer to the normal support channels like StackOverflow, an issue, or Slack.
(Lastly, sorry for that much bold text, but I felt like it was needed to get people’s attention.)
@drajen No, this affected only v1.6.0. It's expected that the kubelet doesn't find a network, since you haven't installed any. For example, just run kubectl apply -f https://git.io/weave-kube-1.6 to install Weave Net, and those problems will go away. You can install Flannel, Calico, Canal, or whatever CNI network you'd like instead.
Can anybody tell me how to build a patched version of kubeadm for RHEL (RPM)?
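A sketch, not the official RPM pipeline: build just the kubeadm binary from the fixed branch and drop it over the RPM-installed one (the output path can vary by build setup):

```bash
git clone https://github.com/kubernetes/kubernetes.git
cd kubernetes
git checkout release-1.6
make WHAT=cmd/kubeadm
# output path may differ, e.g. _output/local/bin/linux/amd64/kubeadm
cp _output/local/bin/linux/amd64/kubeadm /usr/bin/kubeadm
```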
@coeki I'd also add a request for N-1 versions to be kept in the rpm/deb repos. All major releases eventually end up with a problem or two, and operators have long avoided N.0 releases in production for that very reason. That works well if previous versions are left around for a while, but this time 1.5.x was removed entirely before 1.6 was made stable. That blocks operators who weren't well prepared (local repo mirroring, etc.) from making forward progress while the issue is sorted out. The pain of a bumpy N+1 release can often be dealt with by simply keeping N around for a while.
@mikedanese Do you have any plan to update the CentOS yum repo? Or is it already deployed there?
Thanks to @luxas for wrestling my particular problem to the ground: https://github.com/kubernetes/kubeadm/issues/302
Yeah, it might be non-obvious and we're sorry for that, but we can't have a single provider's name in there either.
Chatted with @drajen on Slack and the issue was cgroup related, the kubelet was unhealthy and wasn’t able to create any Pods, hence the issue.
@bostone thanks. I'll downgrade to that version to see if I can get a working setup. On my system the latest is a weird 17.03.1.ce version (evidently the latest and greatest).
Ok, I did all the steps from scratch and it seems to be better. Here are the steps that worked for me thus far; I'm running as root on CentOS 7.
Add --cgroup-driver=systemd to 10-kubeadm.conf and save. At this point I can run kubectl get nodes and see my master node in the list. Repeat all the steps for each minion, except instead of kubeadm init run the kubeadm join --token a21234.c7abc2f82e2219fd 12.34.567.89:6443 command generated by kubeadm init. This step completes and I can see the master and minion nodes.
And now, the problem: the nodes never become Ready. Any suggestions?
@bostone maybe you're missing these steps after kubeadm init? You also need to follow step 3 described here. That seems related to the CNI config error you're getting.
@gtirloni with your suggestion I got to the end of kubeadm init, however any attempt to run kubectl produces this error: The connection to the server localhost:8080 was refused - did you specify the right host or port? I'm not sure where or how to change that, or what the right port is at this point.
@bostone you need to adjust the .spec here.
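The localhost:8080 message usually just means kubectl has no kubeconfig yet. A sketch of the standard post-init steps (paths are the kubeadm defaults of that era):

```bash
sudo cp /etc/kubernetes/admin.conf $HOME/
sudo chown $(id -u):$(id -g) $HOME/admin.conf
export KUBECONFIG=$HOME/admin.conf
kubectl get nodes
```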
@obnoxxx try the tip of the release-1.6 branch.
https://storage.googleapis.com/kubernetes-release-dev/ci/v1.6.1-beta.0.12+018a96913f57f9/bin/linux/amd64/kubeadm
PLEASE TEST THE PATCHED DEBS
The kubernetes-xenial-unstable channel now has a patched build, 1.6.1-beta.0.5+d8a384c1c5e35d-00, that @pipejakob and I have been testing today. The nodes remain NotReady until a pod network is created (e.g. by applying weave/flannel configs). Conformance tests pass. PTAL.
cc @luxas @jbeda
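To opt in to that channel, a sketch of the apt setup (repo layout as it was in 2017; the pinned version is the build quoted above):

```bash
cat <<'EOF' > /etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial-unstable main
EOF
apt-get update
apt-get install -y kubeadm=1.6.1-beta.0.5+d8a384c1c5e35d-00
```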
So, if I understand correctly: since https://github.com/kubernetes/features/issues/166 is taking longer to be able to taint network availability correctly, we have to go with a workaround for now. If we can push a fix ASAP for kubeadm, like #43835, with a comment that it will be properly fixed by https://github.com/kubernetes/features/issues/166, a lot of people are going to be happy.
I can’t believe nobody really tried kubeadm 1.6.0 before 1.6.0 was released…
And kubelet 1.5.6 + kubeadm 1.5.6 are also broken: /etc/systemd/system/kubelet.service.d/10-kubeadm.conf references /etc/kubernetes/pki/ca.crt, but kubeadm doesn't generate ca.crt; there is a ca.pem, though.
Currently 1.6.0 and 1.5.6 are the only releases left in the k8s apt repository…
It looks like DaemonSets will still get scheduled even if the node is not ready. In this case, kubeadm is really just being a little too paranoid.
The current plan that we are going to test out is to have kubeadm no longer wait for the master node to be ready, but instead just have it be registered. That should be good enough to let a CNI DaemonSet be scheduled to set up CNI. @kensimon is testing this out.