kubeadm: how to renew the certificate when apiserver cert expired?
Versions
kubeadm version (use kubeadm version): 1.7.5
Environment:
- Kubernetes version (use kubectl version): 1.7.5
- Cloud provider or hardware configuration:
- OS (e.g. from /etc/os-release):
- Kernel (e.g. uname -a):
- Others:
What happened?
What you expected to happen?
How to reproduce it (as minimally and precisely as possible)?
Anything else we need to know?
If you are using a version of kubeadm prior to 1.8 (where, as I understand it, certificate rotation #206 was put into place as a beta feature), or your certs have already expired, then you will need to manually update your certs (or recreate your cluster, which it appears some, not just @kachkaev, end up resorting to).
You will need to SSH into your master node. If you are using kubeadm >= 1.8 skip to 2.
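Roughly, the manual regeneration on a 1.8–1.12 era kubeadm looks like the sketch below. The phase names and flags are assumptions for that era (not a verbatim copy of anyone's steps), and the `<master-ip>` / `<master-node-name>` values are placeholders; check `kubeadm alpha phase --help` on your exact version before running anything.

```bash
# back up and move aside the expired certs so the phases regenerate them
mkdir -p /root/pki-backup
mv /etc/kubernetes/pki/apiserver.* /etc/kubernetes/pki/apiserver-kubelet-client.* /root/pki-backup/

# regenerate the API server serving cert and the apiserver->kubelet client cert
kubeadm alpha phase certs apiserver --apiserver-advertise-address <master-ip>
kubeadm alpha phase certs apiserver-kubelet-client

# regenerate the kubeconfig files (admin, controller-manager, scheduler, kubelet)
mkdir -p /root/conf-backup
mv /etc/kubernetes/*.conf /root/conf-backup/
kubeadm alpha phase kubeconfig all --apiserver-advertise-address <master-ip> --node-name <master-node-name>

# restart the kubelet so the static control-plane pods pick up the new certs,
# then install the fresh admin kubeconfig
systemctl restart kubelet
cp /etc/kubernetes/admin.conf ~/.kube/config
```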
There is an important note here. If you are on AWS, you will need to explicitly pass the `--node-name` parameter in this request. Otherwise you will get an error like `Unable to register node "ip-10-0-8-141.ec2.internal" with API server: nodes "ip-10-0-8-141.ec2.internal" is forbidden: node ip-10-0-8-141 cannot modify node ip-10-0-8-141.ec2.internal` in your logs (`sudo journalctl -u kubelet --all | tail`), and the Master Node will report that it is `Not Ready` when you run `kubectl get nodes`.

Please be certain to replace the values passed in `--apiserver-advertise-address` and `--node-name` with the correct values for your environment, and make sure `kubectl` is looking in the right place for your config files.

If you do not have a valid token, you can create one with:
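(A minimal sketch using kubeadm's standard token subcommand:)

```bash
# create a new bootstrap token
kubeadm token create
# list tokens to confirm it exists and see its TTL
kubeadm token list
```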
The token should look something like 6dihyb.d09sbgae8ph2atjw
Hopefully this gets you where you need to be @davidcomeyne.
I had to deal with this also on a 1.13 cluster; in my case the certificates were about to expire, so it was slightly different. I was also dealing with a single master/control instance on premise, so I did not have to worry about an HA setup or AWS specifics. I have not included the backup steps, as the other guys have included those above.

Since the certs had not expired yet, the cluster already had workloads which I wanted to keep working. I did not have to deal with etcd certs at this time either, so I have omitted those.
So at a high level, I had to do the following on the master first, and then on each worker:
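For the master side, that boils down to something like the sketch below (assuming a 1.13-era kubeadm where `kubeadm alpha certs renew all` and the `kubeadm init phase` commands are available, as other comments in this thread indicate; paths and the backup location are my own choices):

```bash
# back everything up first
cp -r /etc/kubernetes /etc/kubernetes.bak

# renew all certificates under /etc/kubernetes/pki
kubeadm alpha certs renew all

# regenerate the kubeconfig files that embed client certs
mkdir -p /root/conf-backup
mv /etc/kubernetes/admin.conf /etc/kubernetes/controller-manager.conf \
   /etc/kubernetes/kubelet.conf /etc/kubernetes/scheduler.conf /root/conf-backup/
kubeadm init phase kubeconfig all

# restart the kubelet so the control-plane static pods reload the renewed certs,
# and install the fresh admin kubeconfig
systemctl restart kubelet
cp /etc/kubernetes/admin.conf ~/.kube/config
```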
Let's create a new token for the nodes re-joining the cluster (after the kubelet restart):
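(Something like this; `--print-join-command` saves digging up the CA hash by hand:)

```bash
# creates a fresh bootstrap token and prints the full `kubeadm join ...` line for the workers
kubeadm token create --print-join-command
```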
Now, for each worker, one at a time:
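(Cordon and drain it first; the node name here is a placeholder:)

```bash
# evict workloads from the worker before touching its kubelet
kubectl drain worker-1 --ignore-daemonsets
```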
SSH to the worker node:
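One possible shape for the worker-side steps (a sketch, not necessarily the exact commands used here; on a previously joined node you may need `--ignore-preflight-errors` and should adapt paths to your setup):

```bash
systemctl stop kubelet
# move aside the kubelet's old client config and certs so it can re-bootstrap
mv /etc/kubernetes/kubelet.conf /etc/kubernetes/kubelet.conf.bak
mv /var/lib/kubelet/pki /var/lib/kubelet/pki.bak
# re-join the cluster with the token created on the master
kubeadm join <master-ip>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --ignore-preflight-errors=all
```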
Back on the master, uncordon the worker:
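(Node name again a placeholder:)

```bash
# allow workloads to schedule on the worker again
kubectl uncordon worker-1
```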
After all workers have been updated, remove the token. It will expire in 24h anyway, but let's get rid of it:
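(The token value below is a placeholder; `kubeadm token list` shows the real one:)

```bash
kubeadm token delete abcdef.0123456789abcdef
```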
Thanks @danroliver for putting together the steps. I had to make small additions to your steps. My cluster is running v1.9.3 and it is in a private datacenter, off the Internet.
On the Master
I had to create a `config.yml` and pass it with `--config config.yml` (see the sketch below for what such a file looks like).

On the minions

I had to move
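For reference, a v1.9-era `config.yml` passed via `--config` would look roughly like this; the API version and field names are assumed from the old `v1alpha1` MasterConfiguration schema, and the address and version values are placeholders, not taken from the comment above:

```bash
# write a minimal kubeadm MasterConfiguration for a v1.9.x control plane
cat > config.yml <<'EOF'
apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
api:
  advertiseAddress: 10.0.0.10
kubernetesVersion: v1.9.3
EOF
```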
For Kubernetes v1.14 I find this procedure proposed by @desdic the most helpful:
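The general shape of that kind of v1.14 renewal is sketched below (not a verbatim copy of @desdic's commands; back up `/etc/kubernetes` first, and the advertise address is a placeholder):

```bash
# move aside the expired certs so the phase regenerates them
cd /etc/kubernetes/pki
mkdir -p /root/pki-backup
mv apiserver.crt apiserver.key apiserver-kubelet-client.crt apiserver-kubelet-client.key \
   front-proxy-client.crt front-proxy-client.key /root/pki-backup/
kubeadm init phase certs all --apiserver-advertise-address <master-ip>

# regenerate the kubeconfig files
cd /etc/kubernetes
mkdir -p /root/conf-backup
mv admin.conf controller-manager.conf kubelet.conf scheduler.conf /root/conf-backup/
kubeadm init phase kubeconfig all

# restart the kubelet so the control plane picks up the renewed certs
systemctl restart kubelet
```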
Finish by copying the regenerated `admin.conf` into place as your kubeconfig.

Just a comment and feature request: this cert expiration hit us in production on our Kubernetes 1.11.x cluster this morning. We tried everything above (and the links), but hit numerous errors, and gave up after a few hours, completely stuck with a large hosed cluster. Fortunately, we were about 2 weeks away from upgrading to Kubernetes 1.15 (and building a new cluster), so we ended up just creating a new 1.15 cluster from scratch and copying over all our user data.
I very much wish there had been some warning before this happened. We just went from “incredibly stable cluster” to “completely broken hellish nightmare” without any warning, and had probably our worst downtime ever. Fortunately, it was a west coast Friday afternoon, so relatively minimally impactful.
Of everything discussed above and in all the linked tickets, the one thing that would have made a massive difference for us isn’t mentioned: start displaying a warning when certs are going to expire soon. (E.g., if you use kubectl, and the cert is going to expire within a few weeks, please tell me!).
in 1.15 we have added better documentation for certificate renewal: https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/
also, after 1.15 `kubeadm upgrade` will automatically renew the certificates for you!

versions older than 1.13 are already unsupported. we strongly encourage the users to keep up with this fast moving project.

currently there are discussions going on in the Long Term Support Working Group about having versions of kubernetes supported for longer periods of time, but establishing the process might take a while.
For anyone who stumbles upon this in the future and is running a newer version of Kubernetes (>1.17), this is probably the simplest way to renew your certs.
The following renews all certs, restarts kubelet, takes a backup of the old admin config and applies the new admin config:
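A sketch of that sequence (the `certs` command group is top-level on recent kubeadm; on 1.17/1.18 it still lives under `kubeadm alpha certs`):

```bash
# renew every certificate kubeadm manages under /etc/kubernetes/pki
kubeadm certs renew all
# restart the kubelet (the control-plane containers may also need a restart to reload certs)
systemctl restart kubelet
# back up the old admin kubeconfig and install the renewed one
cp ~/.kube/config ~/.kube/config.bak
cp /etc/kubernetes/admin.conf ~/.kube/config
```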
@zalmanzhao did you manage to solve this issue?
I created a kubeadm `v1.9.3` cluster just over a year ago and it was working fine all this time. I went to update one deployment today and realised I was locked out of the API because the cert got expired. I can't even `kubeadm alpha phase certs apiserver`, because I get `failure loading apiserver certificate: the certificate has expired` (the kubeadm version is currently `1.10.6` since I want to upgrade).

Adding `insecure-skip-tls-verify: true` to `~/.kube/config` → `clusters[0].cluster` does not help either – I see `You must be logged in to the server (Unauthorized)` when trying to `kubectl get pods` (https://github.com/kubernetes/kubernetes/issues/39767).

The cluster is working, but it lives its own life until it self-destroys or until things get fixed 😅 Unfortunately, I could not find a solution for my situation in #206 and am wondering how to get out of it. The only relevant material I could dig out was a blog post called ‘How to change expired certificates in kubernetes cluster’, which looked promising at first glance. However, it did not fit in the end because my master machine did not have an `/etc/kubernetes/ssl/` folder (only `/etc/kubernetes/pki/`) – either I have a different k8s version or I simply deleted that folder without noticing.

@errordeveloper could you please recommend something? I'd love to fix things without `kubeadm reset` and payload recreation.

Thanks @kachkaev for responding. I will nonetheless give it another try. If I find something I will make sure to post it here…
For k8s 1.15 ~ 1.18, this may be helpful: https://zhuanlan.zhihu.com/p/382605009
For newer versions, use:
kubeadm alpha certs renew all
A k8s cluster created using kubeadm v1.9.x experienced the same issue (`apiserver-kubelet-client.crt` expired on 2 July) at the age of v1.14.1, lol.

I had to refer to 4 different sources to renew the certificates, regenerate the configuration files and bring the simple 3-node cluster back.

@danroliver gave very good and structured instructions, very close to the guide below from IBM. WoW! [Renewing Kubernetes cluster certificates](https://www.ibm.com/support/knowledgecenter/en/SSCKRH_1.1.0/platform/t_certificate_renewal.html)
Problem with step 3 and step 5
Step 3 should NOT have `phase` in the command.
Step 5 should be using the following instead:
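(Presumably along these lines on 1.14:)

```bash
# kubeconfig regeneration is an init phase on 1.14, not a `kubeadm alpha` subcommand
kubeadm init phase kubeconfig all
```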
`kubeadm alpha` does not have `kubeconfig all`; that is a `kubeadm init phase` instead.

This is what I need only for 1.14.2 … any hints on how to
I know this issue is closed, but I have the same problem on 1.14.2. The guide gives no errors, but I cannot connect to the cluster and reissue the token (I get auth failed).
@danroliver: Thank you very much, it’s working.
It's not necessary to reboot the servers. It's enough to recreate the kube-system pods (apiserver, scheduler, …) with these two commands:

systemctl restart kubelet
for i in $(docker ps | egrep 'admin|controller|scheduler|api|fron|proxy' | rev | awk '{print $1}' | rev); do docker stop $i; done
Sorry for your troubles. Normally it is the responsibility of the operator to monitor the certs on disk for expiration, but I do agree that the lack of easy monitoring can cause trouble. That is one of the reasons we added a command to check cert expiration in kubeadm. See https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/

Also please note that after 1.15 kubeadm will auto-renew certificates on upgrade, which encourages users to upgrade more often too.
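For reference, the expiration check mentioned above (top-level on recent kubeadm; 1.15–1.19 use the `kubeadm alpha certs` prefix):

```bash
# show the expiry date of every kubeadm-managed certificate
kubeadm certs check-expiration
```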
@danroliver Thanks! Just tried it on an old single-node cluster, so did steps up to 7. It worked.
For this case, you may still need to SSH into each of the 3 master nodes and update the certificates by running the commands on each one, because every master node has its own API server.
simplest way to update your k8s certs
@anapsix I'm running a 1.13.x cluster, and apiserver is reporting `Unable to authenticate the request due to an error: [x509: certificate has expired or is not yet valid, x509: certificate has expired or is not yet valid]` after I renewed the certs by running `kubeadm alpha certs renew all`.

Which token are you referring to in this case? Is it the one generated by kubeadm, and how can I delete the token?
-----UPDATE----- I figured out it’s the secret itself. In my case the kube-controller was not up so the secret was not auto-generated.
Note about tokens in K8s 1.13.x (possibly other K8s versions): if you've ended up re-generating your CA certificate (`/etc/kubernetes/pki/ca.crt`), your tokens (see `kubectl -n kube-system get secret | grep token`) might have the old CA and will have to be regenerated. Troubled tokens included `kube-proxy-token` and `coredns-token` in my case (and others), which caused cluster-critical services to be unable to authenticate with the K8s API. To regenerate tokens, delete the old ones and they will be recreated. The same goes for any services talking to the K8s API, such as the PV provisioner, ingress controllers, `cert-manager`, etc.

Thank you @danroliver, it works for me and my kubeadm version is 1.8.5.
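To make @anapsix's token-regeneration step concrete, a sketch (the secret name varies per cluster, and the label selector for kube-proxy is an assumption about the default kubeadm deployment):

```bash
# find service-account token secrets that still embed the old CA
kubectl -n kube-system get secret | grep token
# delete a stale one; the token controller recreates it against the new CA
kubectl -n kube-system delete secret kube-proxy-token-xxxxx
# restart the consuming pods so they mount the regenerated token
kubectl -n kube-system delete pod -l k8s-app=kube-proxy
```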
@danroliver Worked for me. Thank you.
Thanks a bunch @danroliver ! I will definitely try that and post my findings here.