rke: Failed to get /health for host - remote error: tls: bad certificate
Getting `Failed to get /health for host - remote error: tls: bad certificate` when trying to upgrade an existing cluster. No modifications to the certificates have been made.
RKE version:
rke version v0.2.1
Docker version:
Client:
 Version: 18.06.3-ce
 API version: 1.38
 Go version: go1.10.3
 Git commit: d7080c1
 Built: Wed Feb 20 02:27:18 2019
 OS/Arch: linux/amd64
 Experimental: false
Server:
 Engine:
  Version: 18.06.3-ce
  API version: 1.38 (minimum version 1.12)
  Go version: go1.10.3
  Git commit: d7080c1
  Built: Wed Feb 20 02:26:20 2019
  OS/Arch: linux/amd64
  Experimental: false
Operating system and kernel:
Ubuntu 16.04.4 LTS (Xenial Xerus), kernel 4.4.0-116-generic
Type/provider of hosts: ESXi Virtual Machine
cluster.yml file:
nodes:
  - address: 10.10.7.121
    user: daniel
    role: [controlplane,worker,etcd]
  - address: 10.10.7.122
    user: daniel
    role: [controlplane,worker,etcd]
  - address: 10.10.7.123
    user: daniel
    role: [controlplane,worker,etcd]
services:
  etcd:
    snapshot: true
    creation: 6h
    retention: 24h
Steps to Reproduce:
./rke -d up
Results:
```
...
DEBU[0028] [remove/rke-log-linker] Container doesn't exist on host [10.10.7.123]
DEBU[0028] [etcd] Checking image [rancher/rke-tools:v0.1.27] on host [10.10.7.123]
DEBU[0028] Checking if image [rancher/rke-tools:v0.1.27] exists on host [10.10.7.123]
DEBU[0028] Image [rancher/rke-tools:v0.1.27] exists on host [10.10.7.123]
DEBU[0028] [etcd] No pull necessary, image [rancher/rke-tools:v0.1.27] exists on host [10.10.7.123]
INFO[0029] [etcd] Successfully started [rke-log-linker] container on host [10.10.7.123]
DEBU[0029] [remove/rke-log-linker] Checking if container is running on host [10.10.7.123]
DEBU[0029] [remove/rke-log-linker] Removing container on host [10.10.7.123]
INFO[0029] [remove/rke-log-linker] Successfully removed container on host [10.10.7.123]
DEBU[0029] [etcd] Successfully created log link for Container [etcd] on host [10.10.7.123]
INFO[0029] [etcd] Successfully started etcd plane.. Checking etcd cluster health
DEBU[0029] [etcd] Check etcd cluster health
DEBU[0029] Failed to get /health for host [10.10.7.121]: Get https://10.10.7.121:2379/health: remote error: tls: bad certificate
DEBU[0034] Failed to get /health for host [10.10.7.121]: Get https://10.10.7.121:2379/health: remote error: tls: bad certificate
DEBU[0039] Failed to get /health for host [10.10.7.121]: Get https://10.10.7.121:2379/health: remote error: tls: bad certificate
DEBU[0044] [etcd] Check etcd cluster health
DEBU[0045] Failed to get /health for host [10.10.7.122]: Get https://10.10.7.122:2379/health: remote error: tls: bad certificate
DEBU[0050] Failed to get /health for host [10.10.7.122]: Get https://10.10.7.122:2379/health: remote error: tls: bad certificate
DEBU[0055] Failed to get /health for host [10.10.7.122]: Get https://10.10.7.122:2379/health: remote error: tls: bad certificate
DEBU[0060] [etcd] Check etcd cluster health
DEBU[0060] Failed to get /health for host [10.10.7.123]: Get https://10.10.7.123:2379/health: remote error: tls: bad certificate
DEBU[0065] Failed to get /health for host [10.10.7.123]: Get https://10.10.7.123:2379/health: remote error: tls: bad certificate
DEBU[0070] Failed to get /health for host [10.10.7.123]: Get https://10.10.7.123:2379/health: remote error: tls: bad certificate
FATA[0075] [etcd] Failed to bring up Etcd Plane: [etcd] Etcd Cluster is not healthy
```
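To see at a glance which hosts fail the etcd health check, and with which error, the debug output can be tallied per host. The following is a minimal Python sketch and not part of RKE: the function name `summarize_health_failures` and the regular expression are assumptions based only on the `Failed to get /health for host [...]` lines shown in this issue, so other RKE versions may need a different pattern.

```python
import re
from collections import Counter

# Matches debug lines of the form seen in this issue, for example:
#   DEBU[0029] Failed to get /health for host [10.10.7.121]: Get https://10.10.7.121:2379/health: remote error: tls: bad certificate
# The pattern is an assumption derived from that output, not an RKE-defined format.
HEALTH_FAILURE = re.compile(
    r"Failed to get /health for host \[(?P<host>[^\]]+)\]: "
    r"Get \S+/health: (?P<error>.+)$"
)


def summarize_health_failures(log_text):
    """Count failed /health probes per (host, error) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = HEALTH_FAILURE.search(line)
        if match:
            counts[(match["host"], match["error"].strip())] += 1
    return counts
```

If every etcd host reports the same `tls: bad certificate` error, as in the output above, that suggests the certificates the client presents do not match what the running etcd members expect, rather than a problem on one node.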
About this issue
- State: closed
- Created 5 years ago
- Comments: 28 (6 by maintainers)
Per conversation with @Oats87, I have now reproduced a cause for this error when attempting to upgrade a cluster with RKE v0.2.0 or v0.2.1.

If the `kube_config_<file>.yml` file is absent from the local directory when you run `rke up`, RKE treats the cluster as new rather than as a legacy cluster. This results in the fatal error `[etcd] Failed to bring up Etcd Plane: [etcd] Etcd Cluster is not healthy`, with debug messages of the form `Failed to get /health for host [10.10.7.123]: Get https://10.10.7.123:2379/health: remote error: tls: bad certificate`.

Reproducer
- Run `rke up` using RKE v0.1.7
- Remove the `kube_config_<file>.yml` file
- Run `rke -d up` using RKE v0.2.0 or v0.2.1
- Observe the `[etcd] Failed to bring up Etcd Plane: [etcd] Etcd Cluster is not healthy` error with `/health: remote error: tls: bad certificate` messages.

Workaround
Upon encountering this issue as a result of the missing `kube_config_<file>.yml` during upgrade, the following workaround can be used:

Getting the same issue on CentOS Linux release 8.3.2011, Docker 19.03.5, rke version v1.2.5.
WARN[0212] [etcd] host [rke1.###.net] failed to check etcd health: failed to get /health for host [rke1.###.###]: Get "https://rke1.###.net:2379/health": remote error: tls: bad certificate
WARN[0306] [etcd] host [rke2.###.net] failed to check etcd health: failed to get /health for host [rke2.###.###]: Get "https://rke2.###.net:2379/health": remote error: tls: bad certificate
WARN[0399] [etcd] host [rke3.###.net] failed to check etcd health: failed to get /health for host [rke3.###.###]: Get "https://rke3.###.net:2379/health": remote error: tls: bad certificate
FATA[0399] [etcd] Failed to bring up Etcd Plane: etcd cluster is unhealthy: hosts [rke1.###.net,rke2.###.net,rke3.###.net] failed to report healthy. Check etcd container logs on each host for more information
I tried the workaround mentioned by @axeal, but it didn’t help in my case…
I had the same issue using version v1.0.4, but my problem was solved by @axeal’s answer. However, after deleting my `.rkestate` I got this error and had to recreate it. This script might be handy if someone needs to recreate it.

@axeal the workaround is missing the additional step of “Remove your `kube_config_<file>.yml` file” at the beginning, so that when you run `rke up` with `0.1.x`, RKE re-generates a valid `kube_config_<file>.yml`.
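Since the cause reproduced in this thread is a `kube_config_<file>.yml` that is missing from the local directory, a small pre-flight check before `rke up` can catch it early. The following is a minimal Python sketch, not part of RKE. The helper names `kubeconfig_for` and `preflight` are hypothetical, and the sketch assumes the kubeconfig sits next to the cluster config and is named by prefixing the config file name with `kube_config_` (for example `cluster.yml` and `kube_config_cluster.yml`).

```python
from pathlib import Path


def kubeconfig_for(cluster_config):
    """Return the kubeconfig path assumed to sit next to the cluster config.

    Assumption: the name follows the kube_config_<file>.yml pattern from this
    thread, e.g. cluster.yml -> kube_config_cluster.yml in the same directory.
    """
    config = Path(cluster_config)
    return config.with_name("kube_config_" + config.name)


def preflight(cluster_config):
    """Return a list of warnings; an empty list means nothing was flagged."""
    config = Path(cluster_config)
    warnings = []
    if not config.is_file():
        warnings.append(f"cluster config not found: {config}")
    kubeconfig = kubeconfig_for(config)
    if not kubeconfig.is_file():
        warnings.append(
            f"{kubeconfig} is missing; per this thread, RKE v0.2.0/v0.2.1 "
            "may then treat an existing cluster as new"
        )
    return warnings
```

Calling `preflight("cluster.yml")` from the directory where `rke up` is run and stopping when it returns any warnings avoids starting an upgrade without the kubeconfig in place. The check only looks for the file; it does not verify that its contents are valid.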