etcd: ETCD with TLS showing warning "transport: authentication handshake failed: remote error: tls: bad certificate"
I followed these two articles:
https://github.com/coreos/etcd/blob/master/Documentation/op-guide/security.md
https://github.com/coreos/docs/blob/master/os/generate-self-signed-certificates.md
Initialize a certificate authority
$ cat ca-config.json
{
    "signing": {
        "default": {
            "expiry": "8760h"
        },
        "profiles": {
            "server": {
                "expiry": "8760h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth"
                ]
            },
            "client": {
                "expiry": "8760h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "client auth"
                ]
            },
            "peer": {
                "expiry": "8760h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            }
        }
    }
}
$ cat ca-csr.json
{
    "CN": "My own CA",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "US",
            "L": "CA",
            "O": "My Company Name",
            "ST": "San Francisco",
            "OU": "Org Unit 1",
            "OU": "Org Unit 2"
        }
    ]
}
$ cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
Generate server certificate
# cfssl print-defaults csr > server.json
$ cat server.json
{
    "CN": "etcd1",
    "hosts": [
        "192.168.1.221"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "US",
            "L": "CA",
            "ST": "San Francisco"
        }
    ]
}
$ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server.json | cfssljson -bare server
Etcd Server
etcd --name infra0 --data-dir infra0 \
--client-cert-auth --trusted-ca-file=ca.pem --cert-file=server.pem --key-file=server-key.pem \
--advertise-client-urls https://127.0.0.1:2379 --listen-client-urls https://127.0.0.1:2379
2018-05-29 11:17:10.374455 I | etcdmain: etcd Version: 3.3.5
2018-05-29 11:17:10.374527 I | etcdmain: Git SHA: 70c872620
2018-05-29 11:17:10.374534 I | etcdmain: Go Version: go1.9.6
2018-05-29 11:17:10.374540 I | etcdmain: Go OS/Arch: linux/amd64
2018-05-29 11:17:10.374546 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
2018-05-29 11:17:10.374859 I | embed: listening for peers on http://localhost:2380
2018-05-29 11:17:10.374899 I | embed: listening for client requests on 127.0.0.1:2379
2018-05-29 11:17:10.377043 I | etcdserver: name = infra0
2018-05-29 11:17:10.377067 I | etcdserver: data dir = infra0
2018-05-29 11:17:10.377074 I | etcdserver: member dir = infra0/member
2018-05-29 11:17:10.377079 I | etcdserver: heartbeat = 100ms
2018-05-29 11:17:10.377087 I | etcdserver: election = 1000ms
2018-05-29 11:17:10.377092 I | etcdserver: snapshot count = 100000
2018-05-29 11:17:10.377125 I | etcdserver: advertise client URLs = https://127.0.0.1:2379
2018-05-29 11:17:10.377133 I | etcdserver: initial advertise peer URLs = http://localhost:2380
2018-05-29 11:17:10.377143 I | etcdserver: initial cluster = infra0=http://localhost:2380
2018-05-29 11:17:10.379279 I | etcdserver: starting member 8e9e05c52164694d in cluster cdf818194e3a8c32
2018-05-29 11:17:10.379320 I | raft: 8e9e05c52164694d became follower at term 0
2018-05-29 11:17:10.379337 I | raft: newRaft 8e9e05c52164694d [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
2018-05-29 11:17:10.379344 I | raft: 8e9e05c52164694d became follower at term 1
2018-05-29 11:17:10.385248 W | auth: simple token is not cryptographically signed
2018-05-29 11:17:10.388175 I | etcdserver: starting server... [version: 3.3.5, cluster version: to_be_decided]
2018-05-29 11:17:10.388842 I | etcdserver: 8e9e05c52164694d as single-node; fast-forwarding 9 ticks (election ticks 10)
2018-05-29 11:17:10.389395 I | etcdserver/membership: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32
2018-05-29 11:17:10.392890 I | embed: ClientTLS: cert = server.pem, key = server-key.pem, ca = , trusted-ca = ca.pem, client-cert-auth = true, crl-file =
2018-05-29 11:17:10.479773 I | raft: 8e9e05c52164694d is starting a new election at term 1
2018-05-29 11:17:10.479819 I | raft: 8e9e05c52164694d became candidate at term 2
2018-05-29 11:17:10.479887 I | raft: 8e9e05c52164694d received MsgVoteResp from 8e9e05c52164694d at term 2
2018-05-29 11:17:10.479906 I | raft: 8e9e05c52164694d became leader at term 2
2018-05-29 11:17:10.479915 I | raft: raft.node: 8e9e05c52164694d elected leader 8e9e05c52164694d at term 2
2018-05-29 11:17:10.480540 I | etcdserver: published {Name:infra0 ClientURLs:[https://127.0.0.1:2379]} to cluster cdf818194e3a8c32
2018-05-29 11:17:10.480670 E | etcdmain: forgot to set Type=notify in systemd service file?
2018-05-29 11:17:10.480694 I | embed: ready to serve client requests
2018-05-29 11:17:10.480718 I | etcdserver: setting up the initial cluster version to 3.3
2018-05-29 11:17:10.481430 N | etcdserver/membership: set the initial cluster version to 3.3
2018-05-29 11:17:10.481638 I | etcdserver/api: enabled capabilities for version 3.3
2018-05-29 11:17:10.532133 I | embed: serving client requests on 127.0.0.1:2379
2018-05-29 11:17:10.539294 I | embed: rejected connection from "127.0.0.1:39794" (error "tls: failed to verify client's certificate: x509: certificate specifies an incompatible key usage", ServerName "")
WARNING: 2018/05/29 11:17:10 Failed to dial 127.0.0.1:2379: connection error: desc = "transport: authentication handshake failed: remote error: tls: bad certificate"; please retry.
About this issue
- State: closed
- Created 6 years ago
- Reactions: 8
- Comments: 23 (6 by maintainers)
Commits related to this issue
- pkg/certsigner/signer: Add "client" usage to server profile Avoid issues like [1]: WARNING: 2018/05/29 11:17:10 Failed to dial 127.0.0.1:2379: connection error: desc = "transport: authentication h... — committed to wking/kubecsr by wking 6 years ago
- pkg/certsigner/signer: Add "client" usage to server profile Avoid issues like [1]: WARNING: 2018/05/29 11:17:10 Failed to dial 127.0.0.1:2379: connection error: desc = "transport: authentication h... — committed to alaypatel07/kubecsr by wking 6 years ago
- pkg/certsigner/signer: Add "client" usage to server profile Avoid issues like [1]: WARNING: 2018/05/29 11:17:10 Failed to dial 127.0.0.1:2379: connection error: desc = "transport: authentication h... — committed to openshift/kubecsr by wking 6 years ago
TL;DR: How to fix the issue:
ca-config.json: add “client auth” to the “server” section
Regenerate the cert
Check server certificate: (I copied it to /etc/etcd/server.pem)
Environment vars:
Run etcd
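The first two steps of the TL;DR can be sketched roughly as follows. This is only an illustration, assuming the file names from the original post (ca.pem, ca-key.pem, server.json) sit in the current directory; the guard around cfssl and the final openssl inspection are additions for the sketch, not something the commenter posted.

```shell
# Rewrite ca-config.json so that the "server" profile also carries "client auth".
# The "client" and "peer" profiles are unchanged from the original post.
cat > ca-config.json <<'EOF'
{
  "signing": {
    "default": { "expiry": "8760h" },
    "profiles": {
      "server": {
        "expiry": "8760h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ]
      },
      "client": {
        "expiry": "8760h",
        "usages": [
          "signing",
          "key encipherment",
          "client auth"
        ]
      },
      "peer": {
        "expiry": "8760h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ]
      }
    }
  }
}
EOF

# Regenerate the server certificate, but only if cfssl and the CA files exist
# (assumption: they are in the current directory, as in the original post).
if command -v cfssl >/dev/null 2>&1 && command -v cfssljson >/dev/null 2>&1 \
   && [ -f ca.pem ] && [ -f ca-key.pem ] && [ -f server.json ]; then
  cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json \
    -profile=server server.json | cfssljson -bare server
  # Print the Extended Key Usage extension of the new certificate.
  openssl x509 -in server.pem -noout -text | grep -A1 'Extended Key Usage'
else
  echo "cfssl or CA files not found; skipping certificate regeneration" >&2
fi
```

After regeneration, the Extended Key Usage line should list both server and client authentication. An alternative that avoids editing the CA config might be to sign the server certificate with the existing peer profile, which already lists both usages.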
I found this issue as I was troubleshooting issues that arose during an etcd upgrade from 3.1.x to 3.2.x using kubeadm. After some debugging I was able to determine that the new (as of etcd 3.2.x) client usage requirement of the serving certificate is due to the use of the server certificate as a client certificate for the grpc gateway.
This requirement doesn’t appear to be documented in any of the places I would expect, such as:
https://coreos.com/os/docs/latest/generate-self-signed-certificates.html
https://coreos.com/etcd/docs/latest/op-guide/security.html
https://coreos.com/etcd/docs/latest/dev-guide/api_grpc_gateway.html
https://coreos.com/etcd/docs/latest/op-guide/configuration.html
https://coreos.com/etcd/docs/latest/upgrades/upgrade_3_2.html
Ideally, I would expect there to be a configuration option to specify a separate client cert for the grpc gateway (and tangentially also be able to specify separate client/server certs for the peer certificates as well).
It seems I made a mistake when addressing etcd with etcdctl from within the pod. I ran kubectl exec ectd-cluster-0 sh and then ran etcdctl without --cert, --key and --cacert. I thought they weren’t needed when running from within the pod, but I guess they are. So it’s working now.
@KIVagant How do you achieve this fix with openssl only? (Not using cfssl)
edit
Figured it out.
Use the documentation from Kubernetes here: https://kubernetes.io/docs/concepts/cluster-administration/certificates/
You want to use the v3_ext config at the bottom when you are signing your CSR with your CA. Note that this is part of the x509 command, not the req command.
@hexfusion I agree. My confusion is why etcd server needs client auth.
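For the openssl-only route described above (signing the CSR with a v3_ext section), a minimal sketch might look like this. The tls-demo directory, the throwaway CA, the subject names and the 127.0.0.1 SAN are all assumptions made for illustration; the v3_ext section here is a trimmed variant, not a copy of the one in the Kubernetes documentation.

```shell
# Work in a scratch directory (hypothetical name) so nothing existing is overwritten.
mkdir -p tls-demo

# 1. Throwaway self-signed CA, for demonstration only.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -subj "/CN=My own CA" \
  -keyout tls-demo/ca-key.pem -out tls-demo/ca.pem

# 2. Server key and CSR. This is the req command; extensions are not set here.
openssl req -new -newkey rsa:2048 -nodes \
  -subj "/CN=etcd1" \
  -keyout tls-demo/server-key.pem -out tls-demo/server.csr

# 3. Extension section applied at signing time. The extendedKeyUsage line is
#    the part that matters: it lists both serverAuth and clientAuth.
cat > tls-demo/csr.conf <<'EOF'
[ v3_ext ]
basicConstraints = CA:FALSE
keyUsage = digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth, clientAuth
subjectAltName = IP:127.0.0.1
EOF

# 4. Sign the CSR with the CA. -extensions/-extfile belong to the x509 command.
openssl x509 -req -in tls-demo/server.csr \
  -CA tls-demo/ca.pem -CAkey tls-demo/ca-key.pem -CAcreateserial \
  -days 365 -extensions v3_ext -extfile tls-demo/csr.conf \
  -out tls-demo/server.pem

# 5. Print the Extended Key Usage extension of the signed certificate.
openssl x509 -in tls-demo/server.pem -noout -text | grep -A1 'Extended Key Usage'
```

If the extension section is left out of step 4, the signed certificate carries no extended key usage at all, so it is worth inspecting the result before pointing etcd at it.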
I ran into this as well. Adding client usage fixed it. I agree that there should be an option for a separate client cert instead of hijacking the server certificate for this purpose!
@mindcrime First of all, check whether the cluster works. I believe there is a big difference between a working cluster where something external tries to connect to the port and a cluster whose nodes really can’t join. In my case all nodes operate normally and I can get member info and put messages.
When I set the --client-cert-auth parameter to false, the warning was gone. So I guess the etcd process does a health check as a client.