etcd: Unable to launch v3.1.0-alpha.1 with full client cert auth on OS X El Capitan

Using the latest etcd-v3.1.0-alpha.1 binaries on OS X El Capitan, I am unable to launch etcd with full client cert authentication, using the same cert setup (full client and server certs) that worked from 2.x through 3.0.x.

When I applied the patch for X onto v3.1.0-alpha.0, I was able to launch with the following config. The certs I am using have both DNS and IP sections for local IPs, so my initial suspicion was that something changed around SNI name selection or the gRPC host -> TLS config mapping, or in how etcd expects to receive client certs.

$ curl -L https://github.com/coreos/etcd/releases/download/v3.1.0-alpha.1/etcd-v3.1.0-alpha.1-darwin-amd64.zip -o etcd-v3.1.0-alpha.1-darwin-amd64.zip
$ unzip etcd-v3.1.0-alpha.1-darwin-amd64.zip
$ etcd-v3.1.0-alpha.1-darwin-amd64/etcd --listen-peer-urls=https://0.0.0.0:7001 --listen-client-urls=https://0.0.0.0:4001  --advertise-client-urls=https://192.168.1.103:4001 --cert-file openshift.local.config/master/etcd.server.crt --key-file openshift.local.config/master/etcd.server.key --peer-cert-file openshift.local.config/master/etcd.server.crt  --peer-key-file openshift.local.config/master/etcd.server.key --initial-advertise-peer-urls https://192.168.1.103:7001 --initial-cluster=default=https://192.168.1.103:7001 --peer-client-cert-auth --client-cert-auth
2016-10-01 14:58:14.133430 I | etcdmain: etcd Version: 3.1.0-alpha.1
2016-10-01 14:58:14.133519 I | etcdmain: Git SHA: 2469a95
2016-10-01 14:58:14.133522 I | etcdmain: Go Version: go1.7.1
2016-10-01 14:58:14.133528 I | etcdmain: Go OS/Arch: darwin/amd64
2016-10-01 14:58:14.133531 I | etcdmain: setting maximum number of CPUs to 8, total number of available CPUs is 8
2016-10-01 14:58:14.133539 W | etcdmain: no data-dir provided, using default data-dir ./default.etcd
2016-10-01 14:58:14.133569 I | embed: peerTLS: cert = openshift.local.config/master/etcd.server.crt, key = openshift.local.config/master/etcd.server.key, ca = , trusted-ca = , client-cert-auth = true
2016-10-01 14:58:14.134151 I | embed: listening for peers on https://0.0.0.0:7001
2016-10-01 14:58:14.134189 I | embed: listening for client requests on 0.0.0.0:4001
2016-10-01 14:58:14.136236 I | etcdserver: name = default
2016-10-01 14:58:14.136246 I | etcdserver: data dir = default.etcd
2016-10-01 14:58:14.136250 I | etcdserver: member dir = default.etcd/member
2016-10-01 14:58:14.136253 I | etcdserver: heartbeat = 100ms
2016-10-01 14:58:14.136256 I | etcdserver: election = 1000ms
2016-10-01 14:58:14.136259 I | etcdserver: snapshot count = 10000
2016-10-01 14:58:14.136265 I | etcdserver: advertise client URLs = https://192.168.1.103:4001
2016-10-01 14:58:14.136269 I | etcdserver: initial advertise peer URLs = https://192.168.1.103:7001
2016-10-01 14:58:14.136274 I | etcdserver: initial cluster = default=https://192.168.1.103:7001
2016-10-01 14:58:14.241248 I | etcdserver: starting member 3092679e8c56a1a5 in cluster e989df3141e943e1
2016-10-01 14:58:14.241294 I | raft: 3092679e8c56a1a5 became follower at term 0
2016-10-01 14:58:14.241314 I | raft: newRaft 3092679e8c56a1a5 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
2016-10-01 14:58:14.241324 I | raft: 3092679e8c56a1a5 became follower at term 1
2016-10-01 14:58:14.244579 I | etcdserver: starting server... [version: 3.1.0-alpha.1, cluster version: to_be_decided]
2016-10-01 14:58:14.244607 I | embed: ClientTLS: cert = openshift.local.config/master/etcd.server.crt, key = openshift.local.config/master/etcd.server.key, ca = , trusted-ca = , client-cert-auth = true
2016-10-01 14:58:14.244889 E | etcdserver: cannot monitor file descriptor usage (cannot get FDUsage on darwin)
2016-10-01 14:58:14.245190 I | membership: added member 3092679e8c56a1a5 [https://192.168.1.103:7001] to cluster e989df3141e943e1
2016-10-01 14:58:14.643319 I | raft: 3092679e8c56a1a5 is starting a new election at term 1
2016-10-01 14:58:14.643470 I | raft: 3092679e8c56a1a5 became candidate at term 2
2016-10-01 14:58:14.643482 I | raft: 3092679e8c56a1a5 received vote from 3092679e8c56a1a5 at term 2
2016-10-01 14:58:14.643498 I | raft: 3092679e8c56a1a5 became leader at term 2
2016-10-01 14:58:14.643509 I | raft: raft.node: 3092679e8c56a1a5 elected leader 3092679e8c56a1a5 at term 2
2016-10-01 14:58:14.643746 I | etcdserver: setting up the initial cluster version to 3.1
2016-10-01 14:58:14.648828 N | membership: set the initial cluster version to 3.1
2016-10-01 14:58:14.648873 I | etcdserver: published {Name:default ClientURLs:[https://192.168.1.103:4001]} to cluster e989df3141e943e1
2016-10-01 14:58:14.648901 I | embed: ready to serve client requests
2016-10-01 14:58:14.648933 I | api: enabled capabilities for version 3.1
2016-10-01 14:58:14.649463 I | embed: serving client requests on [::]:4001
2016/10/01 14:58:14 Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2016/10/01 14:58:14 Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2016/10/01 14:58:14 Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2016/10/01 14:58:14 Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2016/10/01 14:58:14 Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2016/10/01 14:58:14 Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.

If I remove --peer-client-cert-auth and --client-cert-auth, the server starts, so I suspect the wrong cert is being presented. Has the way etcd expects client and server certs to be presented changed?

The certs are attached (certs.zip). The master.etcd-client.{key,cert} files are the ones our clients use for access.

About this issue

  • State: closed
  • Created 8 years ago
  • Comments: 19 (16 by maintainers)

Most upvoted comments

This was fixed in the later v3.1.0-rc.0 release. Thanks!

Good news: it looks like this is fixed with v3.1.0-rc.0 on OS X. I am doing some more testing to be absolutely sure.