etcd: ETCD with TLS showing warning "transport: authentication handshake failed: remote error: tls: bad certificate"

I followed these two guides:

https://github.com/coreos/etcd/blob/master/Documentation/op-guide/security.md
https://github.com/coreos/docs/blob/master/os/generate-self-signed-certificates.md

Initialize a certificate authority

$ cat ca-config.json
{
  "signing": {
    "default": {
      "expiry": "8760h"
    },
    "profiles": {
      "server": {
        "expiry": "8760h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth"
        ]
      },
      "client": {
        "expiry": "8760h",
        "usages": [
          "signing",
          "key encipherment",
          "client auth"
        ]
      },
      "peer": {
        "expiry": "8760h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ]
      }
    }
  }
}

$ cat ca-csr.json
{
  "CN": "My own CA",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "US",
      "L": "San Francisco",
      "O": "My Company Name",
      "ST": "CA",
      "OU": "Org Unit 1",
      "OU": "Org Unit 2"
    }
  ]
}

$ cfssl gencert -initca ca-csr.json | cfssljson -bare ca -

Generate server certificate

# cfssl print-defaults csr > server.json
$ cat server.json
{
  "CN": "etcd1",
  "hosts": [
    "192.168.1.221"
  ],
  "key": {
    "algo": "ecdsa",
    "size": 256
  },
  "names": [
    {
        "C": "US",
        "L": "San Francisco",
        "ST": "CA"
    }
  ]
}

$ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server.json | cfssljson -bare server

Etcd Server

$ etcd --name infra0 --data-dir infra0 \
  --client-cert-auth --trusted-ca-file=ca.pem --cert-file=server.pem --key-file=server-key.pem \
  --advertise-client-urls https://127.0.0.1:2379 --listen-client-urls https://127.0.0.1:2379
2018-05-29 11:17:10.374455 I | etcdmain: etcd Version: 3.3.5
2018-05-29 11:17:10.374527 I | etcdmain: Git SHA: 70c872620
2018-05-29 11:17:10.374534 I | etcdmain: Go Version: go1.9.6
2018-05-29 11:17:10.374540 I | etcdmain: Go OS/Arch: linux/amd64
2018-05-29 11:17:10.374546 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
2018-05-29 11:17:10.374859 I | embed: listening for peers on http://localhost:2380
2018-05-29 11:17:10.374899 I | embed: listening for client requests on 127.0.0.1:2379
2018-05-29 11:17:10.377043 I | etcdserver: name = infra0
2018-05-29 11:17:10.377067 I | etcdserver: data dir = infra0
2018-05-29 11:17:10.377074 I | etcdserver: member dir = infra0/member
2018-05-29 11:17:10.377079 I | etcdserver: heartbeat = 100ms
2018-05-29 11:17:10.377087 I | etcdserver: election = 1000ms
2018-05-29 11:17:10.377092 I | etcdserver: snapshot count = 100000
2018-05-29 11:17:10.377125 I | etcdserver: advertise client URLs = https://127.0.0.1:2379
2018-05-29 11:17:10.377133 I | etcdserver: initial advertise peer URLs = http://localhost:2380
2018-05-29 11:17:10.377143 I | etcdserver: initial cluster = infra0=http://localhost:2380
2018-05-29 11:17:10.379279 I | etcdserver: starting member 8e9e05c52164694d in cluster cdf818194e3a8c32
2018-05-29 11:17:10.379320 I | raft: 8e9e05c52164694d became follower at term 0
2018-05-29 11:17:10.379337 I | raft: newRaft 8e9e05c52164694d [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
2018-05-29 11:17:10.379344 I | raft: 8e9e05c52164694d became follower at term 1
2018-05-29 11:17:10.385248 W | auth: simple token is not cryptographically signed
2018-05-29 11:17:10.388175 I | etcdserver: starting server... [version: 3.3.5, cluster version: to_be_decided]
2018-05-29 11:17:10.388842 I | etcdserver: 8e9e05c52164694d as single-node; fast-forwarding 9 ticks (election ticks 10)
2018-05-29 11:17:10.389395 I | etcdserver/membership: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32
2018-05-29 11:17:10.392890 I | embed: ClientTLS: cert = server.pem, key = server-key.pem, ca = , trusted-ca = ca.pem, client-cert-auth = true, crl-file = 
2018-05-29 11:17:10.479773 I | raft: 8e9e05c52164694d is starting a new election at term 1
2018-05-29 11:17:10.479819 I | raft: 8e9e05c52164694d became candidate at term 2
2018-05-29 11:17:10.479887 I | raft: 8e9e05c52164694d received MsgVoteResp from 8e9e05c52164694d at term 2
2018-05-29 11:17:10.479906 I | raft: 8e9e05c52164694d became leader at term 2
2018-05-29 11:17:10.479915 I | raft: raft.node: 8e9e05c52164694d elected leader 8e9e05c52164694d at term 2
2018-05-29 11:17:10.480540 I | etcdserver: published {Name:infra0 ClientURLs:[https://127.0.0.1:2379]} to cluster cdf818194e3a8c32
2018-05-29 11:17:10.480670 E | etcdmain: forgot to set Type=notify in systemd service file?
2018-05-29 11:17:10.480694 I | embed: ready to serve client requests
2018-05-29 11:17:10.480718 I | etcdserver: setting up the initial cluster version to 3.3
2018-05-29 11:17:10.481430 N | etcdserver/membership: set the initial cluster version to 3.3
2018-05-29 11:17:10.481638 I | etcdserver/api: enabled capabilities for version 3.3
2018-05-29 11:17:10.532133 I | embed: serving client requests on 127.0.0.1:2379
2018-05-29 11:17:10.539294 I | embed: rejected connection from "127.0.0.1:39794" (error "tls: failed to verify client's certificate: x509: certificate specifies an incompatible key usage", ServerName "")
WARNING: 2018/05/29 11:17:10 Failed to dial 127.0.0.1:2379: connection error: desc = "transport: authentication handshake failed: remote error: tls: bad certificate"; please retry.

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 8
  • Comments: 23 (6 by maintainers)

Most upvoted comments

TL;DR: How to fix the issue:

ca-config.json: add "client auth" to the usages of the "server" profile:

{
    "signing": {
        "default": {
            "expiry": "1000000h"
        },
        "profiles": {
            "server": {
                "expiry": "1000000h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            },
            "client": {
                "expiry": "1000000h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "client auth"
                ]
            },
            "peer": {
                "expiry": "43800h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            }
        }
    }
}

Regenerate the server certificate:

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server.json | cfssljson -bare server

Check the server certificate (I copied it to /etc/etcd/server.pem):

$ openssl x509 -in /etc/etcd/server.pem -text -noout
Certificate:
    Data:
        Version: 3 (0x2)
...
    Signature Algorithm: sha256WithRSAEncryption
...
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE

Environment vars:

ETCD_CLIENT_CERT_AUTH=true
ETCD_KEY_FILE=/etc/etcd/server-key.pem
ETCD_CERT_FILE=/etc/etcd/server.pem
ETCD_TRUSTED_CA_FILE=/etc/etcd/ca.pem
...

Run etcd

sudo etcd --peer-auto-tls=true
...
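
The certificate check in the steps above can be scripted so that a missing usage is caught before etcd starts. This is only a minimal sketch: it assumes the certificate sits at /etc/etcd/server.pem as in the comment, and the function name has_both_ekus is made up for illustration. It simply greps the text that openssl x509 -text prints.

```shell
#!/bin/sh
# has_both_ekus (hypothetical helper): reads `openssl x509 -text` output on
# stdin and succeeds only if both extended key usages are listed.
has_both_ekus() {
  text=$(cat)
  printf '%s\n' "$text" | grep -q 'TLS Web Server Authentication' &&
    printf '%s\n' "$text" | grep -q 'TLS Web Client Authentication'
}

# Assumed path, taken from the comment above; the check is skipped
# when the file is not readable.
cert=/etc/etcd/server.pem
if [ -r "$cert" ]; then
  if openssl x509 -in "$cert" -noout -text | has_both_ekus; then
    echo "server.pem allows server auth and client auth"
  else
    echo "server.pem is missing an extended key usage" >&2
  fi
fi
```

A certificate generated from the original "server" profile (server auth only) would take the second branch.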

I found this issue while troubleshooting problems that arose during an etcd upgrade from 3.1.x to 3.2.x using kubeadm. After some debugging I determined that the serving certificate now (as of etcd 3.2.x) needs the client auth usage because etcd presents the server certificate as a client certificate for the gRPC gateway.

This requirement doesn't appear to be documented in any of the places I would expect, such as:

https://coreos.com/os/docs/latest/generate-self-signed-certificates.html
https://coreos.com/etcd/docs/latest/op-guide/security.html
https://coreos.com/etcd/docs/latest/dev-guide/api_grpc_gateway.html
https://coreos.com/etcd/docs/latest/op-guide/configuration.html
https://coreos.com/etcd/docs/latest/upgrades/upgrade_3_2.html

Ideally, I would expect a configuration option to specify a separate client cert for the gRPC gateway (and, tangentially, to specify separate client and server certs for peer communication).
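
Whether a serving certificate would be accepted in the client role described in this comment can be tested directly with openssl verify -purpose sslclient. A sketch, assuming ca.pem and server.pem from the original report are in the current directory; usable_as_client is an illustrative name, not an etcd or openssl command.

```shell
#!/bin/sh
# usable_as_client (hypothetical helper): succeeds if CERT chains to CA and
# its usages permit acting as a TLS client.
usable_as_client() {
  ca=$1
  cert=$2
  openssl verify -purpose sslclient -CAfile "$ca" "$cert" >/dev/null 2>&1
}

# Assumed file names from the original report; skipped when they are absent.
if [ -r ca.pem ] && [ -r server.pem ]; then
  if usable_as_client ca.pem server.pem; then
    echo "server.pem can be presented as a client certificate"
  else
    echo "server.pem is rejected for client use (or does not chain to ca.pem)"
  fi
fi
```

Running the same check with -purpose sslserver covers the serving side.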

It turns out I was addressing etcd incorrectly from etcdctl inside the pod. I ran kubectl exec ectd-cluster-0 sh and then ran etcdctl without --cert, --key, and --cacert, assuming they weren't needed when running inside the pod, but they are. It's working now.
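
For the in-pod case described above, the TLS flags have to be passed to etcdctl explicitly even when it runs next to the server. A sketch using the etcdctl v3 flags --cacert, --cert, and --key; the file names client.pem and client-key.pem are assumptions (a certificate issued from the "client" profile), as is the endpoint, and etcdctl_tls is a made-up wrapper name.

```shell
#!/bin/sh
# Assumed locations; adjust to where the pod mounts its certificates.
ENDPOINT=https://127.0.0.1:2379
CA=ca.pem
CERT=client.pem
KEY=client-key.pem

# etcdctl_tls (hypothetical wrapper): runs etcdctl with the TLS flags filled in.
# With DRY_RUN=1 it only prints the command line instead of running it.
# ETCDCTL_API=3 selects the v3 API, which etcd 3.3 does not use by default.
etcdctl_tls() {
  set -- etcdctl --endpoints="$ENDPOINT" --cacert="$CA" --cert="$CERT" --key="$KEY" "$@"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "ETCDCTL_API=3 $*"
  else
    ETCDCTL_API=3 "$@"
  fi
}

# Print the command that would be run for a health check.
DRY_RUN=1 etcdctl_tls endpoint health
```

Dropping DRY_RUN=1 runs the command for real, which requires etcdctl and the certificate files to be present.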

@KIVagant How do you achieve this fix with openssl only? (Not using cfssl)

Edit: figured it out.

Use the documentation from Kubernetes here: https://kubernetes.io/docs/concepts/cluster-administration/certificates/

You want to use the v3_ext config at the bottom of that page when you sign your CSR with your CA. Note that the extensions are passed to the x509 command, not the req command.
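
A sketch of what that can look like with openssl alone. The [ v3_ext ] section name follows the Kubernetes page; the file names (server.csr, csr.conf, server.crt), the 365-day validity, and the IP SAN taken from the original server.json are assumptions for illustration, and both helper names are hypothetical.

```shell
#!/bin/sh
# write_v3_ext FILE IP (hypothetical helper): writes an extension file whose
# v3_ext section requests both server auth and client auth.
write_v3_ext() {
  cat > "$1" <<EOF
[ v3_ext ]
basicConstraints = CA:FALSE
keyUsage = critical, digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth, clientAuth
subjectAltName = IP:$2
EOF
}

# sign_server_csr EXTFILE (hypothetical helper): signs an existing CSR with
# the CA. The extensions are applied here, by `openssl x509 -req`.
sign_server_csr() {
  openssl x509 -req -in server.csr \
    -CA ca.pem -CAkey ca-key.pem -CAcreateserial \
    -out server.crt -days 365 -sha256 \
    -extensions v3_ext -extfile "$1"
}

# Usage (assumes server.csr, ca.pem and ca-key.pem exist in this directory):
#   write_v3_ext csr.conf 192.168.1.221
#   sign_server_csr csr.conf
```

The resulting server.crt can then be inspected with openssl x509 -text -noout to confirm that both extended key usages are present.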

@hexfusion I agree. My confusion is about why the etcd server needs client auth.

I ran into this as well. Adding the client usage fixed it. I agree that there should be an option to supply a separate client cert instead of hijacking the server certificate for this purpose!

@mindcrime First of all, check whether the cluster works. I believe there is a big difference between a working cluster where something external tries to connect to the port and a cluster whose nodes really can't join. In my case all nodes operate normally, and I can get member info and put messages.

When I set the --client-cert-auth parameter to false, the warning went away. So I guess the etcd process performs a health check against itself as a client.

# server auth & --client-cert-auth=false
$ etcd --name infra0 --data-dir infra0 \
  --client-cert-auth=false --trusted-ca-file=ca.pem --cert-file=server.pem --key-file=server-key.pem \
  --advertise-client-urls https://127.0.0.1:2379 --listen-client-urls https://127.0.0.1:2379
2018-05-30 11:43:23.150450 I | etcdmain: etcd Version: 3.3.5
2018-05-30 11:43:23.150561 I | etcdmain: Git SHA: 70c872620
2018-05-30 11:43:23.150577 I | etcdmain: Go Version: go1.9.6
2018-05-30 11:43:23.150590 I | etcdmain: Go OS/Arch: linux/amd64
2018-05-30 11:43:23.150602 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
2018-05-30 11:43:23.150699 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2018-05-30 11:43:23.151409 I | embed: listening for peers on http://localhost:2380
2018-05-30 11:43:23.151494 I | embed: listening for client requests on 127.0.0.1:2379
2018-05-30 11:43:23.152450 I | etcdserver: name = infra0
2018-05-30 11:43:23.152471 I | etcdserver: data dir = infra0
2018-05-30 11:43:23.152484 I | etcdserver: member dir = infra0/member
2018-05-30 11:43:23.152496 I | etcdserver: heartbeat = 100ms
2018-05-30 11:43:23.152516 I | etcdserver: election = 1000ms
2018-05-30 11:43:23.152529 I | etcdserver: snapshot count = 100000
2018-05-30 11:43:23.152550 I | etcdserver: advertise client URLs = https://127.0.0.1:2379
2018-05-30 11:43:23.153964 I | etcdserver: restarting member 8e9e05c52164694d in cluster cdf818194e3a8c32 at commit index 14
2018-05-30 11:43:23.154047 I | raft: 8e9e05c52164694d became follower at term 7
2018-05-30 11:43:23.154074 I | raft: newRaft 8e9e05c52164694d [peers: [], term: 7, commit: 14, applied: 0, lastindex: 14, lastterm: 7]
2018-05-30 11:43:23.158976 W | auth: simple token is not cryptographically signed
2018-05-30 11:43:23.161144 I | etcdserver: starting server... [version: 3.3.5, cluster version: to_be_decided]
2018-05-30 11:43:23.162710 I | etcdserver/membership: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32
2018-05-30 11:43:23.163138 N | etcdserver/membership: set the initial cluster version to 3.3
2018-05-30 11:43:23.163261 I | etcdserver/api: enabled capabilities for version 3.3
2018-05-30 11:43:23.165712 I | embed: ClientTLS: cert = server.pem, key = server-key.pem, ca = , trusted-ca = ca.pem, client-cert-auth = false, crl-file = 
2018-05-30 11:43:25.054746 I | raft: 8e9e05c52164694d is starting a new election at term 7
2018-05-30 11:43:25.054839 I | raft: 8e9e05c52164694d became candidate at term 8
2018-05-30 11:43:25.054875 I | raft: 8e9e05c52164694d received MsgVoteResp from 8e9e05c52164694d at term 8
2018-05-30 11:43:25.054908 I | raft: 8e9e05c52164694d became leader at term 8
2018-05-30 11:43:25.054930 I | raft: raft.node: 8e9e05c52164694d elected leader 8e9e05c52164694d at term 8
2018-05-30 11:43:25.056827 I | etcdserver: published {Name:infra0 ClientURLs:[https://127.0.0.1:2379]} to cluster cdf818194e3a8c32
2018-05-30 11:43:25.056909 I | embed: ready to serve client requests
2018-05-30 11:43:25.057110 E | etcdmain: forgot to set Type=notify in systemd service file?
2018-05-30 11:43:25.113424 I | embed: serving client requests on 127.0.0.1:2379