etcd: [etcdctl] Error: context deadline exceeded

I’m having trouble using domain names to communicate with an existing cluster via etcdctl. The problem looks related to #10430, which was reportedly fixed in #10428.

Some details:

$ brew info etcd # provides etcdctl command
etcd: stable 3.3.12 (bottled), HEAD
Key value store for shared configuration and service discovery
https://github.com/etcd-io/etcd
/usr/local/Cellar/etcd/3.3.12 (9 files, 51.6MB) *
  Poured from bottle on 2019-02-16 at 13:41:00
From: https://github.com/Homebrew/homebrew-core/blob/master/Formula/etcd.rb
$ env | grep -i etcd
ETCDCTL_API=3
$ etcdctl version
etcdctl version: 3.3.12
API version: 3.3

etcd is currently running in a single Docker container; the host machine has four Ethernet ports, two of which are bonded. The issue is as follows:

$ etcdctl --endpoints=http://10.0.0.161:2379,http://10.0.0.162:2379,http://10.0.0.166:2379,http://br0.sagittarius.<lan.domain.com>:2379,http://eno1.sagittarius.<lan.domain.com>:2379,http://eno2.sagittarius.<lan.domain.com>:2379,http://etcd:2379 endpoint status
Failed to get the status of endpoint http://br0.sagittarius.<lan.domain.com>:2379 (context deadline exceeded)
Failed to get the status of endpoint http://eno2.sagittarius.<lan.domain.com>:2379 (context deadline exceeded)
Failed to get the status of endpoint http://etcd:2379 (context deadline exceeded)
http://10.0.0.161:2379, acd970e09a7f3cd1, 3.3.10, 23 MB, true, 14, 10350
http://10.0.0.162:2379, acd970e09a7f3cd1, 3.3.10, 23 MB, true, 14, 10350
http://10.0.0.166:2379, acd970e09a7f3cd1, 3.3.10, 23 MB, true, 14, 10350
http://eno1.sagittarius.<lan.domain.com>:2379, acd970e09a7f3cd1, 3.3.10, 23 MB, true, 14, 10350

Please ignore http://etcd:2379, which exists only for Docker networking. However, when I query one of the failing endpoints with the http command (from HTTPie), it works:

$ http POST http://br0.sagittarius.<lan.domain.com>:2379/v3beta/cluster/member/list cluster=default
Content-Length: 475
Content-Type: application/json
Date: Sat, 13 Apr 2019 15:55:47 GMT

{
  "header": {
    "cluster_id": "13381000697838399546",
    "member_id": "12455110354436832465",
    "raft_term": "14"
  },
  "members": [
    {
      "ID": "12455110354436832465",
      "name": "sagittarius",
      "peerURLs": [
        "http://10.0.0.166:2379",
        "http://<domain.com>:2379"
      ],
      "clientURLs": [
        "http://10.0.0.161:2379",
        "http://10.0.0.162:2379",
        "http://10.0.0.166:2379",
        "http://br0.sagittarius.<lan.domain.com>:2379",
        "http://eno1.sagittarius.<lan.domain.com>:2379",
        "http://eno2.sagittarius.<lan.domain.com>:2379",
        "http://etcd:2379"
      ]
    }
  ]
}

DNS for lan.domain.com is served by a local router running dnsmasq.
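One way to narrow this down is to probe each endpoint on its own, so that a slow or failing name does not get mixed up with the healthy ones. The following is a minimal sketch, not a confirmed fix: the helper names (split_endpoints, endpoint_host, probe_endpoints) and the timeout values are assumptions made for illustration, and it only handles plain host:port endpoints (no IPv6 literals).

```shell
#!/bin/sh
# Hypothetical helpers for probing etcd endpoints one at a time.

# Print a comma-separated endpoint list, one endpoint per line.
split_endpoints() {
  printf '%s\n' "$1" | tr ',' '\n'
}

# Extract the host part of an endpoint such as http://host:2379.
endpoint_host() {
  host="${1#*://}"     # drop the scheme, if any
  host="${host%%/*}"   # drop any path
  host="${host%%:*}"   # drop the port
  printf '%s\n' "$host"
}

# Run "endpoint health" against each endpoint separately.
# The timeouts are illustrative values, not recommendations.
probe_endpoints() {
  split_endpoints "$1" | while IFS= read -r ep; do
    printf '== %s (host: %s)\n' "$ep" "$(endpoint_host "$ep")"
    ETCDCTL_API=3 etcdctl --endpoints="$ep" \
      --dial-timeout=3s --command-timeout=5s \
      endpoint health </dev/null || printf 'FAILED: %s\n' "$ep"
  done
}

# Usage (not executed here):
#   probe_endpoints "http://10.0.0.161:2379,http://10.0.0.162:2379"
```

Probing one endpoint per invocation makes it easier to see whether the failures follow particular hostnames or vary from run to run.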

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 3
  • Comments: 18

Most upvoted comments

I solved this issue only when I applied the right cert/key pair.

Assuming the cluster was set up with kubeadm, there should be several cert/key pairs under /etc/kubernetes/pki/etcd/:

# ls -l /etc/kubernetes/pki/etcd/
total 32
-rw-r--r--    1 root     root          1017 Nov 12 15:32 ca.crt
-rw-------    1 root     root          1679 Nov 12 15:32 ca.key
-rw-r--r--    1 root     root          1094 Nov 12 15:32 healthcheck-client.crt
-rw-------    1 root     root          1675 Nov 12 15:32 healthcheck-client.key
-rw-r--r--    1 root     root          1180 Nov 12 15:32 peer.crt
-rw-------    1 root     root          1675 Nov 12 15:32 peer.key
-rw-r--r--    1 root     root          1180 Nov 12 15:32 server.crt
-rw-------    1 root     root          1679 Nov 12 15:32 server.key

# etcdctl --version
etcdctl version: 3.3.1
API version: 2

# ETCDCTL_API=3 etcdctl snapshot save snapshot.db \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  --key /etc/kubernetes/pki/etcd/server.key
Snapshot saved at snapshot.db

# ETCDCTL_API=3 etcdctl --write-out=table snapshot status snapshot.db
+----------+----------+------------+------------+
|   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| b9d500f7 |    72966 |       1194 |     4.9 MB |
+----------+----------+------------+------------+
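The same cert/key pair can be reused for a quick health check before taking a snapshot. This is a sketch under assumptions: etcd_health is a hypothetical helper name, the default endpoint https://127.0.0.1:2379 is a guess at a typical kubeadm setup, and PKI_DIR defaults to the directory listed above.

```shell
#!/bin/sh
# Hypothetical helper: check one etcd endpoint over TLS using the
# kubeadm-generated files shown in the listing above.
etcd_health() {
  endpoint="${1:-https://127.0.0.1:2379}"      # assumed default endpoint
  pki="${PKI_DIR:-/etc/kubernetes/pki/etcd}"   # override if your layout differs
  ETCDCTL_API=3 etcdctl \
    --endpoints="$endpoint" \
    --cacert="$pki/ca.crt" \
    --cert="$pki/server.crt" \
    --key="$pki/server.key" \
    endpoint health
}

# Usage (not executed here):
#   etcd_health https://127.0.0.1:2379
```

If this check times out with one cert/key pair but succeeds with another, the certificates are the likely culprit rather than the network.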

Similar issue here (context deadline exceeded when using ETCDCTL_API=3). We cannot use API v2, because for that version etcdctl does not provide an --insecure-skip-tls-verify flag (which is the sole reason why we switched to ETCDCTL_API=3).

/cc @illuhad

I had the same error. Changing the endpoints from http:// to https:// solved it for me.
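For a long endpoint list, the scheme can be switched in one step. A small sketch, assuming the server really does serve TLS on those ports (otherwise the https:// endpoints will fail too); to_https is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: rewrite every http:// endpoint in a
# comma-separated list to https://. Existing https:// entries are
# left alone because they do not contain the literal "http://".
to_https() {
  printf '%s\n' "$1" | sed 's#http://#https://#g'
}

# Usage (not executed here):
#   ETCDCTL_API=3 etcdctl --endpoints="$(to_https "$ENDPOINTS")" endpoint status
```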

I am having the same issue with 3.3.11. In my case auth is enabled, and "Error: context deadline exceeded" appears intermittently. When I run etcdctl with --user root and provide the password, I get the error roughly 3 out of 5 times. If auth is disabled, I don’t see the issue. I also see the error more often with a complex password (for example, 10 characters mixing special, upper-case, and lower-case characters) than with a simple one. I am using only v3; v2 is disabled.
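For intermittent failures like this, a small retry wrapper can serve as a stopgap while the root cause is investigated. It is a workaround sketch, not a fix: retry, RETRY_DELAY, and ETCD_ROOT_PASSWORD are hypothetical names, and the longer --command-timeout in the usage line is only a guess at something worth trying.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times, stopping at the
# first success. RETRY_DELAY (seconds between attempts) defaults to 1.
retry() {
  attempts="$1"
  shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$attempts failed: $*" >&2
    i=$((i + 1))
    sleep "${RETRY_DELAY:-1}"
  done
  return 1
}

# Usage (not executed here; the password variable is an assumption):
#   ETCDCTL_API=3 retry 5 etcdctl --user "root:$ETCD_ROOT_PASSWORD" \
#     --command-timeout=10s endpoint status
```

Passing the password as user:password avoids an interactive prompt on every attempt, at the cost of exposing it in the process list.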