kops: error during connection attempt in dns-controller

I created a cluster named aws-cn-north-1.test.k8s.local with 3 masters. Everything works fine, except that dns-controller logs some error messages:

I0716 05:26:20.117431       1 glogger.go:31] ->[172.30.200.233:3998] attempting connection
I0716 05:26:20.118023       1 glogger.go:31] ->[172.30.200.233:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.233:3998: getsockopt: connection refused
I0716 05:29:01.454775       1 glogger.go:31] ->[172.30.200.233:3998] attempting connection
I0716 05:29:01.455543       1 glogger.go:31] ->[172.30.200.233:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.233:3998: getsockopt: connection refused

172.30.200.233 is one of the 3 masters. dns-controller started on 172.30.200.44 and is listening on port 3998 successfully.
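To see which hosts actually accept connections on the gossip ports, a small TCP probe can be run from any instance inside the VPC. This is only a rough sketch: the function name `probe` and the 2-second timeout are my own choices, not anything from kops. "refused" usually means the host answered but nothing is listening on that port, while "timeout" more often points at a firewall or security group dropping the packets.

```python
import socket


def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify one TCP connection attempt.

    Returns "open", "refused", "timeout", or "error".
    The name and the default timeout are illustrative assumptions.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        # The peer replied with a reset: reachable, but no listener on the port.
        return "refused"
    except socket.timeout:
        # No reply at all within the timeout: packets are likely being dropped.
        return "timeout"
    except OSError:
        # Anything else (no route, network unreachable, ...).
        return "error"
    finally:
        sock.close()
```

For example, calling `probe("172.30.200.233", 3998)` and `probe("172.30.200.44", 3998)` from a node would show whether only the master running dns-controller has a listener on 3998.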

Here is the full log:

dns-controller version 1.7.0
I0716 05:17:53.920475       1 main.go:201] Ingress controller disabled
I0716 05:17:53.920563       1 dnscontroller.go:101] starting DNS controller
I0716 05:17:53.920615       1 dnscontroller.go:154] scope not yet ready: pod
I0716 05:17:53.920730       1 gossip.go:99] Querying for seeds
I0716 05:17:53.920751       1 gossip.go:108] Got seeds: [127.0.0.1:3999]
I0716 05:17:53.920780       1 gossip.go:123] Seeding successful
I0716 05:17:53.920816       1 glogger.go:31] ->[127.0.0.1:3999] attempting connection
I0716 05:17:53.921208       1 nodecontroller.go:56] starting node controller
W0716 05:17:53.921238       1 nodecontroller.go:71] querying without field filter
I0716 05:17:53.921491       1 podcontroller.go:57] starting pod controller
W0716 05:17:53.928585       1 podcontroller.go:69] querying without label filter
W0716 05:17:53.928604       1 podcontroller.go:71] querying without field filter
I0716 05:17:53.928808       1 servicecontroller.go:55] starting service controller
W0716 05:17:53.928838       1 servicecontroller.go:67] querying without label filter
W0716 05:17:53.928853       1 servicecontroller.go:69] querying without field filter
I0716 05:17:53.931222       1 glogger.go:31] ->[127.0.0.1:3999|022352b0f9636aaf8bcf061ed1607c54(i-07e618ab37b3f7e76)]: connection ready; using protocol version 2
I0716 05:17:53.931276       1 glogger.go:31] ->[127.0.0.1:3999|022352b0f9636aaf8bcf061ed1607c54(i-07e618ab37b3f7e76)]: connection added (new peer)
I0716 05:17:53.933612       1 glogger.go:31] ->[172.30.200.69:3999] attempting connection
I0716 05:17:53.933724       1 glogger.go:31] ->[172.30.202.254:3999] attempting connection
I0716 05:17:53.933802       1 glogger.go:31] ->[172.30.202.217:3999] attempting connection
I0716 05:17:53.933894       1 glogger.go:31] ->[172.30.202.217:3998] attempting connection
I0716 05:17:53.933996       1 glogger.go:31] ->[172.30.200.233:3998] attempting connection
I0716 05:17:53.934078       1 glogger.go:31] ->[172.30.200.69:3998] attempting connection
I0716 05:17:53.949807       1 glogger.go:31] ->[172.30.200.233:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.233:3998: getsockopt: connection refused
I0716 05:17:53.949859       1 glogger.go:31] ->[172.30.202.217:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.202.217:3998: getsockopt: connection refused
I0716 05:17:53.949973       1 glogger.go:31] ->[172.30.200.69:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.69:3998: getsockopt: connection refused
I0716 05:17:53.965213       1 glogger.go:31] ->[172.30.202.217:3999|715ab485f2cdb0e26a12b915932f8cc4(i-00618a8f727deeda2)]: connection ready; using protocol version 2
I0716 05:17:53.965240       1 glogger.go:31] ->[172.30.202.217:3999|715ab485f2cdb0e26a12b915932f8cc4(i-00618a8f727deeda2)]: connection added (new peer)
I0716 05:17:53.967949       1 glogger.go:31] ->[172.30.200.69:3999|7ed7ebe92e36c7e3a64d11d4ea98f86a(i-0ce3910ebb22ed6d8)]: connection ready; using protocol version 2
I0716 05:17:53.968042       1 glogger.go:31] ->[172.30.200.69:3999|7ed7ebe92e36c7e3a64d11d4ea98f86a(i-0ce3910ebb22ed6d8)]: connection added (new peer)
I0716 05:17:53.968249       1 glogger.go:31] ->[172.30.202.254:3999|05165392addcbb4c37b75abbf4bc575a(i-0c681a328fe8ba27b)]: connection ready; using protocol version 2
I0716 05:17:53.969091       1 glogger.go:31] ->[172.30.202.254:3999|05165392addcbb4c37b75abbf4bc575a(i-0c681a328fe8ba27b)]: connection added (new peer)
I0716 05:17:54.010329       1 dnscontroller.go:611] Update desired state: pod/kube-system/kube-apiserver-ip-172-30-200-233.cn-north-1.compute.internal: [{A api.internal.aws-cn-north-1.test.k8s.local. 172.30.200.233 false}]
I0716 05:17:54.010427       1 dnscontroller.go:611] Update desired state: pod/kube-system/kube-apiserver-ip-172-30-200-44.cn-north-1.compute.internal: [{A api.internal.aws-cn-north-1.test.k8s.local. 172.30.200.44 false}]
I0716 05:17:54.010457       1 dnscontroller.go:611] Update desired state: pod/kube-system/kube-apiserver-ip-172-30-202-217.cn-north-1.compute.internal: [{A api.internal.aws-cn-north-1.test.k8s.local. 172.30.202.217 false}]
W0716 05:17:54.010491       1 podcontroller.go:84] querying without label filter
W0716 05:17:54.010506       1 podcontroller.go:86] querying without field filter
I0716 05:17:54.011950       1 nodecontroller.go:80] node: ip-172-30-200-233.cn-north-1.compute.internal
I0716 05:17:54.012001       1 dnscontroller.go:611] Update desired state: node/ip-172-30-200-233.cn-north-1.compute.internal: [{A node/ip-172-30-200-233.cn-north-1.compute.internal/internal 172.30.200.233 true} {A node/role=master/internal 172.30.200.233 true} {A node/role=master/ ip-172-30-200-233.cn-north-1.compute.internal true} {A node/role=master/ ip-172-30-200-233.cn-north-1.compute.internal true}]
I0716 05:17:54.012030       1 nodecontroller.go:80] node: ip-172-30-200-44.cn-north-1.compute.internal
I0716 05:17:54.012051       1 dnscontroller.go:611] Update desired state: node/ip-172-30-200-44.cn-north-1.compute.internal: [{A node/ip-172-30-200-44.cn-north-1.compute.internal/internal 172.30.200.44 true} {A node/role=master/internal 172.30.200.44 true} {A node/role=master/ ip-172-30-200-44.cn-north-1.compute.internal true} {A node/role=master/ ip-172-30-200-44.cn-north-1.compute.internal true}]
I0716 05:17:54.012076       1 nodecontroller.go:80] node: ip-172-30-202-217.cn-north-1.compute.internal
I0716 05:17:54.012095       1 dnscontroller.go:611] Update desired state: node/ip-172-30-202-217.cn-north-1.compute.internal: [{A node/ip-172-30-202-217.cn-north-1.compute.internal/internal 172.30.202.217 true} {A node/role=master/internal 172.30.202.217 true} {A node/role=master/ ip-172-30-202-217.cn-north-1.compute.internal true} {A node/role=master/ ip-172-30-202-217.cn-north-1.compute.internal true}]
W0716 05:17:54.012118       1 nodecontroller.go:87] querying without field filter
W0716 05:17:54.019086       1 servicecontroller.go:82] querying without label filter
W0716 05:17:54.019096       1 servicecontroller.go:84] querying without field filter
I0716 05:17:56.882236       1 glogger.go:31] ->[172.30.200.233:3998] attempting connection
I0716 05:17:56.882738       1 glogger.go:31] ->[172.30.200.233:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.233:3998: getsockopt: connection refused
I0716 05:17:58.458065       1 glogger.go:31] ->[172.30.200.233:3998] attempting connection
I0716 05:17:58.458519       1 glogger.go:31] ->[172.30.200.233:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.233:3998: getsockopt: connection refused
I0716 05:17:58.921076       1 dnscache.go:75] querying all DNS zones (no cached results)
I0716 05:17:58.921240       1 dnscontroller.go:256] Using default TTL of 1m0s
I0716 05:17:58.921276       1 dnscontroller.go:421] Querying all dnsprovider records for zone "local"
I0716 05:17:58.921310       1 dnscontroller.go:571] Adding DNS changes to batch {A api.internal.aws-cn-north-1.test.k8s.local.} [172.30.200.233 172.30.200.44 172.30.202.217]
I0716 05:17:58.921366       1 dnscontroller.go:301] applying DNS changeset for zone local::gossip:local
I0716 05:17:58.921408       1 gossip.go:136] UpdateValues: remove=[], put=map[dns/local/A/api.internal.aws-cn-north-1.test.k8s.local:172.30.200.233,172.30.200.44,172.30.202.217]
I0716 05:18:04.473278       1 glogger.go:31] ->[172.30.200.233:3998] attempting connection
I0716 05:18:04.473925       1 glogger.go:31] ->[172.30.200.233:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.233:3998: getsockopt: connection refused
I0716 05:18:04.697878       1 dnscontroller.go:611] Update desired state: node/ip-172-30-202-254.cn-north-1.compute.internal: [{A node/ip-172-30-202-254.cn-north-1.compute.internal/internal 172.30.202.254 true} {A node/role=node/internal 172.30.202.254 true} {A node/role=node/ ip-172-30-202-254.cn-north-1.compute.internal true} {A node/role=node/ ip-172-30-202-254.cn-north-1.compute.internal true}]
I0716 05:18:05.044277       1 dnscontroller.go:611] Update desired state: node/ip-172-30-200-69.cn-north-1.compute.internal: [{A node/ip-172-30-200-69.cn-north-1.compute.internal/internal 172.30.200.69 true} {A node/role=node/internal 172.30.200.69 true} {A node/role=node/ ip-172-30-200-69.cn-north-1.compute.internal true} {A node/role=node/ ip-172-30-200-69.cn-north-1.compute.internal true}]
I0716 05:18:12.390400       1 glogger.go:31] ->[172.30.200.233:3998] attempting connection
I0716 05:18:12.390888       1 glogger.go:31] ->[172.30.200.233:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.233:3998: getsockopt: connection refused
I0716 05:18:22.771027       1 glogger.go:31] ->[172.30.200.233:3998] attempting connection
I0716 05:18:22.771504       1 glogger.go:31] ->[172.30.200.233:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.233:3998: getsockopt: connection refused
I0716 05:18:31.441762       1 glogger.go:31] ->[172.30.200.233:3998] attempting connection
I0716 05:18:31.442319       1 glogger.go:31] ->[172.30.200.233:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.233:3998: getsockopt: connection refused
I0716 05:18:51.628682       1 glogger.go:31] ->[172.30.200.233:3998] attempting connection
I0716 05:18:51.629236       1 glogger.go:31] ->[172.30.200.233:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.233:3998: getsockopt: connection refused
I0716 05:19:41.704499       1 glogger.go:31] ->[172.30.200.233:3998] attempting connection
I0716 05:19:41.708844       1 glogger.go:31] ->[172.30.200.233:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.233:3998: getsockopt: connection refused
I0716 05:20:48.657883       1 glogger.go:31] ->[172.30.200.233:3998] attempting connection
I0716 05:20:48.659139       1 glogger.go:31] ->[172.30.200.233:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.233:3998: getsockopt: connection refused
I0716 05:22:33.485536       1 glogger.go:31] ->[172.30.200.233:3998] attempting connection
I0716 05:22:33.486070       1 glogger.go:31] ->[172.30.200.233:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.233:3998: getsockopt: connection refused
I0716 05:24:47.571089       1 glogger.go:31] ->[172.30.200.233:3998] attempting connection
I0716 05:24:47.571637       1 glogger.go:31] ->[172.30.200.233:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.233:3998: getsockopt: connection refused
I0716 05:26:20.117431       1 glogger.go:31] ->[172.30.200.233:3998] attempting connection
I0716 05:26:20.118023       1 glogger.go:31] ->[172.30.200.233:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.233:3998: getsockopt: connection refused
I0716 05:29:01.454775       1 glogger.go:31] ->[172.30.200.233:3998] attempting connection
I0716 05:29:01.455543       1 glogger.go:31] ->[172.30.200.233:3998] error during connection attempt: dial tcp4 0.0.0.0:0->172.30.200.233:3998: getsockopt: connection refused
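In the log above, every peer on port 3999 ends up as "connection added", while every attempt on port 3998 is refused. To get that picture quickly from a long log, something like the following could be used. It is a minimal sketch that assumes the line format shown above; `summarize` and `PEER_RE` are hypothetical names.

```python
import re
from collections import Counter

# Matches the peer address in "->[ip:port]" or "->[ip:port|peer-id(...)]".
PEER_RE = re.compile(r"->\[(?P<addr>\d+\.\d+\.\d+\.\d+:\d+)[|\]]")


def summarize(log_text: str):
    """Return (refused_counts, connected_peers) for a dns-controller log.

    refused_counts: Counter of "ip:port" -> number of refused attempts.
    connected_peers: set of "ip:port" that logged "connection added".
    """
    refused = Counter()
    connected = set()
    for line in log_text.splitlines():
        match = PEER_RE.search(line)
        if match is None:
            continue  # not a gossip connection line
        addr = match.group("addr")
        if "connection refused" in line:
            refused[addr] += 1
        elif "connection added" in line:
            connected.add(addr)
    return refused, connected
```

Saving the log to a file and calling `summarize(open("dns-controller.log").read())` (the file name is just an example) gives one counter of refused addresses and one set of peers that connected.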

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 21 (9 by maintainers)

Most upvoted comments

Here is the security group created by kops:

$ aws ec2 describe-security-groups --group-ids sg-4c62c034
{
    "SecurityGroups": [
        {
            "IpPermissionsEgress": [
                {
                    "UserIdGroupPairs": [],
                    "PrefixListIds": [],
                    "Ipv6Ranges": [],
                    "IpProtocol": "-1",
                    "IpRanges": [
                        {
                            "CidrIp": "0.0.0.0/0"
                        }
                    ]
                }
            ],
            "OwnerId": "xxxxxxx",
            "Description": "Security group for nodes",
            "GroupId": "sg-4c62c034",
            "GroupName": "nodes.xxxxxxx.k8s.local",
            "VpcId": "xxxxxxx",
            "Tags": [
                {
                    "Value": "owned",
                    "Key": "kubernetes.io/cluster/xxxxxxx.k8s.local"
                },
                {
                    "Value": "xxxxxxx.k8s.local",
                    "Key": "KubernetesCluster"
                },
                {
                    "Value": "nodes.xxxxxxx.k8s.local",
                    "Key": "Name"
                }
            ],
            "IpPermissions": [
                {
                    "UserIdGroupPairs": [
                        {
                            "UserId": "xxxxxxx",
                            "GroupId": "sg-4c62c034"
                        },
                        {
                            "UserId": "xxxxxxx",
                            "GroupId": "sg-819e3cf9"
                        }
                    ],
                    "PrefixListIds": [],
                    "Ipv6Ranges": [],
                    "IpProtocol": "-1",
                    "IpRanges": []
                },
                {
                    "PrefixListIds": [],
                    "IpProtocol": "tcp",
                    "UserIdGroupPairs": [],
                    "FromPort": 22,
                    "Ipv6Ranges": [],
                    "IpRanges": [
                        {
                            "CidrIp": "0.0.0.0/0"
                        }
                    ],
                    "ToPort": 22
                }
            ]
        }
    ]
}
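To double-check whether this group would let gossip traffic through, the JSON can be inspected with a few lines of Python. This is a simplified sketch, not a full model of security group evaluation: `allows_tcp` is a hypothetical helper that only looks at ingress rules, source groups, and a 0.0.0.0/0 CIDR. Also note that this output is for the nodes group; the masters' group (presumably sg-819e3cf9, which is an assumption on my part) would need the same check for traffic arriving at a master.

```python
def allows_tcp(group: dict, port: int, source_group_id: str) -> bool:
    """Return True if an ingress rule of `group` admits TCP `port` from `source_group_id`.

    `group` is one element of "SecurityGroups" from
    `aws ec2 describe-security-groups`. Only group-pair sources and the
    0.0.0.0/0 CIDR are considered, which is a deliberate simplification.
    """
    for perm in group.get("IpPermissions", []):
        proto = perm.get("IpProtocol")
        if proto == "-1":
            port_ok = True  # "-1" means all protocols and all ports
        elif proto == "tcp":
            port_ok = perm.get("FromPort", 0) <= port <= perm.get("ToPort", 65535)
        else:
            port_ok = False
        if not port_ok:
            continue
        sources = {pair.get("GroupId") for pair in perm.get("UserIdGroupPairs", [])}
        open_cidr = any(
            rng.get("CidrIp") == "0.0.0.0/0" for rng in perm.get("IpRanges", [])
        )
        if source_group_id in sources or open_cidr:
            return True
    return False
```

With the output above saved to a file and parsed with `json.load`, `allows_tcp(data["SecurityGroups"][0], 3998, "sg-819e3cf9")` should be true because of the all-protocol rule between the two groups. That would fit with the error being "connection refused" rather than a timeout.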

@fejta @chrislovecnm Could you reopen this?

We are seeing a similar issue on k8s 1.12.9. We only noticed it after recently adding a few new nodes. The 7th node added seems to be the cause: none of the nodes added to the cluster earlier show up with this error in the dns-controller logs.

We destroyed the troublesome node, but its replacement is causing the same issue.

Did you end up finding a fix for this?

We’re on k8s 1.16.7 and seeing this issue on the 3rd master node that was added. Destroying it doesn’t seem to help either, since the node that replaces it throws the same error.