linkerd2: linkerd_reconnect: Failed to connect error=Connection refused (os error 111) after installing 2.11.1

What is the issue?

I am seeing the errors below in the Linkerd destination pod after installing 2.11.1 on an AKS cluster; the pod logs are included below. We previously ran 2.10 without any issues. We did not upgrade in place: we removed 2.10 and then installed 2.11. Please let me know if any other logs are needed for troubleshooting.

How can it be reproduced?

By installing a fresh copy of Linkerd 2.11.1.

Logs, error output, etc

[     0.001240s]  INFO ThreadId(01) linkerd2_proxy::rt: Using single-threaded proxy runtime
[     0.001570s]  INFO ThreadId(01) linkerd2_proxy: Admin interface on 0.0.0.0:4191
[     0.001583s]  INFO ThreadId(01) linkerd2_proxy: Inbound interface on 0.0.0.0:4143
[     0.001586s]  INFO ThreadId(01) linkerd2_proxy: Outbound interface on 127.0.0.1:4140
[     0.001587s]  INFO ThreadId(01) linkerd2_proxy: Tap interface on 0.0.0.0:4190
[     0.001589s]  INFO ThreadId(01) linkerd2_proxy: Local identity is linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local
[     0.001591s]  INFO ThreadId(01) linkerd2_proxy: Identity verified via linkerd-identity-headless.linkerd.svc.cluster.local:8080 (linkerd-identity.linkerd.serviceaccount.identity.linkerd.cluster.local)
[     0.001593s]  INFO ThreadId(01) linkerd2_proxy: Destinations resolved via localhost:8086
[     0.002035s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[     0.003857s]  WARN ThreadId(02) identity:controller{addr=linkerd-identity-headless.linkerd.svc.cluster.local:8080}: linkerd_app_core::control: Failed to resolve control-plane component error=no record found for name: linkerd-identity-headless.linkerd.svc.cluster.local. type: SRV class: IN
[     0.112761s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[     0.332287s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[     0.738942s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[     1.240545s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[     1.742524s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[     2.067324s]  INFO ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}: linkerd_app_core::serve: Connection closed error=TLS detection timed out
[    72.931085s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    73.431840s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    73.932647s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    74.434329s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    74.936055s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    75.436486s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    75.938267s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    76.440036s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    76.940731s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    77.442481s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    77.944229s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    78.444986s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    78.945760s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    79.446562s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    79.948405s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    80.450125s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    80.950652s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    81.452416s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    81.954388s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    82.456028s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    82.957336s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[    83.459112s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
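
Since most of the log above is the same warning repeated, it can help to collapse it into counts per target and error before reading it. The following is only a rough sketch, not part of Linkerd: it assumes the log line layout shown above (`[ <seconds>s]  LEVEL ThreadId(NN) <context>: <message>`), and the name `summarize_warnings` and the `SAMPLE_LOG` excerpt are made up for illustration.

```python
import re
from collections import Counter

# Assumed line layout, taken from the proxy log above:
# [     0.002035s]  WARN ThreadId(01) <span context>: <module>: <message>
LINE_RE = re.compile(
    r"^\[\s*(?P<secs>\d+\.\d+)s\]\s+(?P<level>[A-Z]+)\s+ThreadId\(\d+\)\s+(?P<rest>.*)$"
)

# A short excerpt copied from the log in this issue (illustrative input only).
SAMPLE_LOG = """\
[     0.002035s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[     0.003857s]  WARN ThreadId(02) identity:controller{addr=linkerd-identity-headless.linkerd.svc.cluster.local:8080}: linkerd_app_core::control: Failed to resolve control-plane component error=no record found for name: linkerd-identity-headless.linkerd.svc.cluster.local. type: SRV class: IN
[     0.112761s]  WARN ThreadId(01) policy:watch{port=8090}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)
[     2.067324s]  INFO ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}: linkerd_app_core::serve: Connection closed error=TLS detection timed out
"""


def summarize_warnings(log_text):
    """Count WARN lines grouped by (controller address, error text)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match or match.group("level") != "WARN":
            continue  # skip INFO lines and anything that does not parse
        rest = match.group("rest")
        addr = re.search(r"controller\{addr=([^}]+)\}", rest)
        err = re.search(r"error=(.*)$", rest)
        key = (
            addr.group(1) if addr else "unknown",
            err.group(1) if err else "unknown",
        )
        counts[key] += 1
    return counts


if __name__ == "__main__":
    for (addr, err), n in summarize_warnings(SAMPLE_LOG).most_common():
        print(f"{n:4d}  {addr}  {err}")
```

In practice the input would be the saved proxy container log rather than the hard-coded excerpt; the grouping makes it easier to see that there are two distinct warnings here, the refused connection to `localhost:8090` and the failed SRV lookup for the identity service.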

output of linkerd check -o short

 ~ linkerd check
Linkerd core checks
===================

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ control plane pods are ready

linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist

linkerd-identity
----------------
√ certificate config is valid
√ trust anchors are using supported crypto algorithm
√ trust anchors are within their validity period
√ trust anchors are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust anchor

linkerd-webhooks-and-apisvc-tls
-------------------------------
√ proxy-injector webhook has valid cert
√ proxy-injector cert is valid for at least 60 days
√ sp-validator webhook has valid cert
√ sp-validator cert is valid for at least 60 days
√ policy-validator webhook has valid cert
√ policy-validator cert is valid for at least 60 days

linkerd-version
---------------
‼ can determine the latest version
    Get "https://versioncheck.linkerd.io/version.json?version=stable-2.11.1&uuid=58eb0377-e4d1-43a5-8baf-9c9c44545559&source=cli": net/http: TLS handshake timeout
    see https://linkerd.io/2.11/checks/#l5d-version-latest for hints
‼ cli is up-to-date
    unsupported version channel: stable-2.11.1
    see https://linkerd.io/2.11/checks/#l5d-version-cli for hints

control-plane-version
---------------------
√ can retrieve the control plane version
‼ control plane is up-to-date
    unsupported version channel: stable-2.11.1
    see https://linkerd.io/2.11/checks/#l5d-version-control for hints
√ control plane and cli versions match

linkerd-control-plane-proxy
---------------------------
√ control plane proxies are healthy
‼ control plane proxies are up-to-date
    some proxies are not running the current version:
	* linkerd-destination-7d9d7865ff-8kkzh (stable-2.11.1)
	* linkerd-identity-5f8f46575-fdzjb (stable-2.11.1)
	* linkerd-proxy-injector-56fd45796f-8m7cx (stable-2.11.1)
    see https://linkerd.io/2.11/checks/#l5d-cp-proxy-version for hints
√ control plane proxies and cli versions match

Status check results are √

Linkerd extensions checks
=========================

linkerd-viz
-----------
√ linkerd-viz Namespace exists
√ linkerd-viz ClusterRoles exist
√ linkerd-viz ClusterRoleBindings exist
√ tap API server has valid cert
√ tap API server cert is valid for at least 60 days
√ tap API service is running
‼ linkerd-viz pods are injected
    could not find proxy container for prometheus-86bdfbd9d6-z55qz pod
    see https://linkerd.io/2.11/checks/#l5d-viz-pods-injection for hints
‼ viz extension pods are running
    prometheus-86bdfbd9d6-24t68 status is Failed
    see https://linkerd.io/2.11/checks/#l5d-viz-pods-running for hints
× viz extension proxies are healthy
    The "linkerd-proxy" container in the "prometheus-86bdfbd9d6-24t68" pod is not ready
    see https://linkerd.io/2.11/checks/#l5d-viz-proxy-healthy for hints
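
For long `linkerd check` output like the above, a small filter can list only the checks that did not pass. This is a hedged sketch that relies purely on the text layout shown here (a category name followed by a line of dashes, and each check prefixed with `√`, `‼`, or `×`); the function name `failing_checks` and the `SAMPLE_CHECK` excerpt are hypothetical, not part of the Linkerd CLI.

```python
# Excerpt copied from the check output in this issue (illustrative input only).
SAMPLE_CHECK = """\
linkerd-viz
-----------
√ tap API service is running
‼ linkerd-viz pods are injected
    could not find proxy container for prometheus-86bdfbd9d6-z55qz pod
× viz extension proxies are healthy
    The "linkerd-proxy" container in the "prometheus-86bdfbd9d6-24t68" pod is not ready
"""


def failing_checks(check_output):
    """Return (category, marker, description) for each check not marked √."""
    results = []
    category = None
    lines = check_output.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        following = lines[i + 1].strip() if i + 1 < len(lines) else ""
        if stripped and following and set(following) == {"-"}:
            # A line underlined with dashes names a check category.
            category = stripped
        elif stripped[:1] in ("‼", "×"):
            # ‼ marks a warning and × marks a failure in the output above.
            results.append((category, stripped[0], stripped[1:].strip()))
    return results


if __name__ == "__main__":
    for category, marker, description in failing_checks(SAMPLE_CHECK):
        print(f"{marker} [{category}] {description}")
```

The indented hint lines under each check are dropped on purpose, so the result is a short list of what needs attention per category.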

Environment

  • k8s version – 1.20
  • cluster env – AKS
  • Host OS – linux
  • Linkerd Version – 2.11.1

I also see these errors in the linkerd proxy pod logs:

[ 33.007036s] WARN ThreadId(01) policy:watch{port=8080}:controller{addr=linkerd-policy.linkerd.svc.cluster.local:8090}: linkerd_app_core::control: Failed to resolve control-plane component error=no record found for name: linkerd-policy.linkerd.svc.cluster.local. type: SRV class: IN

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 4
  • Comments: 24 (7 by maintainers)

Most upvoted comments

I’m using EKS and getting this error too

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

Hey @alpeb, any updates here? We’re stuck on an upgrade.