linkerd2: latest edge-21.1.3 breaks TCP mTLS in RabbitMQ
Bug Report
What is the issue?
With linkerd stable-2.9.0, meshing a RabbitMQ cluster with mTLSed endpoints works perfectly out of the box (no port annotations needed). After upgrading to edge-21.1.3, RabbitMQ breaks.
The linkerd proxy now logs: linkerd_proxy_http::detect: Could not detect protocol read=9
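For context, the kind of per-port annotation that was not needed on stable-2.9.0 looks roughly like the sketch below. The StatefulSet name `rabbitmq` and the choice of ports 4369 (epmd) and 25672 (Erlang inter-node distribution) are assumptions for illustration, not something from this report. Note that skipping these ports bypasses the proxy, and therefore mTLS, for inter-node traffic, so it is only a stop-gap rather than the behaviour expected here.

```sh
# Hypothetical stop-gap (not used in this report): take the Erlang clustering
# ports out of the proxy's path so protocol detection no longer interferes.
# Assumes RabbitMQ runs as a StatefulSet named "rabbitmq" in the default namespace.
kubectl patch statefulset rabbitmq --type merge -p '{
  "spec": {"template": {"metadata": {"annotations": {
    "config.linkerd.io/skip-inbound-ports": "4369,25672",
    "config.linkerd.io/skip-outbound-ports": "4369,25672"
  }}}}
}'
```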
How can it be reproduced?
Install edge-21.1.3, mesh a 3-node RabbitMQ cluster, and check the proxy logs (roughly as sketched below).
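Concretely, the steps look roughly like this sketch (the StatefulSet name `rabbitmq` is an assumption; any meshed 3-node RabbitMQ cluster should reproduce it):

```sh
# Install the edge-21.1.3 control plane with a CLI pinned to that release.
linkerd install | kubectl apply -f -
linkerd check

# Inject the RabbitMQ StatefulSet and wait for the pods to roll.
kubectl get statefulset rabbitmq -o yaml | linkerd inject - | kubectl apply -f -
kubectl rollout status statefulset rabbitmq

# Follow the proxy logs on one of the nodes.
kubectl logs rabbitmq-0 -c linkerd-proxy -f
```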
Logs, error output, etc
linkerd-proxy
[ 21.805559s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:48423 target.addr=10.0.100.168:4369}: linkerd_proxy_http::detect: Could not detect protocol read=9
[ 21.805598s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:48423 target.addr=10.0.100.168:4369}: linkerd_detect: Detected protocol=None elapsed=6.990899484s
[ 21.805615s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:48423 target.addr=10.0.100.168:4369}: linkerd_cache: Caching new service target=(None, Logical { orig_dst: 10.0.100.168:4369, protocol: (), profile: Some(..) })
[ 21.805718s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:48423 target.addr=10.0.100.168:4369}:tcp: linkerd_tls::client: Initiating TLS connection target.id="rabbitmq.default.serviceaccount.identity.linkerd.cluster.local"
[ 21.805735s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:48423 target.addr=10.0.100.168:4369}:tcp: linkerd_proxy_transport::connect: Connecting peer.addr=10.0.100.168:4369
[ 21.806406s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:48423 target.addr=10.0.100.168:4369}:tcp: linkerd_proxy_transport::connect: Connected local.addr=10.0.101.183:44588 keepalive=Some(10s)
[ 21.808881s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:48423 target.addr=10.0.100.168:4369}:tcp: linkerd_proxy_transport::metrics: client connection open
[ 21.809905s] INFO ThreadId(01) outbound:accept{peer.addr=10.0.101.183:48423 target.addr=10.0.100.168:4369}: linkerd_app_core::serve: Connection closed error=server: Transport endpoint is not connected (os error 107)
[ 21.809935s] DEBUG ThreadId(01) evict{target=(None, Logical { orig_dst: 10.0.100.168:4369, protocol: (), profile: Some(..) })}: linkerd_cache: Awaiting idleness
[ 21.809943s] DEBUG ThreadId(01) evict{target=Accept { orig_dst: 10.0.100.168:4369, protocol: () }}: linkerd_cache: Awaiting idleness
[ 24.578095s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{peer.addr=10.0.101.161:34884 target.addr=10.0.101.183:4191}: linkerd_tls::server: Peeked bytes from TCP stream sz=119
[ 24.578307s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{peer.addr=10.0.101.161:34884 target.addr=10.0.101.183:4191}: linkerd_app_core::serve: Connection closed
[ 24.816187s] DEBUG ThreadId(01) dst: linkerd_dns: resolve_srv name=linkerd-dst-headless.linkerd.svc.cluster.local
[ 24.819857s] DEBUG ThreadId(01) dst: linkerd_dns: addrs=[10.0.100.63:8086]
[ 24.819888s] DEBUG ThreadId(01) dst: linkerd_proxy_dns_resolve: addrs=[10.0.100.63:8086]
[ 26.810831s] DEBUG ThreadId(01) evict{target=(None, Logical { orig_dst: 10.0.100.168:4369, protocol: (), profile: Some(..) })}: linkerd_cache: Cache entry dropped
[ 28.066414s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{peer.addr=10.0.101.161:34936 target.addr=10.0.101.183:4191}: linkerd_tls::server: Peeked bytes from TCP stream sz=120
[ 28.066578s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{peer.addr=10.0.101.161:34936 target.addr=10.0.101.183:4191}: linkerd_app_core::serve: Connection closed
[ 31.489621s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:58667 target.addr=10.0.100.168:4369}: linkerd_proxy_http::detect: Could not detect protocol read=9
[ 31.489656s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:58667 target.addr=10.0.100.168:4369}: linkerd_detect: Detected protocol=None elapsed=7.000235845s
[ 31.489665s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:58667 target.addr=10.0.100.168:4369}: linkerd_cache: Caching new service target=(None, Logical { orig_dst: 10.0.100.168:4369, protocol: (), profile: Some(..) })
[ 31.489744s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:58667 target.addr=10.0.100.168:4369}:tcp: linkerd_tls::client: Initiating TLS connection target.id="rabbitmq.default.serviceaccount.identity.linkerd.cluster.local"
[ 31.489755s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:58667 target.addr=10.0.100.168:4369}:tcp: linkerd_proxy_transport::connect: Connecting peer.addr=10.0.100.168:4369
[ 31.490529s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:58667 target.addr=10.0.100.168:4369}:tcp: linkerd_proxy_transport::connect: Connected local.addr=10.0.101.183:44692 keepalive=Some(10s)
[ 31.491475s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:58667 target.addr=10.0.100.168:4369}:tcp: linkerd_proxy_transport::metrics: client connection open
[ 31.492797s] INFO ThreadId(01) outbound:accept{peer.addr=10.0.101.183:58667 target.addr=10.0.100.168:4369}: linkerd_app_core::serve: Connection closed error=server: Transport endpoint is not connected (os error 107)
[ 31.492832s] DEBUG ThreadId(01) evict{target=(None, Logical { orig_dst: 10.0.100.168:4369, protocol: (), profile: Some(..) })}: linkerd_cache: Awaiting idleness
[ 34.578075s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{peer.addr=10.0.101.161:34988 target.addr=10.0.101.183:4191}: linkerd_tls::server: Peeked bytes from TCP stream sz=119
[ 34.578404s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{peer.addr=10.0.101.161:34988 target.addr=10.0.101.183:4191}: linkerd_app_core::serve: Connection closed
[ 36.494443s] DEBUG ThreadId(01) evict{target=(None, Logical { orig_dst: 10.0.100.168:4369, protocol: (), profile: Some(..) })}: linkerd_cache: Cache entry dropped
[ 38.066389s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{peer.addr=10.0.101.161:35040 target.addr=10.0.101.183:4191}: linkerd_tls::server: Peeked bytes from TCP stream sz=120
[ 38.066579s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{peer.addr=10.0.101.161:35040 target.addr=10.0.101.183:4191}: linkerd_app_core::serve: Connection closed
[ 38.539670s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:43655 target.addr=10.0.100.168:4369}: linkerd_proxy_http::detect: Could not detect protocol read=9
[ 38.539729s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:43655 target.addr=10.0.100.168:4369}: linkerd_detect: Detected protocol=None elapsed=7.000989191s
[ 38.539739s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:43655 target.addr=10.0.100.168:4369}: linkerd_cache: Caching new service target=(None, Logical { orig_dst: 10.0.100.168:4369, protocol: (), profile: Some(..) })
[ 38.539835s] DEBUG ThreadId(01) outbound:accept{peer.addr=10.0.101.183:43655 target.addr=10.0.100.168:4369}:tcp: linkerd_tls::client: Initiating TLS connection target.id="rabbitmq.default.serviceaccount.identity.linkerd.cluster.local"
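Reading the lines above: the outbound proxy peeks at the first bytes on port 4369 (epmd, the Erlang port mapper), fails to detect HTTP after roughly 7 s, falls back to an opaque TCP connection wrapped in mesh TLS, and the connection is then torn down with "Transport endpoint is not connected (os error 107)". The DEBUG detection lines are only visible with an elevated proxy log level; a sketch of turning that on is below (the annotation is the standard linkerd one, but the exact log directive is an assumption):

```sh
# Raise the proxy log level on the RabbitMQ pods so the
# linkerd_proxy_http::detect / linkerd_detect decisions are visible.
kubectl patch statefulset rabbitmq --type merge -p '{
  "spec": {"template": {"metadata": {"annotations": {
    "config.linkerd.io/proxy-log-level": "warn,linkerd=debug"
  }}}}
}'
kubectl rollout status statefulset rabbitmq
```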
RabbitMQ
## ## RabbitMQ 3.7.3. Copyright (C) 2007-2018 Pivotal Software, Inc.
########## Licensed under the MPL. See http://www.rabbitmq.com/
###### ##
########## Logs: <stdout>
Starting broker...
2021-01-25 14:49:07.540 [info] <0.264.0>
node : rabbit@rabbitmq-2.rabbitmq.default.svc.cluster.local
home dir : /var/lib/rabbitmq
config file(s) : /etc/rabbitmq/rabbitmq.conf
cookie hash : G9xFiGrZ1fE/RXO9dM2I7Q==
log(s) : <stdout>
database dir : /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-2.rabbitmq.default.svc.cluster.local
2021-01-25 14:49:08.108 [info] <0.270.0> Memory high watermark set to 1953 MiB (2048000000 bytes) of 3072 MiB (3221225472 bytes) total
2021-01-25 14:49:08.112 [info] <0.272.0> Enabling free disk space monitoring
2021-01-25 14:49:08.112 [info] <0.272.0> Disk free limit set to 50MB
2021-01-25 14:49:08.115 [info] <0.274.0> Limiting to approx 1048476 file handles (943626 sockets)
2021-01-25 14:49:08.115 [info] <0.275.0> FHC read buffering: OFF
2021-01-25 14:49:08.115 [info] <0.275.0> FHC write buffering: ON
2021-01-25 14:49:29.137 [info] <0.264.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2021-01-25 14:50:04.950 [info] <0.33.0> Application lager started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
14:49:59.138 [warning] <0.264.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
rabbitmq 2021-01-25 14:49:59.138 [info] <0.264.0> Waiting for Mnesia tables for 30000 ms, 8 retries left
14:50:27.910 [info] <0.33.0> Application xmerl started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
rabbitmq 2021-01-25 14:50:27.911 [info] <0.33.0> Application os_mon started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:27.911 [info] <0.33.0> Application jsx started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:29.139 [warning] <0.264.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
rabbitmq 2021-01-25 14:50:29.139 [info] <0.264.0> Waiting for Mnesia tables for 30000 ms, 7 retries left
[info] <0.33.0> Application inets started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:55.958 [info] <0.33.0> Application mnesia started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:55.958 [info] <0.33.0> Application amqp10_common started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:55.958 [info] <0.33.0> Application crypto started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:55.959 [info] <0.33.0> Application cowlib started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:55.959 [info] <0.33.0> Application asn1 started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:55.959 [info] <0.33.0> Application public_key started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:55.960 [info] <0.33.0> Application ssl started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:55.967 [info] <0.33.0> Application amqp10_client started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:55.967 [info] <0.33.0> Application ranch started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:55.973 [info] <0.33.0> Application cowboy started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:55.973 [info] <0.33.0> Application ranch_proxy_protocol started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:55.973 [info] <0.33.0> Application recon started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:55.973 [info] <0.33.0> Application rabbit_common started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:55.978 [info] <0.33.0> Application amqp_client started on node 'rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2021-01-25 14:50:55.979 [info] <0.314.0>
Starting RabbitMQ 3.7.3 on Erlang 20.1.7
Copyright (C) 2007-2018 Pivotal Software, Inc.
Licensed under the MPL. See http://www.rabbitmq.com/
## ##
## ## RabbitMQ 3.7.3. Copyright (C) 2007-2018 Pivotal Software, Inc.
########## Licensed under the MPL. See http://www.rabbitmq.com/
###### ##
########## Logs: <stdout>
Starting broker...
2021-01-25 14:50:55.980 [info] <0.314.0>
node : rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local
home dir : /var/lib/rabbitmq
config file(s) : /etc/rabbitmq/rabbitmq.conf
cookie hash : G9xFiGrZ1fE/RXO9dM2I7Q==
log(s) : <stdout>
database dir : /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local
2021-01-25 14:50:56.720 [info] <0.320.0> Memory high watermark set to 1953 MiB (2048000000 bytes) of 3072 MiB (3221225472 bytes) total
2021-01-25 14:50:56.724 [info] <0.322.0> Enabling free disk space monitoring
2021-01-25 14:50:56.724 [info] <0.322.0> Disk free limit set to 50MB
2021-01-25 14:50:56.727 [info] <0.325.0> FHC read buffering: OFF
2021-01-25 14:50:56.728 [info] <0.325.0> FHC write buffering: ON
2021-01-25 14:50:56.728 [info] <0.324.0> Limiting to approx 1048476 file handles (943626 sockets)
2021-01-25 14:50:59.140 [warning] <0.264.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
2021-01-25 14:50:59.140 [info] <0.264.0> Waiting for Mnesia tables for 30000 ms, 6 retries left
2021-01-25 14:51:07.604 [warning] <0.268.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
2021-01-25 14:51:07.604 [info] <0.268.0> Waiting for Mnesia tables for 30000 ms, 8 retries left
2021-01-25 14:51:29.141 [warning] <0.264.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
2021-01-25 14:51:29.141 [info] <0.264.0> Waiting for Mnesia tables for 30000 ms, 5 retries left
2021-01-25 14:51:37.605 [warning] <0.268.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
2021-01-25 14:51:37.605 [info] <0.268.0> Waiting for Mnesia tables for 30000 ms, 7 retries left
2021-01-25 14:51:52.767 [info] <0.314.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2021-01-25 14:51:52.831 [info] <0.314.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
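The repeated timeout_waiting_for_tables warnings are the downstream symptom: each node waits on Mnesia tables held by its peers, but the inter-node distribution connections never establish through the proxy, so the cluster cannot re-form. A quick check from inside a pod (plain rabbitmqctl; the container name `rabbitmq` is an assumption):

```sh
# With the proxy interfering, running_nodes typically lists only the local node.
kubectl exec rabbitmq-0 -c rabbitmq -- rabbitmqctl cluster_status
```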
Environment
- Kubernetes Version: 1.16
- Cluster Environment (GKE, AKS, kops, …): EKS
- Linkerd version: edge-21.1.3
About this issue
- State: closed
- Created 3 years ago
- Comments: 15 (13 by maintainers)
@olix0r I have confirmed that with edge-21.2.1, RabbitMQ works out of the box again. Thank you so much!
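For anyone hitting this on an affected edge release, the upgrade path that resolved it here looks roughly like this sketch (assumes a CLI-managed control plane and a StatefulSet named `rabbitmq`):

```sh
# Upgrade the control plane with a CLI pinned to edge-21.2.1,
# then restart the workload so the pods pick up the fixed proxy.
linkerd upgrade | kubectl apply -f -
linkerd check
kubectl rollout restart statefulset rabbitmq
```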