linkerd2: Transient OpenSSL errors when Linkerd is injected (no peer certificate available) (SSLv3/TLS write client hello)
Bug Report
What is the issue?
When the Linkerd2 sidecar is injected, requests to external endpoints periodically fail with SSL errors.
- In Ruby the issue looks like: OpenSSL::SSL::SSLError: SSL_connect SYSCALL returned=5 errno=0 state=SSLv3/TLS write client hello
- In linkerd-debug there are no visible symptoms.
- In the openssl CLI the error output starts with the message: no peer certificate available
How can it be reproduced?
Start a new pod with the injected sidecar and a command that does nothing (tail -f /dev/null).
Exec into the pod and run the following script:
#!/usr/bin/env bash
echo "" > out_2
while :
do
    echo "*****" >> out_2
    echo "GET /" | openssl s_client -connect www.example.com:443 -no_tls1_1 >> out_2 # -no_tls1_1 is not required to reproduce the error
    # curl -vvv https://bing.com/ >> out_2 2>&1
    # wget -O- https://google.com >/dev/null
    ret=$?
    if [ $ret -ne 0 ]; then
        echo "!!!!!!" >> out_2
        exit
    fi
done
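For the first reproduction step above, the pod can be described with a small manifest. The following is only a sketch: the helper name `print_repro_pod`, the pod name `ssl-repro`, the container name `app`, and the image `ruby:2.7` are assumptions for illustration and are not taken from the report. It relies on the `linkerd.io/inject: enabled` annotation so that the proxy injector adds the sidecar when the pod is created.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a manifest for an idle pod that Linkerd should inject.
# The pod name, container name, and image are illustrative assumptions.
print_repro_pod() {
  local name="${1:-ssl-repro}"
  local image="${2:-ruby:2.7}"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${name}
  annotations:
    linkerd.io/inject: enabled
spec:
  containers:
  - name: app
    image: ${image}
    command: ["tail", "-f", "/dev/null"]
EOF
}

# Usage (needs cluster access, so it is left commented out):
#   print_repro_pod | kubectl apply -f -
#   kubectl exec -it ssl-repro -c app -- bash
```

Any image that ships bash and the openssl CLI should work for running the loop; the one named here is just a placeholder.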
Logs, error output, etc
Here are the output results from the script. A good result includes the server certificate chain. A bad one includes "no peer certificate available".
https://gist.github.com/KIVagant/37b87245b27810f359acb22fdfa4c13b
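To get a quick count of good and bad attempts from the file the script writes, something like the sketch below may help. The function name `summarize_attempts` is hypothetical. It assumes each attempt begins with the `*****` separator line from the script, that a successful `openssl s_client` handshake prints a `Certificate chain` line, and that a failed one prints `no peer certificate available`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize the output file produced by the repro loop.
# Assumptions: "*****" separates attempts, "Certificate chain" marks a good
# handshake, and "no peer certificate available" marks a failed one.
summarize_attempts() {
  local file="${1:-out_2}"
  local total good bad
  # grep -c prints the number of matching lines (0 when nothing matches).
  total=$(grep -c -F -x '*****' "$file")
  good=$(grep -c -F 'Certificate chain' "$file")
  bad=$(grep -c -F 'no peer certificate available' "$file")
  echo "attempts=${total} good=${good} bad=${bad}"
}

# Usage: summarize_attempts out_2
```

Because the loop exits on the first failure, a single run is expected to report at most one bad attempt; the count of good attempts before it gives a rough idea of how often the error occurs.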
When the Linkerd proxy is not injected, the error never appears.
linkerd check output
# all is green, except the version is not the latest available
‼ control plane is up-to-date
is running version 2.7.1 but the latest stable version is 2.8.1
see https://linkerd.io/checks/#l5d-version-control for hints
Environment
- Kubernetes Version:
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.9-eks-f459c0"
- Cluster Environment: EKS
- Host OS: Amazon Linux
- Linkerd version:
Client version: stable-2.7.1
Server version: stable-2.7.1
Possible solution
Additional context
We see the issue in many applications. It appears randomly in many places, hundreds of times a day.
Also, we never saw this with Linkerd v1.
About this issue
- State: closed
- Created 4 years ago
- Comments: 19 (16 by maintainers)
A quick update: this appears to be a bug in an underlying library (tokio and/or mio). I can reproduce it with code that uses tokio without Linkerd. Will investigate further and fix it upstream.
Hmm, the results with openssl/example.com are very different from those with curl/bing.
Proxy log
OpenSSL client output (one good request prior to the failure):
tshark capture
It’s like reading a Stephen King novel!