linkerd2: Received http2 header with status: 502
Bug Report
What is the issue?
Our application shows a couple of errors that seem to be related to linkerd-proxy (without linkerd injection these errors don’t happen):
"code": 4,
"metadata": {
"_internal_repr": {}
},
"details": "Deadline Exceeded"
}
followed by
err: {
  "code": 1,
  "metadata": {
    "_internal_repr": {
      ":status": [
        "502"
      ],
      "content-length": [
        "0"
      ],
      "date": [
        "Tue, 11 Jun 2019 13:01:30 GMT"
      ]
    }
  },
  "details": "Received http2 header with status: 502"
}
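For context, here is a minimal sketch of how these two errors surface on our calling side. It assumes a Node.js caller using @grpc/grpc-js; serviceBClient and its writeRecord method are placeholder names, not our real generated stub. Code 4 is DEADLINE_EXCEEDED and code 1 is CANCELLED in the standard gRPC status table.

import * as grpc from "@grpc/grpc-js";

// Hypothetical sketch only: a deadline is set on the call; if it expires the
// client reports code 4 (the first error above), and the cancelled call that
// carries the proxy's 502 shows up as code 1 (the second error above).
function callServiceB(serviceBClient: any, payload: object): void {
  const deadline = new Date(Date.now() + 5000); // 5s client-side deadline

  serviceBClient.writeRecord(payload, { deadline }, (err: grpc.ServiceError | null) => {
    if (!err) {
      return; // normal path: service B answered in time
    }
    if (err.code === grpc.status.DEADLINE_EXCEEDED) {
      // "Deadline Exceeded" (code 4)
      console.error("deadline exceeded:", err.details);
    } else if (err.code === grpc.status.CANCELLED) {
      // "Received http2 header with status: 502" (code 1); in our dumps the
      // proxy's :status header is visible in the error metadata.
      console.error("cancelled:", err.details, err.metadata.getMap());
    } else {
      console.error("unexpected gRPC error:", err.code, err.details);
    }
  });
}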
On the linkerd-proxy pods we can see the following errors:
DBUG [ 388.710739s] proxy={client=in dst=10.24.26.128:9090 proto=Http2} linkerd2_proxy::proxy::http::h2 http2 conn error: http2 error: protocol error: unspecific protocol error detected
ERR! [ 388.710816s] proxy={server=in listen=0.0.0.0:4143 remote=10.24.23.122:35662} linkerd2_proxy::app::errors unexpected error: http2 error: protocol error: unspecific protocol error detected
Based on the logs it seems like service A sends a request to service B via linkerd-proxy, and service B processes the request successfully (both service B's logs and service B's linkerd-proxy show no errors, and data is written further downstream in the database). However, this is where the issue is reported: instead of reporting the request as successful, the linkerd-proxy of service A shows the linkerd2_proxy::app::errors unexpected error: http2 error: protocol error: unspecific protocol error detected error, and the service A logs show the application errors above.
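To make that picture concrete, this is roughly what the service B side of the call looks like in our setup; the handler and helper names below are placeholders rather than our real code. It completes cleanly and returns OK, which matches what we see in service B's logs and its proxy, so the failure appears to be introduced on the return leg through service A's linkerd-proxy.

import * as grpc from "@grpc/grpc-js";

// Placeholder for our real downstream write; in production this is the
// database call that we can confirm succeeds even for the failing requests.
async function persistToDatabase(request: unknown): Promise<void> {
  // ... write the record ...
}

// Unary handler sketch for service B: persist the payload, then reply OK.
function writeRecord(
  call: grpc.ServerUnaryCall<any, any>,
  callback: grpc.sendUnaryData<any>,
): void {
  persistToDatabase(call.request)
    .then(() => callback(null, { ok: true }))                  // success path we observe
    .catch((err) => callback(err as grpc.ServiceError, null)); // never hit in the failing runs
}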
How can it be reproduced?
Not 100% sure - we were seeing this issue when we started using linkerd, but after we applied the fix described in https://github.com/linkerd/linkerd2/issues/2813#issuecomment-496641996 we stopped seeing it until today.
I’ve attempted upgrading linkerd with --disable-h2-upgrade and I can still see the issue.
Logs, error output, etc
linkerd check output

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API
kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version
linkerd-existence
-----------------
√ control plane namespace exists
√ controller pod is running
√ can initialize the client
√ can query the control plane API
linkerd-api
-----------
√ control plane pods are ready
√ control plane self-check
√ [kubernetes] control plane can talk to Kubernetes
√ [prometheus] control plane can talk to Prometheus
√ no invalid service profiles
linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date
control-plane-version
---------------------
√ control plane is up-to-date
√ control plane and cli versions match
Status check results are √
Environment
- Kubernetes Version: v1.14.1
- Cluster Environment: GKE v1.12.8-gke.6
- Host OS:
- Linkerd version: tested with edge-19.6.1, stable-2.3.2, and --proxy-version=fix-2863-0 (which contained the memory leak fix)
Possible solution
Not a solution, but internally we think that linkerd is adding the status field wrapped in ":" (i.e. as the :status pseudo-header seen in the dump above) and grpc doesn’t like that.
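For what it's worth, this is our reading of the standard HTTP-to-gRPC status mapping (a sketch of the spec'd client behaviour as we understand it, not of linkerd's or our client's code; the function name is ours): when a response carries a non-200 :status and no grpc-status trailer, the client synthesizes a gRPC code from the HTTP status, which would explain why a proxy-generated 502 surfaces to the application as a gRPC error even though service B never returned one.

import * as grpc from "@grpc/grpc-js";

// Sketch of the documented HTTP-to-gRPC status mapping as we understand it;
// the (older) grpc client we use may behave differently, since it reported
// code 1 (CANCELLED) in the dump above rather than 14 (UNAVAILABLE).
function grpcCodeFromHttpStatus(httpStatus: number): grpc.status {
  switch (httpStatus) {
    case 400: return grpc.status.INTERNAL;
    case 401: return grpc.status.UNAUTHENTICATED;
    case 403: return grpc.status.PERMISSION_DENIED;
    case 404: return grpc.status.UNIMPLEMENTED;
    case 429:
    case 502: // the status we receive from linkerd-proxy
    case 503:
    case 504: return grpc.status.UNAVAILABLE;
    default:  return grpc.status.UNKNOWN;
  }
}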
Additional context
There isn’t a pattern or much consistency with this - every time we run the job, the error is returned at a different point in the flow.
This seems to be a very similar issue to what is described in https://github.com/linkerd/linkerd2/issues/2801
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 21 (12 by maintainers)
@siggy @seanmonstar @olix0r I am super pleased to confirm that the changes you guys have rolled out have fixed the 502 issues. I’ve solved our internal issue and have managed to test multiple times with edge-19.7.1 and once with edge-19.7.2. Thank you for your assistance 🙏