linkerd2: invalid HTTP method parsed

Bug Report

What is the issue?

After upgrading Linkerd to stable-2.10.1, some proxies (not all) are logging the following lines:

{"timestamp":"[   517.977900s]","level":"INFO","fields":{"message":"Connection closed","error":"invalid HTTP method parsed"},"target":"linkerd_app_core::serve","spans":[{"name":"daemon"},{"listen.addr":"0.0.0.0:4191","name":"admin"},{"client.addr":"10.0.5.162:51408","name":"accept"}],"threadId":"ThreadId(2)"}

About 2 messages per second.

How can it be reproduced?

Not sure how to reproduce it; in my case at least 2 clusters were affected by this. I haven't upgraded the prod cluster yet.

Logs, error output, etc

I enabled debug logs as well; the output is the following:

{"timestamp":"[   517.977546s]","level":"DEBUG","fields":{"message":"Peeked bytes from TCP stream","sz":289},"target":"linkerd_tls::server","spans":[{"name":"daemon"},{"listen.addr":"0.0.0.0:4191","name":"admin"},{"client.addr":"10.0.5.162:51408","name":"accept"}],"threadId":"ThreadId(2)"}
{"timestamp":"[   517.977605s]","level":"DEBUG","fields":{"message":"Identified foreign SNI","sni":"..."},"target":"linkerd_tls::server","spans":[{"name":"daemon"},{"listen.addr":"0.0.0.0:4191","name":"admin"},{"client.addr":"10.0.5.162:51408","name":"accept"}],"threadId":"ThreadId(2)"}
{"timestamp":"[   517.977643s]","level":"DEBUG","fields":{"message":"DetectResult","protocol":"None","elapsed":"7.9µs"},"target":"linkerd_detect","spans":[{"name":"daemon"},{"listen.addr":"0.0.0.0:4191","name":"admin"},{"client.addr":"10.0.5.162:51408","name":"accept"}],"threadId":"ThreadId(2)"}
{"timestamp":"[   517.977663s]","level":"DEBUG","fields":{"message":"Failed to parse HTTP request; handling as HTTP/1"},"target":"linkerd_app::admin","spans":[{"name":"daemon"},{"listen.addr":"0.0.0.0:4191","name":"admin"},{"client.addr":"10.0.5.162:51408","name":"accept"}],"threadId":"ThreadId(2)"}
{"timestamp":"[   517.977687s]","level":"DEBUG","fields":{"message":"Creating HTTP service","version":"Http1"},"target":"linkerd_proxy_http::server","spans":[{"name":"daemon"},{"listen.addr":"0.0.0.0:4191","name":"admin"},{"client.addr":"10.0.5.162:51408","name":"accept"}],"threadId":"ThreadId(2)"}
{"timestamp":"[   517.977717s]","level":"DEBUG","fields":{"message":"Handling as HTTP","version":"Http1"},"target":"linkerd_proxy_http::server","spans":[{"name":"daemon"},{"listen.addr":"0.0.0.0:4191","name":"admin"},{"client.addr":"10.0.5.162:51408","name":"accept"}],"threadId":"ThreadId(2)"}
{"timestamp":"[   517.977831s]","level":"DEBUG","fields":{"message":"The client is shutting down the connection","res":"Err(hyper::Error(Parse(Method)))"},"target":"linkerd_proxy_http::server","spans":[{"name":"daemon"},{"listen.addr":"0.0.0.0:4191","name":"admin"},{"client.addr":"10.0.5.162:51408","name":"accept"}],"threadId":"ThreadId(2)"}
{"timestamp":"[   517.977900s]","level":"INFO","fields":{"message":"Connection closed","error":"invalid HTTP method parsed"},"target":"linkerd_app_core::serve","spans":[{"name":"daemon"},{"listen.addr":"0.0.0.0:4191","name":"admin"},{"client.addr":"10.0.5.162:51408","name":"accept"}],"threadId":"ThreadId(2)"}

The IP 10.0.5.162 belongs to the linkerd-viz Prometheus pod.
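For reference, the proxy log level can be raised per workload via the config.linkerd.io/proxy-log-level annotation on the pod template; this is only a sketch, and the directive string below is an example rather than the exact value used here:

spec:
  template:
    metadata:
      annotations:
        # Example directive: keep everything at warn, but log the linkerd
        # crates at debug.
        config.linkerd.io/proxy-log-level: "warn,linkerd=debug"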

linkerd check output

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running

linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ control plane PodSecurityPolicies exist

linkerd-identity
----------------
√ certificate config is valid
√ trust anchors are using supported crypto algorithm
√ trust anchors are within their validity period
√ trust anchors are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust anchor

linkerd-webhooks-and-apisvc-tls
-------------------------------
√ proxy-injector webhook has valid cert
√ proxy-injector cert is valid for at least 60 days
√ sp-validator webhook has valid cert
√ sp-validator cert is valid for at least 60 days

linkerd-api
-----------
√ control plane pods are ready
√ can initialize the client
√ can query the control plane API

linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

control-plane-version
---------------------
√ control plane is up-to-date
√ control plane and cli versions match

Status check results are √

Linkerd extensions checks
=========================

linkerd-viz
-----------
√ linkerd-viz Namespace exists
√ linkerd-viz ClusterRoles exist
√ linkerd-viz ClusterRoleBindings exist
√ tap API server has valid cert
√ tap API server cert is valid for at least 60 days
√ tap API service is running
√ linkerd-viz pods are injected
√ viz extension pods are running
√ prometheus is installed and configured correctly
√ can initialize the client
√ viz extension self-check

Status check results are √

Environment

  • Kubernetes Version: v1.18.9-eks-d1db3c
  • Cluster Environment: AWS EKS
  • Host OS: Ubuntu 20.04.1 LTS kernel: 5.4.0-1029-aws
  • Linkerd version: 2.10.1

Possible solution

Additional context

I understand this is harmless when looking at the debug log, but it will create a lot of additional log volume and affect the logging collectors, storage, etc.

When I disable the linkerd-proxy for the linkerd-viz Prometheus pods, there are no such messages; the same happens when I add the following annotation:

config.linkerd.io/skip-outbound-ports: "4191"

It looks like the proxy is somehow affecting outbound connections, which might explain this issue as well.
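For anyone applying the same workaround, this is roughly where the annotation goes; a sketch only, assuming the default linkerd-viz install (the Deployment name and namespace here are assumptions):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus          # assumed name of the viz Prometheus Deployment
  namespace: linkerd-viz
spec:
  template:
    metadata:
      annotations:
        # Tell the injected proxy not to intercept outbound traffic to port
        # 4191 (the proxy admin port that Prometheus scrapes on other pods).
        config.linkerd.io/skip-outbound-ports: "4191"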

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 20 (14 by maintainers)

Most upvoted comments

@lupusllc Thanks for adding in here. This issue should be fixed in the edge release that went out yesterday, edge-21.4.5. This will also be available in stable-2.10.2 at some point soon. I’ll wait to close this issue until that is released.

For those who have been involved in this issue, this was fixed in tower-rs/tower#581, which is a proxy dependency. Reconnects (which are the background tasks in this context) were not being aborted properly, so they would get stuck in an infinite reconnect loop.

This most often happened with Viz’s Prometheus and resulted in the errors a lot of you have seen. What would happen is that a pod at 10.0.0.1 would be scraped by Prometheus at 10.0.0.1:4191 with some SNI foo. At some point that pod is terminated, but Prometheus’s proxy keeps trying to reconnect to 10.0.0.1:4191. Eventually a new pod takes that IP, and all of a sudden Prometheus’s proxy starts sending requests to the new pod at 10.0.0.1:4191 with the foo SNI. But since this is a new pod, it may have the bar identity. This results in the error.

This issue is not specific to Prometheus’s proxy, so we’re hoping this also cleans up a lot of the fail fast errors in general.

@olix0r I deployed edge-21.4.5 in our dev env this morning (previously running stable-2.10.1), and the log messages have disappeared. We went from ~10K every 15m to under 200.

Hi, we have the same issue after upgrading to Linkerd 2.10.1 in our testing environments. Logs have many “Connection closed error=invalid HTTP method parsed” errors. Pods seem to be randomly affected, but in general the issue is reproducible when I restart some deployment’s pods a couple of times. I enabled trace-level logging for one of the deployments, and this is what I got from the logs:

[   276.292622s] TRACE ThreadId(02) linkerd_proxy_http::detect: Reading capacity=1024
[   276.292649s] TRACE ThreadId(02) linkerd_proxy_http::detect: Read sz=301
[   276.292655s] TRACE ThreadId(02) linkerd_proxy_http::detect: Checking H2 preface
[   276.292659s] TRACE ThreadId(02) linkerd_proxy_http::detect: Parsing HTTP/1 message
[   276.292664s] TRACE ThreadId(02) linkerd_proxy_http::detect: Not HTTP
[   276.292669s] DEBUG ThreadId(02) linkerd_proxy_http::server: Creating HTTP service version=Http1
[   276.292683s] DEBUG ThreadId(02) linkerd_proxy_http::server: Handling as HTTP version=Http1
[   276.292702s] TRACE ThreadId(02) hyper::proto::h1::conn: Conn::read_head
[   276.292717s] TRACE ThreadId(02) parse_headers: hyper::proto::h1::role: Request.parse([Header; 100], [u8; 301])
[   276.292729s] TRACE ThreadId(02) hyper::proto::h1::conn: State::close_read()
[   276.292734s] DEBUG ThreadId(02) hyper::proto::h1::conn: parse error (invalid HTTP method parsed) with 301 bytes
[   276.292739s] DEBUG ThreadId(02) hyper::proto::h1::role: sending automatic response (400 Bad Request) for parse error
[   276.292747s] TRACE ThreadId(02) encode_headers: hyper::proto::h1::role: Server::encode status=400, body=None, req_method=None
[   276.292786s] DEBUG ThreadId(02) hyper::proto::h1::io: flushed 84 bytes
[   276.292811s] TRACE ThreadId(02) hyper::proto::h1::conn: flushed({role=server}): State { reading: Closed, writing: Closed, keep_alive: Disabled, error: hyper::Error(Parse(Method)) }
[   276.292839s] TRACE ThreadId(02) hyper::proto::h1::conn: shut down IO complete
[   276.292884s] DEBUG ThreadId(02) linkerd_proxy_http::server: The client is shutting down the connection res=Err(hyper::Error(Parse(Method)))
[   276.292987s]  INFO ThreadId(02) linkerd_app_core::serve: Connection closed error=invalid HTTP method parsed
[   276.482025s] TRACE ThreadId(01) hyper::client::pool: idle interval checking for expired
[   276.794489s] TRACE ThreadId(02) linkerd_proxy_http::detect: Reading capacity=1024
[   276.794516s] TRACE ThreadId(02) linkerd_proxy_http::detect: Read sz=301
[   276.794520s] TRACE ThreadId(02) linkerd_proxy_http::detect: Checking H2 preface
[   276.794524s] TRACE ThreadId(02) linkerd_proxy_http::detect: Parsing HTTP/1 message
[   276.794527s] TRACE ThreadId(02) linkerd_proxy_http::detect: Not HTTP
[   276.794532s] DEBUG ThreadId(02) linkerd_proxy_http::server: Creating HTTP service version=Http1
[   276.794544s] DEBUG ThreadId(02) linkerd_proxy_http::server: Handling as HTTP version=Http1
[   276.794565s] TRACE ThreadId(02) hyper::proto::h1::conn: Conn::read_head
[   276.794579s] TRACE ThreadId(02) parse_headers: hyper::proto::h1::role: Request.parse([Header; 100], [u8; 301])
[   276.794603s] TRACE ThreadId(02) hyper::proto::h1::conn: State::close_read()
[   276.794609s] DEBUG ThreadId(02) hyper::proto::h1::conn: parse error (invalid HTTP method parsed) with 301 bytes
[   276.794613s] DEBUG ThreadId(02) hyper::proto::h1::role: sending automatic response (400 Bad Request) for parse error
[   276.794621s] TRACE ThreadId(02) encode_headers: hyper::proto::h1::role: Server::encode status=400, body=None, req_method=None
[   276.794672s] DEBUG ThreadId(02) hyper::proto::h1::io: flushed 84 bytes
[   276.794686s] TRACE ThreadId(02) hyper::proto::h1::conn: flushed({role=server}): State { reading: Closed, writing: Closed, keep_alive: Disabled, error: hyper::Error(Parse(Method)) }
[   276.794718s] TRACE ThreadId(02) hyper::proto::h1::conn: shut down IO complete
[   276.794736s] DEBUG ThreadId(02) linkerd_proxy_http::server: The client is shutting down the connection res=Err(hyper::Error(Parse(Method)))
[   276.794780s]  INFO ThreadId(02) linkerd_app_core::serve: Connection closed error=invalid HTTP method parsed

I hope it helps with troubleshooting, thanks.