envoy: Envoy HTTP2 Connection sends RST_STREAM before initial HEADERS which causes upstream server to close connection with PROTOCOL_ERROR

Description

Preface: we’re using Envoy as an L7 proxy with an HTTP2 upstream. We’ve noticed that Envoy sometimes sends a RST_STREAM before actually sending the HEADERS frame that moves the stream from idle to open.

The following is a pcap of a single TCP connection between Envoy and the upstream server. The upstream server is listening on port 8080, while Envoy’s side of the connection uses ephemeral port 63692.

Pcap file

image

At #243 in the capture, we can see Envoy start sending RST_STREAM frames for streams on which no HEADERS or PUSH_PROMISE frame has been sent. Additionally, some of these stream resets are repeated (e.g. stream IDs 27 and 23 above). Because of this, the server closes the connection with a PROTOCOL_ERROR, as required by the HTTP2 spec.

Excerpt:

RST_STREAM frames MUST NOT be sent for a stream in the “idle” state. If a RST_STREAM frame identifying an idle stream is received, the recipient MUST treat this as a connection error (Section 5.4.1) of type PROTOCOL_ERROR.
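The rule quoted above can be sketched as a receiver-side check. This is only a minimal illustration, not the Go server's or Envoy's actual code: the names serverConn, onHeaders, and onRSTStream are hypothetical, and the state model is reduced to idle, open, and closed.

```go
package main

import (
	"errors"
	"fmt"
)

// streamState is a reduced set of HTTP/2 stream states; half-closed and
// reserved states are left out of this illustration.
type streamState int

const (
	stateIdle streamState = iota
	stateOpen
	stateClosed
)

// errProtocol stands in for a connection error of type PROTOCOL_ERROR.
var errProtocol = errors.New("connection error: PROTOCOL_ERROR")

// serverConn (hypothetical) tracks the streams a receiving peer knows about.
type serverConn struct {
	open        map[uint32]bool
	maxStreamID uint32 // highest stream ID the peer has opened so far
}

func newServerConn() *serverConn {
	return &serverConn{open: make(map[uint32]bool)}
}

// state reports how the receiver currently views a stream ID.
func (c *serverConn) state(id uint32) streamState {
	switch {
	case c.open[id]:
		return stateOpen
	case id <= c.maxStreamID:
		return stateClosed // opened earlier, or implicitly closed by a higher ID
	default:
		return stateIdle // never seen: no HEADERS has arrived for it
	}
}

// onHeaders opens a new stream; stream IDs must strictly increase.
func (c *serverConn) onHeaders(id uint32) error {
	if id <= c.maxStreamID {
		return errProtocol
	}
	c.maxStreamID = id
	c.open[id] = true
	return nil
}

// onRSTStream applies the quoted rule: a reset for an idle stream is a
// connection error, while a reset for a known stream just closes it.
func (c *serverConn) onRSTStream(id uint32) error {
	if c.state(id) == stateIdle {
		return errProtocol
	}
	delete(c.open, id)
	return nil
}

func main() {
	c := newServerConn()
	fmt.Println("HEADERS 1:", c.onHeaders(1))
	fmt.Println("RST_STREAM 1:", c.onRSTStream(1))
	fmt.Println("RST_STREAM 27:", c.onRSTStream(27))
}
```

In this sketch, the reset for stream 1 is accepted because HEADERS arrived first, while the reset for stream 27 hits an idle stream and returns the connection-level error.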

Because the server closes the connection, Envoy drops the in-flight requests on the same HTTP2 connection with a 503 UC. After reading this code snippet, we believe the cause is the way Envoy does deferred resets. In other words, from Envoy’s perspective the stream exists, but since its first frame hasn’t made it to the server yet, the server considers the stream idle and actively closes the connection.
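To make the suspected ordering problem concrete, here is a hypothetical sender-side sketch. It is not Envoy's implementation (Envoy is written in C++, and its codec is more involved); clientConn, resetNaive, and resetGuarded are names invented for illustration. It assumes the sender tracks whether HEADERS has actually been written for a stream.

```go
package main

import "fmt"

// frame is a simplified record of what reaches the wire.
type frame struct {
	kind     string
	streamID uint32
}

// clientStream (hypothetical) is the proxy's local view of one request.
type clientStream struct {
	id          uint32
	headersSent bool // true once HEADERS has actually been written
}

// clientConn (hypothetical) records frames in the order they are written.
type clientConn struct {
	wire []frame
}

// sendHeaders writes HEADERS, which opens the stream on the peer.
func (c *clientConn) sendHeaders(s *clientStream) {
	c.wire = append(c.wire, frame{"HEADERS", s.id})
	s.headersSent = true
}

// resetNaive writes RST_STREAM unconditionally. If HEADERS was never
// written, the peer sees a reset for a stream it still considers idle.
func (c *clientConn) resetNaive(s *clientStream) {
	c.wire = append(c.wire, frame{"RST_STREAM", s.id})
}

// resetGuarded writes RST_STREAM only when the peer can know the stream;
// otherwise the stream is dropped locally and nothing is sent.
func (c *clientConn) resetGuarded(s *clientStream) {
	if !s.headersSent {
		return
	}
	c.wire = append(c.wire, frame{"RST_STREAM", s.id})
}

func main() {
	// A request is cancelled (for example by a timeout) before HEADERS is written.
	naive := &clientConn{}
	naive.resetNaive(&clientStream{id: 27})
	fmt.Println("naive wire:", naive.wire)

	guarded := &clientConn{}
	guarded.resetGuarded(&clientStream{id: 27})
	fmt.Println("guarded wire:", guarded.wire)
}
```

The naive path reproduces the pattern seen in the capture: a RST_STREAM for a stream ID the server has never seen. The guarded path is one possible way to avoid it; whether it matches the actual fix in Envoy is not established here.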

Repro Steps

A contrived repro needs only a single upstream Go service, a load tester, and Envoy.

Version details:

  • Go 1.14.2
  • envoyproxy/envoy:v1.14.1 docker image
  • Hey - 0.1.3

Configuration:

Steps:

  1. Start up a simple HTTP2 server with some delay in the response
  2. Start envoy with given configuration
  3. Run a hey loadtest - hey -n 50 -c 50 -h2 -t 3 http://localhost:10000
    • You may have to tune the parameters a little, but we were able to reliably trigger an error message on the Go server side: http2: server connection error from 127.0.0.1:65100: connection error: PROTOCOL_ERROR
    • This error comes from receiving a RST_STREAM on an idle stream
    • Any request on the same TCP connection gets dropped due to a TCP FIN initiated by the Go server

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 20 (9 by maintainers)

Most upvoted comments

Here’s a docker-compose version of the envoy example:

envoy_example.zip

If you run this along with something like hey or vegeta, you’ll reproduce the problem in Envoy version 1.17 and below. It seems to have been fixed in 1.18.

Sample Hey Command:

hey -n 50 -c 50 -h2 -t 3 http://localhost:10000

Sample Vegeta Command:

echo "GET http://localhost:10000" | vegeta attack -duration=15s | tee results.bin | vegeta report

It would be good to know if this still happens. I know of one case where Envoy can send RST_STREAM for a stream where it hasn’t sent HEADERS but I think it can only happen before this change: https://github.com/envoyproxy/envoy/pull/14820

Since we haven’t been able to create a self-contained docker-compose repro,

Tan’s example in the initial post worked for me when I reproduced it. I may have made a few trivial changes to the Envoy and Go versions.

  • HTTP2 Server: https://gist.github.com/tan-stripe/4ea1e8f3dfbb3af6210f655dc2883647
  • envoy.yaml: https://gist.github.com/tan-stripe/8965f551d7ffe68cdb4c4d738d143a63
  • Dockerfile: https://gist.github.com/tan-stripe/d6bfb04c5ad8c044442b0aefa07d898b