go: x/net/http2: connection-level flow control not returned if stream errors, causes server hang

What version of Go are you using (go version)?

$ go version
go version go1.14.2 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What’s the issue

A hang was reported between a gRPC client (grpc-go v1.27.0) hitting a gRPC server in one of our production environments. The client and server are both running on the same host. I captured a core dump of the client and server code to analyze with delve. I noticed that the google.golang.org/grpc/internal/transport.loopyWriter.cbuf.sendQuota was 0 in the client code, which indicates that the client’s connection-level send window had run out and was at 0. In the server’s core dump, I tracked down the corresponding http2.serverConn and noticed that it’s serverConn.inflow.n was set to 0 too. I then tracked down the two places in http2/server.go that call inflow.take and noticed what I believe is the issue in processData:

func (sc *serverConn) processData(f *DataFrame) error {
	...

	if f.Length > 0 {
		// Check whether the client has flow control quota.
		if st.inflow.available() < int32(f.Length) {
			return streamError(id, ErrCodeFlowControl)
		}
		st.inflow.take(int32(f.Length))

		if len(data) > 0 {
			wrote, err := st.body.Write(data)
			if err != nil {
				return streamError(id, ErrCodeStreamClosed)
			}
			if wrote != len(data) {
				panic("internal error: bad Writer")
			}
			st.bodyBytes += int64(len(data))
		}

		// Return any padded flow control now, since we won't
		// refund it later on body reads.
		if pad := int32(f.Length) - int32(len(data)); pad > 0 {
			sc.sendWindowUpdate32(nil, pad)
			sc.sendWindowUpdate32(st, pad)
		}
	}
...

In this code, st.inflow.take is called, but if st.body.Write returns an error then the flow control is not refunded to the client since the code bails and returns a streamError (nor is it added to the st.body’s pipeBuffer since pipe.Write returns immediately if it has an error to return).

Side note: st.body.Write may return an error if st.body.Close is called. The server which had this issue is using grpc-go’s serverHandlerTransport which does, in fact, call req.Body.Close (see here). A gRPC bi-directional streaming endpoint is running between the client and server, and what i suspect is happening is the client is sending the server data over the bi-di stream while an error happens in the gRPC server that causes the request to end, and therefore req.Body.Close to be called while data is in flight.

Here’s what I think a possible fix to net/http2 could look like:

diff --git a/http2/server.go b/http2/server.go
index 01f4ecc..ba3ebd1 100644
--- a/http2/server.go
+++ b/http2/server.go
@@ -1650,6 +1650,7 @@ func (sc *serverConn) processData(f *DataFrame) error {
                if len(data) > 0 {
                        wrote, err := st.body.Write(data)
                        if err != nil {
+                               sc.sendWindowUpdate32(nil, int32(f.Length))
                                return streamError(id, ErrCodeStreamClosed)
                        }
                        if wrote != len(data) {

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 34 (4 by maintainers)

Commits related to this issue

Most upvoted comments

Change https://golang.org/cl/264058 mentions this issue: [release-branch.go1.15-bundle] net/http2: send WINDOW_UPDATE on a body's write failure

Thank you @jared2501 and @networkimprov for raising it! So this is a tricky one to backport given that it is from x/net/http2 that then needs to be bundled into net/http/h2_bundle.go, so to begin with: 1.) I have mailed out CL 258359 firstly to net/http/h2_bundle.go 2.) After that merge, I’ll make backports to Go1.14 and Go1.15 and after that merge, we’ll be good to go. However, please note that those issues are still “CherryPickCandidate” labelled, thus will only be released if approved and when the releases are being cut.

Thank you for the ping @networkimprov 😃 I’ve provided a review after Ian’s and should be ready to roll out soon, and then we’ll need to pull it into net/http/h2_bundle.go

@gopherbot Please open backport issues.

This problem can reportedly cause an HTTP/2 connection to hang. There doesn’t seem to be any reasonable workaround.

Change https://golang.org/cl/245158 mentions this issue: net/http2: send WINDOW_UPDATE on a body's write failure