go: x/net/http2: investigate data race in Transport Request.Body reuse when GetBody is not defined
What version of Go are you using (go version)?
$ go version go version go1.10.1 linux/amd64
Does this issue reproduce with the latest release?
yes
What operating system and processor architecture are you using (go env)?
GOARCH=“amd64” GOBIN=“” GOCACHE=“/home/test/.cache/go-build” GOEXE=“” GOHOSTARCH=“amd64” GOHOSTOS=“linux” GOOS=“linux” GORACE=“” GOROOT=“/home/test/go” GOTMPDIR=“” GOTOOLDIR=“/home/test/go/pkg/tool/linux_amd64” GCCGO=“gccgo” CC=“gcc” CXX=“g++” CGO_ENABLED=“1” CGO_CFLAGS=“-g -O2” CGO_CPPFLAGS=“” CGO_CXXFLAGS=“-g -O2” CGO_FFLAGS=“-g -O2” CGO_LDFLAGS=“-g -O2” PKG_CONFIG=“pkg-config” GOGCCFLAGS=“-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build614182611=/tmp/go-build -gno-record-gcc-switches”
and nginx version: nginx/1.13.3
What did you do?
Here is my client:
package main
import (
"fmt"
"net/http"
"sync"
"time"
"errors"
"bytes"
"io/ioutil"
)
type Client struct {
urlBase string
httpClient *http.Client
}
func NewClient(url string) *Client {
cc := &Client{
urlBase: url,
httpClient: &http.Client{Transport: &http.Transport{MaxIdleConnsPerHost: 100, MaxIdleConns: 100, IdleConnTimeout: 90 * time.Second, ExpectContinueTimeout: 5 * time.Second}},
}
cc.httpClient.Timeout = 5 * time.Second
return cc
}
func (self *Client) Send(content []byte) error {
if content == nil {
return errors.New("nil content")
}
body := &bytes.Buffer{}
body.Write(content)
begin := time.Now()
req, err := http.NewRequest("POST", self.urlBase, body)
if err != nil {
return fmt.Errorf("Send %+v (from %v to %v)", err, begin, time.Now())
}
req.Header = make(http.Header)
req.Header.Set("Content-Type", "application/json")
begin = time.Now()
resp, err := self.httpClient.Do(req)
if err != nil {
return fmt.Errorf("Send %+v (from %v to %v)", err, begin, time.Now())
}
defer resp.Body.Close()
if resp.StatusCode >= http.StatusBadRequest {
bs, _ := ioutil.ReadAll(resp.Body)
return fmt.Errorf("Send : HTTP : %d , BODY : %s , REQ : %s", resp.StatusCode, string(bs), string(content))
}
bs, err := ioutil.ReadAll(resp.Body)
if err != nil {
return fmt.Errorf("Send : ERR : %v", err)
}
fmt.Println("Send : Resp : ", string(bs))
return nil
}
func main() {
client := NewClient(
"https://test.cn", // I use /etc/hosts to bind the domain
)
var content = []byte(`{"test":1}`)
var wg sync.WaitGroup
for i := 0; i < 100; i++ {
wg.Add(1)
go func() {
defer wg.Done()
client.Send(content)
}()
}
wg.Wait()
}
And here is my nginx configure of http2:
http2_max_concurrent_streams 40960;
http2_max_requests 3;
http2_max_field_size 8k;
http2_max_header_size 32k;
http2_recv_buffer_size 1024k;
http2_recv_timeout 120s;
What did you expect to see?
I expect all requests are successfully.
What did you see instead?
But I got 400 occasionally and found client prematurely closed stream: only 0 out of 10 bytes of request body received line in nginx log.
I used GODEBUG to try to figure out, and found data not sent after END_HEADERS:
2018/04/23 18:32:25 http2: Transport encoding header ":method" = "POST"
2018/04/23 18:32:25 http2: Transport encoding header ":path" = "/"
2018/04/23 18:32:25 http2: Transport encoding header ":scheme" = "https"
2018/04/23 18:32:25 http2: Transport encoding header "content-type" = "application/json"
2018/04/23 18:32:25 http2: Transport encoding header "content-length" = "10"
2018/04/23 18:32:25 http2: Transport encoding header "accept-encoding" = "gzip"
2018/04/23 18:32:25 http2: Transport encoding header "user-agent" = "Go-http-client/2.0"
2018/04/23 18:32:25 http2: Framer 0xc4205480e0: wrote HEADERS flags=END_HEADERS stream=5 len=8
2018/04/23 18:32:25 http2: Framer 0xc4205480e0: wrote DATA flags=END_STREAM stream=5 len=0 data=""
That’s the reason why nginx responsed 400 directly.
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Comments: 69 (60 by maintainers)
Commits related to this issue
- http2: use GetBody unconditionally on Transport retry, when available We were previously only using the new-ish Request.GetBody to "rewind" a Request.Body on retry when it seemed that we hadn't start... — committed to golang/net by bradfitz 6 years ago
- net/http: update bundled http2 Updates bundled x/net/http2 to git rev cffdcf67 for: http2: use GetBody unconditionally on Transport retry, when available https://golang.org/cl/123476 ht... — committed to golang/go by bradfitz 6 years ago
I can confirm that my go-issue-25009 repro still fails on go1.12 and go1.13 for some reason. Seems like issue is not fixed (or there is a regression), I will investigate it further.
@odeke-em I’ve created an example via docker-compose. Please take a look at https://github.com/ernado/go-issue-25009
Hi @ernado,
I am still seeing this issue. I have tested on both Go 1.12.9 and 1.13
It seems that others are as well in https://github.com/golang/go/issues/32441
Gentle ping for @bradfitz @ianlancetaylor Please take a look, seems like we are close to resolving that issue.
I agree that you found the bug. The fix while it works doesn’t seem right. @bradfitz : The bug is when http2 fails to obtain a connection and leaves http2/transport.go(RoundTripOpt:389), the new request is not returned to be retried from http/transport.go (roundtrip:463) instead the original one is used with a body that has already been read.
Something to consider is the way h2 is integrated with h1 in the std lib to handle upgrades (I’m not used to this setup as I mostly use h2 lib directly). In the configureTransport we set the h2 transport with a special version of the pool named noDialClientConnPool which is not able to dial on call to
GetClientConn. Instead, new connections are only added to the pool on upgrade, by promoting an existing TLS connection created by the h1 Transport.I’m not sure what happen in the case this connection is becoming invalid while the request is being sent. As seen, h2 may decide to retry, but if the pool does not contain a valid connection anymore and is not able to dial, you will get the
http2: no cached connection was availableyou are seeing. I guess the proper way to retry would be to go back to h1 and go over the full upgrade process all over again.So basically, without
GetBodyset on request, the request can’t be retried and when set, it can’t either because it can’t reconnect, giving a different error. Nginx might send400back because Go decides to not send theDATAframes after theHEADERSbecause it received aGOAWAYframe in the meantime and failed to properly cancel the stream. Note that I did not verify those assumptions, I’m just interpreting what I’m reading in the code.Thank you @ictar for the report, we’ll investigate it.
@agnivade thank you for handling this. I just wanted to add that I can relate to @ictar’s need to have nginx for a repro(many companies that I know heavily use it) and that the question of absolving the issue only because Go’s servers can’t reproduce it might be tricky. This is because many companies will usually put nginx or some other web server as their frontend to their web apps, and from there proxy the content to polyglot services on their backend e.g Go, Node.js, Python(etc) web server. Interoperability between protocols and servers matters a lot and discrepancies between Go clients talking to various servers creep in 😃