go: net/http: ServeFile 2X slower in Go 1.21
What version of Go are you using (go version)?
$ go version
go version go1.21rc3 linux/amd64
Does this issue reproduce with the latest release?
Yes. Slower in go1.21.
What operating system and processor architecture are you using (go env)?
go env Output
$ go env
GOARCH='amd64'
GOBIN='/home/user/gocode/bin'
GOCACHE='/home/user/.cache/go-build'
GOENV='/home/user/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/user/gocode/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/user/gocode'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/user/gocode/src/github.com/golang/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/user/gocode/src/github.com/golang/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.21rc3'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD=''
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build3856449379=/tmp/go-build -gno-record-gcc-switches'
What did you do?
While testing Uber’s go monorepo against go1.21rc3, I noticed a large performance regression in logic that depends on http.ServeFile. git bisect points to this commit (https://github.com/golang/go/pull/56480) as the cause.
While we would expect performance benefits from the use of io.Copy due to its ability to upgrade to io.WriterTo or io.ReaderFrom, in the common http.ServeFile use case the ReadFrom method that is now used instead of io.CopyN is noticeably slower when serving relatively small files. Our profiling shows larger GC costs as a result of the new copying implementation.
Because the user has no real means of tinkering with the underlying writer or reader used by http.ServeFile, they can't reasonably work around this performance issue.
A reproducible benchmark test is attached below; we see roughly a 2x performance regression on http.ServeFile.
package main
import (
"bufio"
"io"
"log"
"net/http"
"net/http/httptest"
"testing"
)
func BenchmarkFileServe(b *testing.B) {
b.StopTimer()
ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
b.StartTimer()
http.ServeFile(w, r, "/tmp/example") // the contents of example are attached in the issue (336 lines json)
b.StopTimer()
}))
defer ts.Close()
for i := 0; i < b.N; i++ {
res, err := http.Get(ts.URL)
if err != nil {
log.Fatal(err)
}
// note: we see a noticeable performance regression with both allRead
// and scanRead.
err = allRead(res.Body)
// err = scanRead(res.Body)
res.Body.Close()
if err != nil {
log.Fatal(err)
}
}
}
func scanRead(r io.Reader) error {
scanner := bufio.NewScanner(r)
for scanner.Scan() { /* scan */ }
return scanner.Err()
}
func allRead(r io.Reader) error {
_, err := io.ReadAll(r)
return err
}
Contents of /tmp/example we used in the benchmark
336 line example JSON
[ { "latlng": { "lat": 0, "lng": 0 }, "region": { "name": "gopherland", "id": 1 }, "golang": { "namespace": "gopherland", "ids": [ "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers" ] } }, { "latlng": { "lat": 0, "lng": 0 }, "region": { "name": "gopherland", "id": 1 }, "golang": { "namespace": "gopherland", "ids": [ "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", 
"i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers", "i-like-gophers" ] } } ]
in go1.20.5
$ go test -bench=.
goos: linux
goarch: amd64
pkg: exp/fileserver
cpu: AMD EPYC 7B13
BenchmarkFileServe-96 17074 69007 ns/op
PASS
ok exp/fileserver 10.056s
in go1.21rc2 (go1.21rc3 test times out for me after 660s so I can’t get a clean result).
$ go test -bench=.
goos: linux
goarch: amd64
pkg: exp/fileserver
cpu: AMD EPYC 7B13
BenchmarkFileServe-96 10000 126070 ns/op
PASS
ok exp/fileserver 416.290s
benchstat
goos: linux
goarch: amd64
pkg: exp/fileserver
cpu: AMD EPYC 7B13
│ /tmp/rs0 │ /tmp/rs1 │
│ sec/op │ sec/op vs base │
FileServe-96 69.01µ ± ∞ ¹ 126.07µ ± ∞ ¹ ~ (p=1.000 n=1) ²
About this issue
- Original URL
- State: closed
- Created a year ago
- Reactions: 3
- Comments: 29 (26 by maintainers)
Commits related to this issue
- net/http: performance regression on http.ServeFile The change to io.Copy in servefile caused a exponential increase in memory allocations if the reader is a os.File. │ sec/op │ s... — committed to mauri870/go by mauri870 a year ago
- net/http: performance regression on http.ServeFile The change to io.Copy in servefile caused a exponential increase in memory allocations if the reader is a os.File. │ sec/op │ s... — committed to mauri870/go by mauri870 a year ago
- net/http: performance regression on http.ServeFile The change to io.Copy in servefile caused a exponential increase in memory allocations if the reader is a os.File. ``` │ sec/op │... — committed to mauri870/go by mauri870 a year ago
- [release-branch.go1.21] Revert "net/http: use Copy in ServeContent if CopyN not needed" This reverts CL 446276. Reason for revert: Causing surprising performance regression. Fixes #61530 Change-Id... — committed to golang/go by neild a year ago
The problem is not io.Copy but, surprisingly, sendFile.
io.CopyN uses io.Copy with size hint (io.LimitReader) https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/io/io.go#L365
io.Copy uses internal copyBuffer https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/io/io.go#L388-L390
that checks whether response writer (dst) implements io.ReaderFrom https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/io/io.go#L414-L417
which it does https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/net/http/server.go#L575-L578
It sniffs the first 512 bytes (note this for later) https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/net/http/server.go#L591-L601
and then (if the content length is set, which disables chunked encoding) uses the underlying *TCPConn.ReadFrom https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/net/http/server.go#L606-L612
TCPConn.ReadFrom then (in my case) uses sendFile https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/net/tcpsock_posix.go#L47-L55
sendFile uses the size hint (io.LimitReader) passed down by io.CopyN to adjust remain from a huge default to (file size - 512, due to sniffing) https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/net/sendfile_linux.go#L20-L33
and this huge remain in the io.Copy case is what causes the sendFile slowdown https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/net/sendfile_linux.go#L40-L44
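To make the last two steps concrete, here is a minimal sketch (a paraphrase of the behaviour, not the stdlib code; remainHint is a made-up helper) of how the presence or absence of the io.LimitedReader wrapper decides the remain value that sendFile works with:
package main

import (
	"fmt"
	"io"
	"strings"
)

// remainHint mimics how sendFile derives its byte budget from the reader it
// is handed: an *io.LimitedReader carries a hint, anything else does not.
func remainHint(r io.Reader) int64 {
	var remain int64 = 1<<63 - 1 // default: "copy until EOF"
	if lr, ok := r.(*io.LimitedReader); ok {
		remain = lr.N
	}
	return remain
}

func main() {
	src := strings.NewReader("stand-in for the *os.File served by ServeContent")

	// io.Copy hands the reader down unchanged, so no size hint arrives.
	fmt.Println(remainHint(src)) // 9223372036854775807

	// io.CopyN wraps the reader in an io.LimitedReader; 512 is just an
	// example value for "bytes left to send after sniffing".
	fmt.Println(remainHint(&io.LimitedReader{R: src, N: 512})) // 512
}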
The problem could be reproduced by the following benchmark:
With debug print
io.Copy version prints 9223372036854775807 and io.CopyN prints 512
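The benchmark and the debug patch were attached as snippets in the original comment and are not reproduced in this export. As a hedged reconstruction, a reproducer along these lines (the benchmark names, local listener setup, and /tmp/example path are my choices, not the original code) copies a file to a TCP connection once without and once with a size hint:
package main

import (
	"io"
	"net"
	"os"
	"testing"
)

// copyFileToConn sends /tmp/example over a local TCP connection on every
// iteration, either without a size hint (io.Copy) or with one (io.CopyN),
// so the two copy helpers can be compared over a real socket.
func copyFileToConn(b *testing.B, useCopyN bool) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		b.Fatal(err)
	}
	defer ln.Close()

	// Accept connections and discard whatever arrives.
	go func() {
		for {
			c, err := ln.Accept()
			if err != nil {
				return
			}
			go func(c net.Conn) {
				io.Copy(io.Discard, c)
				c.Close()
			}(c)
		}
	}()

	f, err := os.Open("/tmp/example")
	if err != nil {
		b.Fatal(err)
	}
	defer f.Close()
	fi, err := f.Stat()
	if err != nil {
		b.Fatal(err)
	}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		conn, err := net.Dial("tcp", ln.Addr().String())
		if err != nil {
			b.Fatal(err)
		}
		if _, err := f.Seek(0, io.SeekStart); err != nil {
			b.Fatal(err)
		}
		if useCopyN {
			_, err = io.CopyN(conn, f, fi.Size()) // size hint travels via io.LimitReader
		} else {
			_, err = io.Copy(conn, f) // no size hint accompanies the copy
		}
		conn.Close()
		if err != nil {
			b.Fatal(err)
		}
	}
}

func BenchmarkTCPCopy(b *testing.B)  { copyFileToConn(b, false) }
func BenchmarkTCPCopyN(b *testing.B) { copyFileToConn(b, true) }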
Fair point about CL 305229. I declined it because it seemed to me that we didn’t have the data to support doing the extra system call. I guess today I would still like to understand what is going on in the kernel to cause the slowdowns.
Thanks for the really nice analysis @AlexanderYastrebov.
I think reverting https://go.dev/cl/446276 is the right thing for 1.21. Whatever the root problem with sendfile is, it’s subtle and I don’t want to be trying to figure out the right subtle use of it this close to release.
I was also wondering why it uses sendFile instead of splice: https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/net/tcpsock_posix.go#L47-L55
and it turns out splice is not implemented for the file-to-socket case: https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/net/splice_linux.go#L27-L37
I don't know much about this low-level stuff; likely there are reasons why it is not implemented yet.
Nevertheless I tried a trivial change:
and the reproducer benchmark with io.Copy (https://github.com/golang/go/issues/61530#issuecomment-1646994568) no longer seems to suffer from the problem and shows numbers on par with io.CopyN+sendFile.
Maybe @ianlancetaylor could share some insight here.
@AlexanderYastrebov thanks for the great investigative work! I totally overlooked this case in my original PR’s benchmarks 😓
I tested CL 512235 using the benchmark from https://github.com/golang/go/issues/61530#issuecomment-1646701847:
Before CL
BenchmarkFileServe-4 8014 126864 ns/op 54669 B/op 16 allocs/op
After CL
BenchmarkFileServe-4 9562 109095 ns/op 28008 B/op 17 allocs/op
Robots seem to be having a slow day; the CL mentioned immediately above says “Fixes #61530”, and that was its intent, so I am closing.
I’ll reopen this for @neild to close after cherry-picking CL 512615 to the release branch. (Per this thread we don’t need a backport issue.)
Is there a new ticket to follow the underlying work on io.Copy since this one has been closed?
(The comments here imply there’s more work to be done, but gopherbot closed this ticket)
@mauri870 You can close the PR and it will abandon the CL.
@costela FYI