go: net/http: ServeFile 2X slower in Go 1.21

What version of Go are you using (go version)?

$ go version
go version go1.21rc3 linux/amd64

Does this issue reproduce with the latest release?

Yes. Slower in go1.21.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH='amd64'
GOBIN='/home/user/gocode/bin'
GOCACHE='/home/user/.cache/go-build'
GOENV='/home/user/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/user/gocode/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/user/gocode'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/user/gocode/src/github.com/golang/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/user/gocode/src/github.com/golang/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.21rc3'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD=''
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build3856449379=/tmp/go-build -gno-record-gcc-switches'

What did you do?

While testing Uber’s Go monorepo against go1.21rc3, I noticed a large performance regression in logic that depends on http.ServeFile. git bisect points to this change (https://github.com/golang/go/pull/56480) as the cause.

While we would expect performance benefits from the use of io.Copy, given its ability to upgrade to io.WriterTo or io.ReaderFrom, in the case of http.ServeFile the ReadFrom method now used in place of io.CopyN is noticeably slower for the common case of serving relatively small files. Our profiling shows higher GC costs resulting from the new copying implementation.
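
For context, io.Copy’s interface upgrades work roughly like this (a simplified sketch of the stdlib dispatch order, not the exact implementation; the types in the comments are just familiar examples):

package sketch

import "io"

// copySketch mirrors io.Copy's fast-path dispatch: interface upgrades
// are tried before falling back to the generic buffered loop.
func copySketch(dst io.Writer, src io.Reader) (int64, error) {
	if wt, ok := src.(io.WriterTo); ok {
		return wt.WriteTo(dst) // e.g. src is a *bytes.Reader
	}
	if rf, ok := dst.(io.ReaderFrom); ok {
		return rf.ReadFrom(src) // e.g. dst wraps a *net.TCPConn
	}
	buf := make([]byte, 32*1024)
	var written int64
	for {
		n, rerr := src.Read(buf)
		if n > 0 {
			m, werr := dst.Write(buf[:n])
			written += int64(m)
			if werr != nil {
				return written, werr
			}
		}
		if rerr == io.EOF {
			return written, nil
		}
		if rerr != nil {
			return written, rerr
		}
	}
}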

Because the user has no real means of tinkering with the underlying writer or reader used by http.ServeFile, they can’t reasonably work around this performance issue.
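
(The closest thing to an escape hatch is to bypass http.ServeFile entirely and copy with io.CopyN, along the lines of the hypothetical serveWholeFile sketch below, but that forfeits ServeFile’s Range, conditional-request, and Content-Type handling, so it isn’t a reasonable substitute.)

package sketch

import (
	"io"
	"net/http"
	"os"
	"strconv"
)

// serveWholeFile is a hypothetical stand-in for http.ServeFile that
// forces the fast io.CopyN path. It implements none of ServeFile's
// extras (ranges, conditional requests, content-type sniffing).
func serveWholeFile(w http.ResponseWriter, path string) {
	f, err := os.Open(path)
	if err != nil {
		http.Error(w, "not found", http.StatusNotFound)
		return
	}
	defer f.Close()

	fi, err := f.Stat()
	if err != nil {
		http.Error(w, "stat failed", http.StatusInternalServerError)
		return
	}
	// An explicit Content-Length disables chunked encoding, so the
	// ResponseWriter can hand the copy off to the kernel.
	w.Header().Set("Content-Length", strconv.FormatInt(fi.Size(), 10))
	io.CopyN(w, f, fi.Size())
}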

A reproducible benchmark is attached below; we see roughly a 2x performance regression in http.ServeFile.

package main

import (
	"bufio"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"testing"
)

func BenchmarkFileServe(b *testing.B) {
	b.StopTimer()
	ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		b.StartTimer()
		http.ServeFile(w, r, "/tmp/example") // the contents of example are attached in the issue (336 lines json)
		b.StopTimer()
	}))
	defer ts.Close()

	for i := 0; i < b.N; i++ {
		res, err := http.Get(ts.URL)
		if err != nil {
			log.Fatal(err)
		}
		// note: we see a noticeable performance regression with both allRead
		// and scanRead.
		err = allRead(res.Body)
		// err = scanRead(res.Body)
		res.Body.Close()
		if err != nil {
			log.Fatal(err)
		}
	}
}

func scanRead(r io.Reader) error {
	scanner := bufio.NewScanner(r)
	for scanner.Scan() { /* scan */ }
	return scanner.Err()
}

func allRead(r io.Reader) error {
	_, err := io.ReadAll(r)
	return err
}
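
(To reproduce, save the JSON below as /tmp/example and run go test -bench=. as shown in the results further down.)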

Contents of /tmp/example used in the benchmark (the 336-line example JSON):
[
  {
    "latlng": {
      "lat": 0,
      "lng": 0
    },
    "region": {
      "name": "gopherland",
      "id": 1
    },
    "golang": {
      "namespace": "gopherland",
      "ids": [
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers"
      ]
    }
  },
  {
    "latlng": {
      "lat": 0,
      "lng": 0
    },
    "region": {
      "name": "gopherland",
      "id": 1
    },
    "golang": {
      "namespace": "gopherland",
      "ids": [
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers"
      ]
    }
  }
]

In go1.20.5:

$ go test -bench=.
goos: linux
goarch: amd64
pkg: exp/fileserver
cpu: AMD EPYC 7B13
BenchmarkFileServe-96              17074             69007 ns/op
PASS
ok      exp/fileserver       10.056s

In go1.21rc2 (the go1.21rc3 test times out for me after 660s, so I can’t get a clean result):

$ go test -bench=.
goos: linux
goarch: amd64
pkg: exp/fileserver
cpu: AMD EPYC 7B13
BenchmarkFileServe-96              10000            126070 ns/op
PASS
ok      exp/fileserver       416.290s

benchstat

goos: linux
goarch: amd64
pkg: exp/fileserver
cpu: AMD EPYC 7B13
             │   /tmp/rs0   │             /tmp/rs1             │
             │    sec/op    │    sec/op      vs base           │
FileServe-96   69.01µ ± ∞ ¹   126.07µ ± ∞ ¹  ~ (p=1.000 n=1) ²

Most upvoted comments

The problem is not io.Copy but, surprisingly, sendFile.

io.CopyN uses io.Copy with a size hint (io.LimitReader): https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/io/io.go#L365

io.Copy uses the internal copyBuffer: https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/io/io.go#L388-L390

which checks whether the response writer (dst) implements io.ReaderFrom: https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/io/io.go#L414-L417

which it does: https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/net/http/server.go#L575-L578

ReadFrom sniffs the first 512 bytes (note this for later): https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/net/http/server.go#L591-L601

and then (if Content-Length is set, which disables chunked encoding) uses the underlying *TCPConn.ReadFrom: https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/net/http/server.go#L606-L612

TCPConn.ReadFrom then (in my case) uses sendFile: https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/net/tcpsock_posix.go#L47-L55

sendFile uses the size hint (io.LimitReader) passed by io.CopyN to shrink remain from a huge number down to the actual remaining size (file size - 512, due to the sniffing): https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/net/sendfile_linux.go#L20-L33

and this huge remain in the io.Copy case is what causes the sendFile slowdown: https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/net/sendfile_linux.go#L40-L44
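
In other words, the size hint travels as an *io.LimitedReader that sendFile unwraps. A minimal sketch of what io.CopyN does (simplified from the stdlib source linked above):

package sketch

import "io"

// copyNSketch shows how io.CopyN conveys the size hint: it wraps src
// in an io.LimitedReader, which net.sendFile later unwraps to learn
// how many bytes remain. Plain io.Copy adds no wrapper, so sendFile
// sees remain == 1<<63-1 (math.MaxInt64).
func copyNSketch(dst io.Writer, src io.Reader, n int64) (written int64, err error) {
	written, err = io.Copy(dst, io.LimitReader(src, n))
	if written == n {
		return n, nil
	}
	if written < n && err == nil {
		// src stopped before delivering n bytes: it must have hit EOF.
		err = io.EOF
	}
	return written, err
}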

The problem can be reproduced with the following benchmark:

package go61530

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"os"
	"os/exec"
	"strconv"
	"testing"
)

// A benchmark for profiling the server without the HTTP client code.
// The client code runs in a subprocess.
//
// For use like:
//
//	$ go test ./*.go -c && ./go61530.test -test.run=NONE -test.bench=BenchmarkServerIOCopy$ -test.count=10
func BenchmarkServerIOCopy(b *testing.B) {
	b.ReportAllocs()
	b.StopTimer()

	const size = 1024

	// Child process mode:
	if url := os.Getenv("TEST_BENCH_SERVER_URL"); url != "" {
		n, err := strconv.Atoi(os.Getenv("TEST_BENCH_CLIENT_N"))
		if err != nil {
			panic(err)
		}
		for i := 0; i < n; i++ {
			res, err := http.Get(url)
			if err != nil {
				log.Panicf("Get: %v", err)
			}
			if res.StatusCode != 200 {
				log.Panicf("Status: %d", res.StatusCode)
			}
			n, err := io.Copy(io.Discard, res.Body)
			if err != nil {
				log.Panicf("Copy: %v", err)
			}
			if n != size {
				log.Panicf("Wrong size: %d", n)
			}
			res.Body.Close()
		}
		os.Exit(0)
		return
	}

	file := b.TempDir() + "/foo"
	err := os.WriteFile(file, bytes.Repeat([]byte{0}, size), 0666)
	if err != nil {
		b.Fatal(err)
	}

	ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Set Content-Length to disable chunked encoding and allow w.ReadFrom
		w.Header().Set("Content-Length", strconv.FormatInt(size, 10))

		f, err := os.Open(file)
		if err != nil {
			b.Fatal(err)
		}
		defer f.Close()

		//io.Copy(w, f) // slow
		io.CopyN(w, f, size) // fast
	}))
	defer ts.Close()

	cmd := exec.Command(os.Args[0], "-test.run=XXXX", "-test.bench=BenchmarkServerIOCopy$")
	cmd.Env = append([]string{
		fmt.Sprintf("TEST_BENCH_CLIENT_N=%d", b.N),
		fmt.Sprintf("TEST_BENCH_SERVER_URL=%s", ts.URL),
	}, os.Environ()...)

	b.StartTimer()
	out, err := cmd.CombinedOutput()
	if err != nil {
		b.Errorf("Test failure: %v, with output: %s", err, out)
	}
}

With a debug print:

diff --git a/src/net/sendfile_linux.go b/src/net/sendfile_linux.go
index 9a7d005803..4934621fee 100644
--- a/src/net/sendfile_linux.go
+++ b/src/net/sendfile_linux.go
@@ -39,6 +39,7 @@ func sendFile(c *netFD, r io.Reader) (written int64, err error, handled bool) {
 
        var werr error
        err = sc.Read(func(fd uintptr) bool {
+               println(remain)
                written, werr, handled = poll.SendFile(&c.pfd, int(fd), remain)
                return true
        })

The io.Copy version prints 9223372036854775807 (1<<63-1, i.e. math.MaxInt64) and the io.CopyN version prints 512.

Fair point about CL 305229. I declined it because it seemed to me that we didn’t have the data to support doing the extra system call. I guess today I would still like to understand what is going on in the kernel to cause the slowdowns.

Thanks for the really nice analysis @AlexanderYastrebov.

I think reverting https://go.dev/cl/446276 is the right thing for 1.21. Whatever the root problem with sendfile is, it’s subtle and I don’t want to be trying to figure out the right subtle use of it this close to release.

I was also wondering why it uses sendFile instead of splice: https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/net/tcpsock_posix.go#L47-L55

and it turns out splice is not implemented for the file-to-socket case: https://github.com/golang/go/blob/2eca0b1e1663d826893b6b1fd8bd89da98e65d1e/src/net/splice_linux.go#L27-L37

I don’t know much about this low-level stuff; there are likely reasons why it is not implemented yet (perhaps because splice(2) requires one end of the transfer to be a pipe, so a file-to-socket copy would have to be staged through an internal pipe pair).

Nevertheless I tried a trivial change:

diff --git a/src/net/splice_linux.go b/src/net/splice_linux.go
index ab2ab70b28..819932374b 100644
--- a/src/net/splice_linux.go
+++ b/src/net/splice_linux.go
@@ -7,6 +7,7 @@ package net
 import (
        "internal/poll"
        "io"
+       "os"
 )
 
 // splice transfers data from r to c using the splice system call to minimize
@@ -24,19 +25,21 @@ func splice(c *netFD, r io.Reader) (written int64, err error, handled bool) {
                }
        }
 
-       var s *netFD
+       var spfd *poll.FD
        if tc, ok := r.(*TCPConn); ok {
-               s = tc.fd
+               spfd = &tc.fd.pfd
        } else if uc, ok := r.(*UnixConn); ok {
                if uc.fd.net != "unix" {
                        return 0, nil, false
                }
-               s = uc.fd
+               spfd = &uc.fd.pfd
+       } else if f, ok := r.(*os.File); ok {
+               spfd = f.PollFD()
        } else {
                return 0, nil, false
        }
 
-       written, handled, sc, err := poll.Splice(&c.pfd, &s.pfd, remain)
+       written, handled, sc, err := poll.Splice(&c.pfd, spfd, remain)
        if lr != nil {
                lr.N -= written
        }
diff --git a/src/os/file_unix.go b/src/os/file_unix.go
index 533a48404b..9e7440f218 100644
--- a/src/os/file_unix.go
+++ b/src/os/file_unix.go
@@ -93,6 +93,10 @@ func (f *File) Fd() uintptr {
        return uintptr(f.pfd.Sysfd)
 }
 
+func (f *File) PollFD() *poll.FD {
+       return &f.pfd
+}
+

and the reproducer benchmark with io.Copy (https://github.com/golang/go/issues/61530#issuecomment-1646994568) no longer seems to suffer from the problem; it shows numbers on par with io.CopyN+sendFile.

Maybe @ianlancetaylor could share some insight here.

@AlexanderYastrebov thanks for the great investigative work! I totally overlooked this case in my original PR’s benchmarks 😓

I tested CL 512235 against the benchmark from https://github.com/golang/go/issues/61530#issuecomment-1646701847

Before the CL:

BenchmarkFileServe-4 8014 126864 ns/op 54669 B/op 16 allocs/op

After the CL:

BenchmarkFileServe-4 9562 109095 ns/op 28008 B/op 17 allocs/op

Robots seem to be having a slow day: the CL mentioned immediately above says “Fixes #61530”, and that was its intent, so I am closing.

I’ll reopen this for @neild to close after cherry-picking CL 512615 to the release branch. (Per this thread we don’t need a backport issue.)

Is there a new ticket for following the underlying io.Copy work, since this one has been closed?

(The comments here imply there’s more work to be done, but gopherbot closed this ticket.)

@mauri870 You can close the PR and it will abandon the CL.