cmd/compile: Go 1.19 might make generic types slower

What version of Go are you using (go version)?

$ go version
go version go1.19 linux/amd64

Does this issue reproduce with the latest release?

Yes, it reproduces with Go 1.19.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/ksco/.cache/go-build"
GOENV="/home/ksco/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/ksco/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/ksco/go"
GOPRIVATE=""
GOPROXY="https://goproxy.cn,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.19"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3292360144=/tmp/go-build -gno-record-gcc-switches"

What did you do?

playground: https://go.dev/play/p/6UxOn6pftVs

main_test.go

package main

import (
	"testing"
)

func addInt(a, b int) int {
	return a + b
}

func addString(a, b string) string {
	return a + b
}

func BenchmarkWithoutGenerics(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = addInt(1, 2)
		_ = addString("foo", "bar")
	}
}

type Addable interface {
	int | string
}

func add[T Addable](a, b T) T {
	return a + b
}

func BenchmarkWithGenerics(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = add(1, 2)
		_ = add("foo", "bar")
	}
}
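
One caveat when reading the numbers below: both benchmarks discard their results, so an inlined call can be partly optimized away. Assigning the results to package-level sinks makes the comparison more robust; a minimal sketch, where sinkInt, sinkString, and BenchmarkWithGenericsSink are hypothetical additions to the file above:

// sinkInt and sinkString keep the compiler from discarding the results
// of the (possibly inlined) calls being measured.
var sinkInt int
var sinkString string

func BenchmarkWithGenericsSink(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sinkInt = add(1, 2)
		sinkString = add("foo", "bar")
	}
}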

What did you expect to see?

The Go 1.19 benchmark results should be similar to Go 1.18’s, not significantly slower.

What did you see instead?

goversion: 1.19

goos: linux
goarch: amd64
pkg: generics/ch13
cpu: AMD Ryzen 7 5800H with Radeon Graphics         
BenchmarkWithoutGenerics
BenchmarkWithoutGenerics-16    	71445846	        16.49 ns/op
BenchmarkWithGenerics
BenchmarkWithGenerics-16       	32776173	        36.30 ns/op
PASS

goversion: 1.18

goos: linux
goarch: amd64
pkg: generics/ch13
cpu: AMD Ryzen 7 5800H with Radeon Graphics         
BenchmarkWithoutGenerics
BenchmarkWithoutGenerics-16    	64931702	        16.91 ns/op
BenchmarkWithGenerics
BenchmarkWithGenerics-16       	70497928	        17.01 ns/op
PASS

The assembly generated by 1.19 contains extra references to the generics dictionaries:

main..dict.add[int](SB), AX
main..dict.add[string](SB), AX

In 1.18 these calls are inlined directly.
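
To reproduce this observation, you can ask the compiler to print the assembly it generates while building the test binary; a minimal sketch, assuming main_test.go above is the current package:

$ go test -c -gcflags=-S .

In the 1.19 output the BenchmarkWithGenerics body references the main..dict.add[int] and main..dict.add[string] dictionary symbols, while in the 1.18 output the add calls are expanded inline. Alternatively, go tool objdump -s BenchmarkWithGenerics can disassemble the test binary that go test -c produces.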

/cc @ianlancetaylor

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Reactions: 7
  • Comments: 17 (9 by maintainers)

Most upvoted comments

@kscooo The benchmarks I posted are for what will eventually become Go 1.20. We’re not going to make any changes to Go 1.19, sorry.

@ianlancetaylor I would like to know the answers to these questions, first to educate myself and then so I can educate others whom I have seen asking the same questions.

  1. Is there a maximum compilation time imposed on the compiler team, either by Google, the community, or anyone else with the power to say, “Go cannot take more than X (milli)seconds to compile”?
  2. Wouldn’t it be great to sacrifice some compilation time for much better execution time?
  3. Would it be possible to separate dev builds from production builds, and offer a much more aggressive (and slower) optimizing compiler that gives us much more performance? I am sure that if the community were surveyed, people would prefer a slower production build that yields better performance. We don’t build for production every hour, and many of us already wait many minutes for a CI build to finish before we can release anything. I don’t think a slower Go compiler would hurt anything; if anything, it would only bring benefits. This seems entirely doable, because so far you have managed to build a very fast language that compiles at a speed no other matches. Right now, compilation is faster than the time it takes us to even attempt to test anything.
  1. No.
  2. There are always tradeoffs.
  3. That is not a direction that we plan to go in.

This was caused by https://go-review.googlesource.com/c/go/+/395854. It was a deliberate change to avoid inlining in cases where we were running into bugs. I think we will leave this as is for 1.19; unified IR should fix this for 1.20.
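
For anyone who wants to check whether a particular generic call is affected, the compiler’s inlining diagnostics are a quick test; a sketch using the benchmark above (the exact message text varies between releases):

$ go test -gcflags=-m -run=NONE .

With 1.18 this should print messages along the lines of “inlining call to add” for the generic instantiations; with 1.19 those calls are not inlined and instead go through the main..dict.add[...] dictionaries, as shown in the assembly above.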

Here is an article by Vicent Martí that discusses this issue in detail: https://planetscale.com/blog/generics-can-make-your-go-code-slower

Here is a snippet of its summary: DO NOT pass an interface to a generic function, under any circumstances. Because of the way shape instantiation works for interfaces, instead of de-virtualizing, you’re adding another virtualization layer that involves a global hash table lookup for every method call. When dealing with generics in a performance-sensitive context, use only pointers instead of interfaces.
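
To make the quoted advice concrete, here is a minimal sketch; the Shape, Circle, and totalArea names are hypothetical and not from this issue. Instantiating the generic function with the concrete pointer type lets the compiler devirtualize the method call, while instantiating it with the interface type adds the extra lookup layer the article describes.

package main

// Shape and Circle are hypothetical types used only for illustration.
type Shape interface{ Area() float64 }

type Circle struct{ R float64 }

func (c *Circle) Area() float64 { return 3.14159 * c.R * c.R }

// totalArea is generic over any type that implements Shape.
func totalArea[T Shape](shapes []T) float64 {
	var sum float64
	for _, s := range shapes {
		sum += s.Area() // method call on the type parameter
	}
	return sum
}

func main() {
	// Instantiated with the concrete pointer type *Circle: the compiler
	// knows the exact method and can devirtualize the call.
	circles := []*Circle{{R: 1}, {R: 2}}
	_ = totalArea(circles)

	// Instantiated with the interface type Shape: every Area call now
	// involves an extra dynamic lookup, the pattern the article warns
	// against in performance-sensitive code.
	shapes := []Shape{&Circle{R: 1}, &Circle{R: 2}}
	_ = totalArea(shapes)
}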

I wonder if we could have a more performance-oriented Go compiler. I, and I am sure many others, would trade a few seconds of compilation time for a production build in the hope of getting a highly optimized binary. Heck, I would wait an entire minute for it. Go has the potential to match the performance Rust offers, even with the GC on, but some internal force always seems to oppose that.

The problem you mentioned warrants opening a new issue; it’s not the same problem as the one I reported. What you quoted is about the cost of dynamic method lookup, whereas this issue is simply about a union of concrete types (and, per the earlier reply, it will be fixed in 1.20).

It’s not a mystery: the internal opposing force is compilation time.

Since unified IR is expected to fix this, I’m going to assign this to @mdempsky for visibility, but feel free to unassign.