go: runtime: unaligned jumps causing performance regression on Intel
What version of Go are you using (go version)?
λ go version
go version go1.13 windows/amd64
And Go 1.14-RC1.
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
go env Output
λ go env
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\klaus\AppData\Local\go-build
set GOENV=C:\Users\klaus\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=c:\gopath
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=c:\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=c:\temp\wintemp\go-build453787042=/tmp/go-build -gno-record-gcc-switches
What did you do?
Isolated code: reproducer.zip
go test -bench=. -test.benchtime=10s
Most of the code is needed for the test setup; only (*tokens).EstimatedBits and mFastLog2 are run during the benchmark.
λ benchcmp go113.txt go114.txt
benchmark                             old ns/op     new ns/op     delta
Benchmark_tokens_EstimatedBits-12     663           716           +7.99%

benchmark                             old MB/s     new MB/s     speedup
Benchmark_tokens_EstimatedBits-12     1.51         1.40         0.93x
What did you expect to see?
Equivalent performance.
What did you see instead?
8% performance regression.
About this issue
- State: closed
- Created 4 years ago
- Comments: 20 (19 by maintainers)
Commits related to this issue
- Added klauspost https://github.com/golang/go/issues/37121 — committed to dr2chase/benchmarks by dr2chase 4 years ago
The remaining performance issue is related to unaligned jumps. We will not be making changes to our jumps before the 1.14 release, but likely will address it in a point release.
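To make "unaligned jumps" concrete: a rough sketch of the kind of check involved, assuming the relevant condition is a jump instruction that crosses or ends on a 32-byte boundary. The 32-byte window and the function name are my assumptions for illustration; neither is stated in this thread, and this is not the toolchain's actual code.

```go
package main

import "fmt"

// boundary is the assumed alignment window in bytes.
// Assumption: 32 is not taken from this thread.
const boundary = 32

// crossesOrEndsOnBoundary reports whether an instruction occupying
// [addr, addr+size) spans a boundary, or has its last byte directly
// before one. Hypothetical helper for illustration only.
func crossesOrEndsOnBoundary(addr, size uint64) bool {
	if size == 0 {
		return false
	}
	end := addr + size // address of the first byte after the instruction
	crosses := addr/boundary != (end-1)/boundary
	endsOn := end%boundary == 0
	return crosses || endsOn
}

func main() {
	fmt.Println(crossesOrEndsOnBoundary(0, 2))  // false: fully inside the first window
	fmt.Println(crossesOrEndsOnBoundary(30, 4)) // true: bytes 30..33 span offset 32
	fmt.Println(crossesOrEndsOnBoundary(30, 2)) // true: ends exactly at offset 32
	fmt.Println(crossesOrEndsOnBoundary(32, 2)) // false: starts on the boundary
}
```

Under that assumption, a small change in code layout between two releases can move a hot jump in a tight loop onto such a boundary without any change to the source.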
I’ll remove the release-blocker label, and update the milestone to 1.15. Please comment if I am mistaken.
For convenience, here’s the benchmark (without testdata and other files in .zip):
token.go (with tokens.EstimatedBits code)
/cc @aclements @randall77 @ianlancetaylor @mknyszek
Okay, now it’s closed. There’s a 0.3% slowdown (2 parts in 673), which is close enough.
Oops, no, because of a finger-fumble that is not correct. Shut down Chrome and the IDE, try again.
It might be worthwhile quickly eliminating #35881 as the source of the regression. Is your CPU listed in the description of #35881? Are you running a recent version of the microcode? If so, you could try
I get a startling regression on a Mac laptop – 70% slowdown – will look further to see if I am making some obvious mistake.