go: runtime: frequent enlisting of short-lived background workers leads to performance regression with async preemption

What version of Go are you using (go version)?

λ go version
go version go1.14 windows/amd64

λ go version
go version go1.13.8 windows/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
λ go env
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\klaus\AppData\Local\go-build
set GOENV=C:\Users\klaus\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=e:\gopath
set GOPRIVATE=
set GOPROXY=https://goproxy.io
set GOROOT=c:\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=d:\temp\wintemp\go-build155272862=/tmp/go-build -gno-record-gcc-switches

What did you do?

Run benchmark: https://play.golang.org/p/WeuJg6yaOuJ

Tested with: go test -bench=. -test.benchtime=10s

What did you expect to see?

Equal or similar benchmark speeds on both Go versions.

What did you see instead?

A roughly 40% performance regression (higher ns/op) on go1.14 compared with go1.13.8.

λ benchcmp go113.txt go114.txt
benchmark                                       old ns/op     new ns/op     delta
BenchmarkCompressAllocationsSingle/flate-32     87026         121741        +39.89%
BenchmarkCompressAllocationsSingle/gzip-32      88654         122632        +38.33%

This is not a purely theoretical benchmark. While suboptimal, this is the easiest way to compress a piece of data, so this will be seen in the wild. It could also indicate a general regression for applications allocating a lot.

Edit: This is not related to changes in the referenced packages. I see the same regression when using packages outside the stdlib as well.

About this issue

  • State: open
  • Created 4 years ago
  • Reactions: 9
  • Comments: 40 (36 by maintainers)

Most upvoted comments

@mknyszek Well, I think someone should review https://golang.org/cl/216198, which seems to already do what you want.

Unfortunately, CL 223797 still has some lock ordering issues, so we’ve decided it’s safer to bump this to 1.17.

Change https://golang.org/cl/223797 mentions this issue: runtime: prefer to wake an idle P when enlisting bg mark workers

@klauspost That’s true, thanks for pointing it out. I’ll fix it again.

@leitzler Looking at the profiles… it looks like it’s exactly the same issue. runtime.tgkill is suddenly at the top of the profile and it comes from signalM, which in turn comes from preemptM. Then it follows the same path up to enlistWorker.

Yeah, I think this is consistent with our previous analysis. There is no contention if there is just one thread. With more threads, more of them try to preempt each other, so the contention gets heavier.