go: crypto/tls: linux/arm64 Go 1.8 performance is slow, max 12.5 MB/sec

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.8 linux/arm64

What operating system and processor architecture are you using (go env)?

96-core Cavium ThunderX, Packet type “2A” server

root@docker-build-test:~# go env
GOARCH="arm64"
GOBIN=""
GOEXE=""
GOHOSTARCH="arm64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/root"
GORACE=""
GOROOT="/usr/lib/go-1.8"
GOTOOLDIR="/usr/lib/go-1.8/pkg/tool/linux_arm64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build192910166=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"

What did you do?

root@docker-build-test:~# go test crypto/tls -bench BenchmarkThroughput

What did you expect to see?

I expected TLS performance on my 96-core ARMv8 server to be faster than on my laptop.

What did you see instead?

root@docker-build-test:~# go test crypto/tls -bench BenchmarkThroughput

BenchmarkThroughput/MaxPacket/1MB-96                  10         128291858 ns/op    8.17 MB/s
BenchmarkThroughput/MaxPacket/2MB-96                   5         211866625 ns/op    9.90 MB/s
BenchmarkThroughput/MaxPacket/4MB-96                   3         378852259 ns/op   11.07 MB/s
BenchmarkThroughput/MaxPacket/8MB-96                   2         715603298 ns/op   11.72 MB/s
BenchmarkThroughput/MaxPacket/16MB-96                  1        1387017225 ns/op   12.10 MB/s
BenchmarkThroughput/MaxPacket/32MB-96                  1        2713806130 ns/op   12.36 MB/s
BenchmarkThroughput/MaxPacket/64MB-96                  1        5402023727 ns/op   12.42 MB/s
BenchmarkThroughput/DynamicPacket/1MB-96              10         128462369 ns/op    8.16 MB/s
BenchmarkThroughput/DynamicPacket/2MB-96               5         211779553 ns/op    9.90 MB/s
BenchmarkThroughput/DynamicPacket/4MB-96               3         378591737 ns/op   11.08 MB/s
BenchmarkThroughput/DynamicPacket/8MB-96               2         711548140 ns/op   11.79 MB/s
BenchmarkThroughput/DynamicPacket/16MB-96              1        1385720232 ns/op   12.11 MB/s
BenchmarkThroughput/DynamicPacket/32MB-96              1        2711156682 ns/op   12.38 MB/s
BenchmarkThroughput/DynamicPacket/64MB-96              1        5378659024 ns/op   12.48 MB/s
PASS
ok      crypto/tls      36.894s

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 3
  • Comments: 19 (7 by maintainers)

Most upvoted comments

We plan to optimize AES for arm64 in Go 1.10. However, the AES work and the other optimizations are uncertain if our patch (CL41654), a Go syntax extension for SIMD, can’t be merged as soon as the tree opens.

@vielmetti @cherrymui @ianlancetaylor @vkrasnov

Sorry for the late response; I have been on sick leave recently. Since some of the performance issues mentioned by Cloudflare have been fixed or are being fixed, I have just created the following four issues to track the most important ones, as confirmed by Vlad from Cloudflare:

https://github.com/golang/go/issues/22806
https://github.com/golang/go/issues/22807
https://github.com/golang/go/issues/22808
https://github.com/golang/go/issues/22809

Engineers from Cloudflare and Arm will cooperate on fixing these issues.

The poor performance doesn’t surprise me: crypto on arm64 is not yet hardware-accelerated, and we plan to optimize AES and other algorithms this year.

In https://blog.cloudflare.com/arm-takes-wing/ there are a number of benchmarks with poor results for Go on Arm compared to Go on Intel.

From an issue tracking point of view, I think it makes sense to open up individual issue reports on each of them, rather than overloading this one.

A report of work done to address this:

https://twitter.com/jgrahamc/status/988812004499607553