go: plugin: program on linux/s390x sometimes hangs after calling "plugin.Open"

What version of Go are you using (go version)?

$ go version
go version go1.14.6 linux/s390x

Note: This problem occurs only on linux/s390x. All Go versions have the same issue.

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="s390x"
GOBIN="/usr/local/go/bin/"
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="s390x"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/ua/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_s390x"
GCCGO="gccgo"
AR="ar"
CC="s390x-linux-gnu-gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -march=z196 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build262588498=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Sample code that reproduces the problem (ttt.go):

package main

import (
	"fmt"
	"plugin"
)

func openPlugin(name string) {
	fmt.Printf("openPlugin Start: %s\n", name)
	_, err := plugin.Open(name)
	fmt.Printf("openPlugin End: %s\n", name)
	if err != nil {
		fmt.Printf("openPlugin Error: %s - %v\n", name, err)
	}
}

func main() {
	openPlugin("cpu.so")
	openPlugin("disk.so")
	openPlugin("diskio.so")
	openPlugin("mem.so")
	openPlugin("net.so")
	openPlugin("processes.so")
	openPlugin("procstat.so")
	openPlugin("system.so")
	openPlugin("jaeger.so")
	openPlugin("zipkin.so")
}

What did you expect to see?

$ ./ttt
openPlugin Start: cpu.so
openPlugin End: cpu.so
openPlugin Start: disk.so
openPlugin End: disk.so
openPlugin Start: diskio.so
openPlugin End: diskio.so
openPlugin Start: mem.so
openPlugin End: mem.so
openPlugin Start: net.so
openPlugin End: net.so
openPlugin Start: processes.so
openPlugin End: processes.so
openPlugin Start: procstat.so
openPlugin End: procstat.so
openPlugin Start: system.so
openPlugin End: system.so
openPlugin Start: jaeger.so
openPlugin End: jaeger.so
openPlugin Start: zipkin.so
openPlugin End: zipkin.so

What did you see instead?

Running the program repeatedly shows the problem intermittently. In the output below, the first two runs complete and the third hangs:

$ ./ttt
openPlugin Start: cpu.so
openPlugin End: cpu.so
openPlugin Start: disk.so
openPlugin End: disk.so
openPlugin Start: diskio.so
openPlugin End: diskio.so
openPlugin Start: mem.so
openPlugin End: mem.so
openPlugin Start: net.so
openPlugin End: net.so
openPlugin Start: processes.so
openPlugin End: processes.so
openPlugin Start: procstat.so
openPlugin End: procstat.so
openPlugin Start: system.so
openPlugin End: system.so
openPlugin Start: jaeger.so
openPlugin End: jaeger.so
openPlugin Start: zipkin.so
openPlugin End: zipkin.so

$ ./ttt
openPlugin Start: cpu.so
openPlugin End: cpu.so
openPlugin Start: disk.so
openPlugin End: disk.so
openPlugin Start: diskio.so
openPlugin End: diskio.so
openPlugin Start: mem.so
openPlugin End: mem.so
openPlugin Start: net.so
openPlugin End: net.so
openPlugin Start: processes.so
openPlugin End: processes.so
openPlugin Start: procstat.so
openPlugin End: procstat.so
openPlugin Start: system.so
openPlugin End: system.so
openPlugin Start: jaeger.so
openPlugin End: jaeger.so
openPlugin Start: zipkin.so
openPlugin End: zipkin.so

$ ./ttt
openPlugin Start: cpu.so
openPlugin End: cpu.so
openPlugin Start: disk.so
openPlugin End: disk.so
openPlugin Start: diskio.so
openPlugin End: diskio.so
openPlugin Start: mem.so
openPlugin End: mem.so
openPlugin Start: net.so
openPlugin End: net.so
openPlugin Start: processes.so
openPlugin End: processes.so
openPlugin Start: procstat.so
openPlugin End: procstat.so
openPlugin Start: system.so
openPlugin End: system.so
openPlugin Start: jaeger.so
openPlugin End: jaeger.so
openPlugin Start: zipkin.so

==> The program hangs here: "openPlugin End: zipkin.so" is never printed.

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 21 (13 by maintainers)

Most upvoted comments

Change https://golang.org/cl/249448 mentions this issue: cmd/internal/obj: fix inline marker issue on s390x

Backport issue(s) opened: #40693 (for 1.15), #40694 (for 1.14).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

@gopherbot please open backport issues.

This is a bug that causes programs to intermittently crash with no workaround.

Can the original Prog be repurposed as one of the two GOT lookup instructions? That way it would remain a valid target. Another technique (x86 uses this) is to rewrite the original to a nop, but leave it in the Prog list.

Yeah, either would work. The second way is what I’ve implemented since it is a simpler change in my opinion.

The inlmark nops should be real nops, that generate actual code. We should never remove them. So I think they would need to be their own op, not a marker for a following op. (Because what if there are 2 of them in a row?)

I think it would be relatively easy to ensure and enforce that no two inline markers point to the same PC value. The advantage of it being a marker op rather than a code generating op would be that we could continue to use pre-existing instructions as inline marker targets and not just additional real nops. We’d still need the compiler to ensure a valid instruction followed the marker op.

What, if anything, do you think we need to fix for 1.15?

I’d like to do the small fix to the s390x backend that is suitable for backporting to 1.15.x and 1.14.x release branches to fix this specific bug. This isn’t a recent regression so I don’t think we should rush a fix into 1.15 if it is going to be released very soon, though if it does make it then great. Then for 1.16 maybe we try something more comprehensive.