go: testing: when using a custom TestMain, m.Run does not return if one of the tests it runs panics

What version of Go are you using (go version)?

$ go version
go version go1.13.7 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/yury/Library/Caches/go-build"
GOENV="/Users/yury/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/yury/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/yury/go/src/github.com/orlangure/myproject/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/p1/rjgq2gp55pj58ckbsn94yfz80000gn/T/go-build564127360=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

I wrote a function, a test for it, and TestMain to perform setup and teardown for testing this function:

main.go

package main

import "fmt"

func main() {
	fmt.Println("vim-go")
}

func p() {
	panic("foo")
}

p_test.go

package main

import "testing"

func TestP(t *testing.T) {
	p()
}

main_test.go

package main

import (
	"fmt"
	"os"
	"testing"
)

func TestMain(m *testing.M) {
	os.Exit(testMain(m))
}

func testMain(m *testing.M) int {
	setup()
	defer teardown()

	return m.Run()
}

func setup() {
	fmt.Println("setting up")
}

func teardown() {
	fmt.Println("tearing down")
}

What did you expect to see?

I expected to see “setting up” and “tearing down” at some point, and a panic “foo” message with a stack trace.

What did you see instead?

Only setup and panic output appeared:

setting up
--- FAIL: TestP (0.00s)
panic: foo [recovered]
        panic: foo

goroutine 19 [running]:
testing.tRunner.func1(0xc0000b6100)
        /usr/local/go/src/testing/testing.go:874 +0x3a3
panic(0x1111220, 0x116b4d0)
        /usr/local/go/src/runtime/panic.go:679 +0x1b2
akeyless.io/akeyless-main-repo/go/src/t.p(...)
...
FAIL

The program probably called os.Exit at some point while running the tests, or panicked in a separate go routine with no way for me to recover. I would expect both setup and teardown functions to be called at some point.

About this issue

  • Original URL
  • State: open
  • Created 4 years ago
  • Reactions: 4
  • Comments: 22 (19 by maintainers)

Most upvoted comments

If I get CL 134395 cleaned up and merged, it will be fairly straightforward for users to use an errgroup to structure their code such that all goroutines that may panic propagate that panic back to the main goroutine.

Moreover, I suspect that the vast majority of tests do not explicitly spawn goroutines, let alone goroutines that may panic. It seems reasonable for users to expect that their best-effort cleanup using defer will actually be invoked most of the time.

We can never make this perfect. If some code in a test calls

    go func() { panic("die die die") }()

then m.Run is not going to return no matter what we do.

The only question here is how we should handle a test that panics in the goroutine used to run the test. Should we try to handle that specific case? Or should we just treat it like a panic in a different goroutine, which is in effect what we do now?

I don’t have a strong opinion about that. But the testing package is already fairly baroque. Is it really worth complicating it further to handle one specific case when we can’t handle other similar cases?

Change https://golang.org/cl/219977 mentions this issue: testing: allow m.Run return if a test panics

@progressnerd, an unexpected panic potentially leaves the program in an arbitrarily corrupted state. Swamping the panic in noise from subsequent test failures — or worse, deadlocks — would make the panic harder to diagnose, in addition to delaying the output of the panic, for at best marginal additional information from the test.

I personally am not convinced by “let’s do this now and figure out how to do more later.” Sometimes that is the right approach, but sometimes it’s important to at least understand how we could do more later.

Right now people are confused because when they panic in a test deferred functions in TestMain are not run. If we fix that, people will be confused because when they panic in a goroutine started by a test deferred functions in TestMain are not run. Although the current state is not good, at least it’s consistent.

My opinion is … until a time when all panics can be handled …

I don’t see any way of doing it, handling all panics seems to be equivalent to the status in GC and mutators. That is to say, a user goroutine may panic intentionally, the runtime should not do anything with it by default. This basically matches what I said in the beginning: “The testing package has done everything it could achieve with the CL. If we interested in handling this particular panic case, more investigation could be done in subsequent CLs.”

cc @dmitshur

I was answering an opinion.

This particular case can simply be ignored because this type of panic happens in a user goroutine. If we interested in handling this particular panic case, more investigation could be done in subsequent CLs.