go: strings: copying but not using a Builder leads to runtime crash (generates heap -> stack pointer)

What version of Go are you using (go version)?

$ go version
go version go1.16.6

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

What did you do?

https://play.golang.org/p/RRI3-srzVrR

package main

import (
	"fmt"
	"strings"
)

func f(myList []string) error {
	const (
		Threshold = 1000
	)

	var (
		builderList = []strings.Builder{}
	)
	var (
		bullder = strings.Builder{}
	)
	var (
		count uint64 = 0
	)
	for _, _ = range myList {
		if count > Threshold {
			builderList = append(builderList, bullder)
			bullder = strings.Builder{}
			count = 0
		}
		bullder.WriteString("123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890")
		count++
	}
	return nil
}

func main() {
	myList := make([]string, 1000000)
	err := f(myList)
	if err != nil {
		panic(err)
	}
	fmt.Println("success")
}

What did you expect to see?

I don’t see any error message

What did you see instead?

--
runtime: pointer 0xc035a7bc78 to unused region of span span.base()=0xc0001de000 span.limit=0xc0001dff80 span.state=1
  | 2021-07-19T12:43:00.012+09:00 | runtime: found in object at *(0xc0026a6000+0x0)
  | 2021-07-19T12:43:00.012+09:00 | object=0xc0026a6000 s.base()=0xc0026a6000 s.limit=0xc0026a8000 s.spanclass=36 s.elemsize=256 s.state=mSpanInUse
  | 2021-07-19T12:43:00.012+09:00 | *(object+0) = 0xc035a7bc78 <==
  | 2021-07-19T12:43:00.012+09:00 | *(object+8) = 0xc002700000
  | 2021-07-19T12:43:00.012+09:00 | *(object+16) = 0x113a6c
  | 2021-07-19T12:43:00.012+09:00 | *(object+24) = 0x128000
  | 2021-07-19T12:43:00.012+09:00 | *(object+32) = 0xc035a7bc78
  | 2021-07-19T12:43:00.012+09:00 | *(object+40) = 0xc0017ce000
  | 2021-07-19T12:43:00.012+09:00 | *(object+48) = 0x1100aa
  | 2021-07-19T12:43:00.012+09:00 | *(object+56) = 0x128000
  | 2021-07-19T12:43:00.012+09:00 | *(object+64) = 0xc00008fc78
  | 2021-07-19T12:43:00.012+09:00 | *(object+72) = 0xc002300000
  | 2021-07-19T12:43:00.012+09:00 | *(object+80) = 0x115fcc
  | 2021-07-19T12:43:00.012+09:00 | *(object+88) = 0x128000
  | 2021-07-19T12:43:00.012+09:00 | *(object+96) = 0xc00008fc78
  | 2021-07-19T12:43:00.012+09:00 | *(object+104) = 0xc0018f6000
  | 2021-07-19T12:43:00.012+09:00 | *(object+112) = 0x115de0
  | 2021-07-19T12:43:00.012+09:00 | *(object+120) = 0x128000
  | 2021-07-19T12:43:00.012+09:00 | *(object+128) = 0xc00008fc78
  | 2021-07-19T12:43:00.012+09:00 | *(object+136) = 0xc002992000
  | 2021-07-19T12:43:00.013+09:00 | *(object+144) = 0x114359
  | 2021-07-19T12:43:00.013+09:00 | *(object+152) = 0x128000
  | 2021-07-19T12:43:00.013+09:00 | *(object+160) = 0xc00008fc78
  | 2021-07-19T12:43:00.013+09:00 | *(object+168) = 0xc001a1e000
  | 2021-07-19T12:43:00.013+09:00 | *(object+176) = 0x11aa0e
  | 2021-07-19T12:43:00.013+09:00 | *(object+184) = 0x128000
  | 2021-07-19T12:43:00.013+09:00 | *(object+192) = 0xc00008fc78
  | 2021-07-19T12:43:00.013+09:00 | *(object+200) = 0xc001f80000
  | 2021-07-19T12:43:00.013+09:00 | *(object+208) = 0x113b9b
  | 2021-07-19T12:43:00.013+09:00 | *(object+216) = 0x128000
  | 2021-07-19T12:43:00.013+09:00 | *(object+224) = 0x0
  | 2021-07-19T12:43:00.013+09:00 | *(object+232) = 0x0
  | 2021-07-19T12:43:00.013+09:00 | *(object+240) = 0x0
  | 2021-07-19T12:43:00.013+09:00 | *(object+248) = 0x0
  | 2021-07-19T12:43:00.013+09:00 | fatal error: found bad pointer in Go heap (incorrect use of unsafe or cgo?)
  | 2021-07-19T12:43:00.016+09:00 | runtime stack:
  | 2021-07-19T12:43:00.016+09:00 | runtime.throw(0x98583e, 0x3e)
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/runtime/panic.go:1117 +0x72 fp=0xc000049e28 sp=0xc000049df8 pc=0x435db2
  | 2021-07-19T12:43:00.016+09:00 | runtime.badPointer(0x7f0421f06050, 0xc035a7bc78, 0xc0026a6000, 0x0)
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/runtime/mbitmap.go:351 +0x235 fp=0xc000049e70 sp=0xc000049e28 pc=0x415335
  | 2021-07-19T12:43:00.016+09:00 | runtime.findObject(0xc035a7bc78, 0xc0026a6000, 0x0, 0x0, 0x0, 0x0)
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/runtime/mbitmap.go:387 +0x9b fp=0xc000049ea8 sp=0xc000049e70 pc=0x4153fb
  | 2021-07-19T12:43:00.016+09:00 | runtime.scanobject(0xc0026a6000, 0xc000027698)
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/runtime/mgcmark.go:1286 +0x24a fp=0xc000049f38 sp=0xc000049ea8 pc=0x42138a
  | 2021-07-19T12:43:00.016+09:00 | runtime.gcDrain(0xc000027698, 0x3)
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/runtime/mgcmark.go:1048 +0x22b fp=0xc000049f90 sp=0xc000049f38 pc=0x420b0b
  | 2021-07-19T12:43:00.016+09:00 | runtime.gcBgMarkWorker.func2()
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/runtime/mgc.go:1980 +0x88 fp=0xc000049fd0 sp=0xc000049f90 pc=0x461f28
  | 2021-07-19T12:43:00.016+09:00 | runtime.systemstack(0x0)
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/runtime/asm_amd64.s:379 +0x66 fp=0xc000049fd8 sp=0xc000049fd0 pc=0x4693e6
  | 2021-07-19T12:43:00.016+09:00 | runtime.mstart()
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/runtime/proc.go:1246 fp=0xc000049fe0 sp=0xc000049fd8 pc=0x43b080
  | 2021-07-19T12:43:00.016+09:00 | goroutine 34 [GC worker (idle)]:
  | 2021-07-19T12:43:00.016+09:00 | runtime.systemstack_switch()
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/runtime/asm_amd64.s:339 fp=0xc000030f60 sp=0xc000030f58 pc=0x469360
  | 2021-07-19T12:43:00.016+09:00 | runtime.gcBgMarkWorker()
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/runtime/mgc.go:1967 +0x1c7 fp=0xc000030fe0 sp=0xc000030f60 pc=0x41d627
  | 2021-07-19T12:43:00.016+09:00 | runtime.goexit()
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/runtime/asm_amd64.s:1371 +0x1 fp=0xc000030fe8 sp=0xc000030fe0 pc=0x46b061
  | 2021-07-19T12:43:00.016+09:00 | created by runtime.gcBgMarkStartWorkers
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/runtime/mgc.go:1835 +0x37
  | 2021-07-19T12:43:00.016+09:00 | goroutine 1 [runnable]:
  | 2021-07-19T12:43:00.016+09:00 | runtime.Gosched(...)
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/runtime/proc.go:292
  | 2021-07-19T12:43:00.016+09:00 | runtime.gcAssistAlloc(0xc000000180)
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/runtime/mgcmark.go:469 +0x1bc fp=0xc00008faa0 sp=0xc00008fa40 pc=0x41f55c
  | 2021-07-19T12:43:00.016+09:00 | runtime.mallocgc(0xbc000, 0x0, 0xc0032c7000, 0x0)
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/runtime/malloc.go:961 +0x996 fp=0xc00008fb28 sp=0xc00008faa0 pc=0x40ddb6
  | 2021-07-19T12:43:00.016+09:00 | runtime.growslice(0x8b9ec0, 0xc003100000, 0x95dcc, 0x96000, 0x96199, 0xc0032cd000, 0x3cd, 0x400)
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/runtime/slice.go:224 +0x154 fp=0xc00008fb90 sp=0xc00008fb28 pc=0x44d6d4
  | 2021-07-19T12:43:00.016+09:00 | strings.(*Builder).WriteString(...)
  | 2021-07-19T12:43:00.016+09:00 | /usr/local/go/src/strings/builder.go:123
  | 2021-07-19T12:43:00.016+09:00 | git.rarejob.com/rarejob-platform/event-search-polling-batch/src/repositories/eselasticsearch.(*elasticSearch).changeTutor(0xc00000e0c0, 0xc034e46000, 0x9727, 0x9727, 0x9798ab, 0x17, 0x0, 0x0)
  | 2021-07-19T12:43:00.016+09:00 | /builds/rarejob-platform/event-search-polling-batch/src/repositories/eselasticsearch/tutors.go:71 +0x392 fp=0xc00008fcc8 sp=0xc00008fb90 pc=0x8666b2
  | 2021-07-19T12:43:00.016+09:00 | git.rarejob.com/rarejob-platform/event-search-polling-batch/src/repositories/eselasticsearch.(*elasticSearch).ChangeAllTutor(0xc00000e0c0, 0xc034e46000, 0x9727, 0x9727, 0x0, 0x0)
  | 2021-07-19T12:43:00.016+09:00 | /builds/rarejob-platform/event-search-polling-batch/src/repositories/eselasticsearch/tutors.go:19 +0x5e fp=0xc00008fd18 sp=0xc00008fcc8 pc=0x86623e
  | 2021-07-19T12:43:00.016+09:00 | git.rarejob.com/rarejob-platform/event-search-polling-batch/src/services.(*service).SetEsElasticsearch(0xc000024360, 0xa0aef0, 0xc00001a040, 0xc00006ba00, 0x0, 0x0)
  | 2021-07-19T12:43:00.016+09:00 | /builds/rarejob-platform/event-search-polling-batch/src/services/es_elasticsearch.go:40 +0x439 fp=0xc00008fe18 sp=0xc00008fd18 pc=0x871739
 ```

About this issue

  • Original URL
  • State: open
  • Created 3 years ago
  • Comments: 18 (16 by maintainers)

Most upvoted comments

I did a preliminary experiment of extending the “copylock” checker to check for the copying of “strings.Builder”, e.g.

func lockPath(tpkg *types.Package, typ types.Type) typePath {
    ...
    if named, ok := typ.(*types.Named); ok &&
		named.Obj().Name() == "Builder" &&
		named.Obj().Pkg().Path() == "strings" {
		return []types.Type{typ}
	}

Running this extension of a few open-source cloud projects doesn’t generate any finding. This may indicate that its frequency is too low to be a vet checker.

BTW, here are the unit tests:

func BadCopy() {
	var x *strings.Builder
	p := x
	var y strings.Builder
	_ = y  // want `assignment copies value to _: strings.Builder`
	p = &y
	*p = *x  // want `assignment copies value to \*p: *strings.Builder`

	w := struct{ L strings.Builder }{
		L: *x, // want `literal copies value from \*x: strings.Builder`
	}
	print(w)  // want `call of print copies value: struct{L strings.Builder} contains strings.Builder`

	builderList := []strings.Builder{}
	bullder := strings.Builder{}
	builderList = append(builderList, bullder)  // want `call of append copies value: strings.Builder`
}

The other option is to fix #7921. Then we can remove the noescape annotation.

I think I have a hunch why this happens. The alleged bad pointer in the original post is supposed to be a builder pointing to itself (note that the GC was currently scanning builderList). That extra self pointer is how it detects if it’s been copied. Thing is, the copy check never fires because there’s never a builder method executed after it is copied (otherwise it would fail).

So the question then is why the GC has any problems with this. I believe the answer is that bullder (sic) is stack-allocated, and the self pointer points into a stack location, however that pointer is copied to the heap. The GC has an invariant that heap objects can’t point back into a stack, hence the failure.

The reason why this happens is the tricky noescape used in the strings.Builder self pointer assignment (the copyCheck method). Normally the compiler would force the strings.Builder to escape, so the invariant would be maintained.

As others have said, this is a valid failure mode for copying a strings.Builder. It turns out even if you don’t use it, it’s never safe to copy a non-zero strings.Builder. But, as @mpx points out, there’s a UX issue here (and arguably a bug in strings.Builder), but I’m not sure how to fix it.

One possibility is to make the self pointer a uintptr. I think with care, it would actually make this API safe to copy but not use. It would still be a useful marker for the API, but the GC wouldn’t ever try to look at that pointer. I’m almost positive that was already disregarded for a number of reasons back when this was implemented, one issue being the “safe to copy but not use the copy” restriction, another being that uintptr is even less safe, and yet another being that switching GC implementations could break this.

FTR, this is not something can be easily fixed in the runtime: we currently do not carry type information around for any object. Thus GC doesn’t have access to type information, and even if it did, checking for strings.Builder and the invariant would add slow special cases to the scan path, the hottest path in the runtime.

I’m going to reopen this and update the title to indicate the UX issue here. I’ll dig around too and see if I can find a duplicate.

EDIT: Technically the error message is accurate in the sense of “incorrect use of unsafe.” 😃 The strings.Builder API is saying that it’s the user’s fault for not following the API, but the API lets you get into this situation. Who’s at fault?

I wonder if it would be possible to write a vet check to warn against copying a “live” strings.Builder? We already warn when copying sync.Mutex as things stand.