go: runtime: append growth sort of inconsistent and surprising
What version of Go are you using (go version)?
1.15
Does this issue reproduce with the latest release?
y
What operating system and processor architecture are you using (go env)?
Any, but also playground.
What did you do?
This actually grew out of a different question. Consider the fairly common slice-insert idiom:
s = append(s, 0)
copy(s[i+1:], s[i:])
s[i] = inserted
Assume i is less than len(s). You could also do:
s = append(s[:i+1], s[i:]...)
s[i] = inserted
I thought this might be faster because it sort of skips one copy of the s[i+1:] part of the slice – it can in theory copy each half only once. In fact, it was dramatically slower, and allocated more memory, which surprised me. Further investigation led to:
https://play.golang.org/p/cR51ks72emv
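For reference, here is a small sketch in the same spirit as that playground program (it is not the original program; the lengths and insert index are arbitrary), printing the capacity each idiom ends up with:

```go
package main

import "fmt"

func main() {
	for _, n := range []int{850, 1023, 1024, 2000} {
		i := 1 // insert near the front of the slice

		// Idiom 1: append one element, then shift the tail with copy.
		a := make([]int, n)
		a = append(a, 0)
		copy(a[i+1:], a[i:])
		a[i] = -1

		// Idiom 2: append the tail onto a re-sliced prefix.
		b := make([]int, n)
		b = append(b[:i+1], b[i:]...)
		b[i] = -1

		fmt.Printf("n=%d: copy-insert cap=%d, append-insert cap=%d\n", n, cap(a), cap(b))
	}
}
```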
What did you expect to see?
I don’t know. I think I expected the behavior of slice growth on append to be the same regardless of how I achieved the outcome of “increase total length by 1”.
What did you see instead?
Two different things, both somewhat surprising to me. The first is that, starting at N=1024, appending one slice to another produces wildly different resulting capacities from appending a couple of items to a slice. (Curiously, appending two items explicitly, like “append(s, 0, 1)”, doesn’t produce the different behavior, and appending b[size-1:]... also produces the single-item-append behavior.) Specifically, appending a single item grows the slice by 25%, but appending “a slice” does something much fancier. It will at least double the capacity of the first parameter to append(), but if you specify a cap on that parameter, it will pick a size somewhat larger than the total size it needs, though not necessarily as large as the doubled capacity. The result might even be smaller than the size picked when appending a single item to the existing slice.
The second is that, for N=1023 exactly, the single-item append always goes straight to 2048. So, appending a single item to a 1023-item slice produces a 2048 capacity, but appending a single item to a slice of 1024 or more items almost always produces a smaller capacity. This seems like a weird special case, and might be a bug in the sense of “a check that is supposed to do something specific is doing something else specific”. (Or it might just be “grow by doubling until 1024, then grow by 25%”, I guess.) But batch-append doesn’t have that behavior. So, for N from 850 to 1023, batch-append grows the slice more slowly (and thus allocates less memory and has to do less initialization in this particular use case), but for N from 1024 on up, batch-append tends to grow the slice faster (and thus has to do more extra work). It feels like they should be roughly comparable, probably?
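A hedged illustration of both observations (the capacities in the comments are what Go 1.15 on a 64-bit platform appears to produce after size-class rounding; they are not guaranteed and may differ in other releases):

```go
package main

import "fmt"

func main() {
	a := make([]int, 1023)
	a = append(a, 0) // len was 1023 (< 1024): capacity doubles, to 2048

	b := make([]int, 1024)
	b = append(b, 0) // len was 1024: capacity grows by ~25%, to 1280

	// Batch-append onto a short re-sliced prefix: the doubling is based on
	// the first argument, so this jumps to 2048 even though only one extra
	// slot is needed.
	c := make([]int, 1024)
	c = append(c[:2], c[1:]...)

	fmt.Println(cap(a), cap(b), cap(c))
}
```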
I am not sure what the right outcome is. I think what I’d ideally like would be for the shortcut append to have the same semantics as the append-and-copy version (when it’s valid at all), including the amount by which capacity grows.
The “append instead of appending and copying” approach actually seems moderately rewarding (a few percent faster, sometimes), and it becomes more valuable as N increases, but currently it’s only actually better when N is large and the index is small, so the array isn’t accidentally doubling in size. If the two paths used the same growth-sizing logic, I think it would be a potentially significant win for some use cases.
(Thanks to josharian for pointing out that the capacity of the first slice would be a significant factor.)
About this issue
- State: closed
- Created 4 years ago
- Reactions: 1
- Comments: 32 (17 by maintainers)
I think there is something to fix here. I would expect that both of these appends should allocate the same amount:
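(The snippet from the original comment isn’t reproduced here; the following is a sketch consistent with the description below, assuming a slice that starts with exactly one free slot of capacity.)

```go
package main

import "fmt"

func main() {
	s1 := make([]int, 1023, 1024)
	s1 = append(s1, 0)    // fills the one free slot, no growth
	s1 = append(s1, 1, 2) // overflows by two, triggers growslice

	s2 := make([]int, 1023, 1024)
	s2 = append(s2, 0, 1, 2) // overflows by two in a single append

	fmt.Println(cap(s1), cap(s2)) // one might expect these to be equal
}
```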
Both overflow by the same amount and need a growslice, and both start from a slice with the same capacity. The fact that one needed to fill in one available slot in the cap-len space first seems like it shouldn’t matter.
This isn’t to say that callers should depend on this behavior. But growth should ideally not depend on how elements were added (one at a time, two at a time, whatever) but just on how many were added. (With the exception that if you add so many in a single append that you would overflow what we would do for a single grow, then of course we have to grow more.)
Although it is not a must, I think it would be great to make reflect.Append consistent with the non-reflection way. (The following outputs are the same for 1.15 and tip):
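(The original output isn’t reproduced here; the following is a sketch of the kind of comparison meant, contrasting reflect.Append with the built-in append from the same starting slice.)

```go
package main

import (
	"fmt"
	"reflect"
)

func main() {
	s := make([]int, 1023)

	s1 := append(s, 0) // built-in append

	v := reflect.Append(reflect.ValueOf(s), reflect.ValueOf(0))
	s2 := v.Interface().([]int) // reflect.Append

	// The two capacities may differ, which is the inconsistency noted above.
	fmt.Println(cap(s1), cap(s2))
}
```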
Similar to @ianlancetaylor, I think the slice capacities that append produces should not be relied on to be consistent between call sites or versions.
I think the difference in resulting capacities is due to the runtime's growslice taking the starting capacity and length into account, which differ between the two versions: https://github.com/golang/go/blob/c489330987eca992cee0bb018a6fdb7ff5401704/src/runtime/slice.go#L144
So, given different starting parameters, I don’t think these two versions need to come up with the same resulting capacities.
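For context, a simplified paraphrase of that capacity computation as it stood in Go 1.15 (ignoring the size-class rounding applied afterwards; this is a sketch, not the actual runtime code):

```go
// oldLen and oldCap describe the first argument passed to append; needed is
// the total number of elements required after the append.
func nextCap(oldLen, oldCap, needed int) int {
	newcap := oldCap
	doublecap := newcap + newcap
	if needed > doublecap {
		return needed
	}
	if oldLen < 1024 {
		return doublecap
	}
	// Grow by 25% at a time until the request fits.
	for 0 < newcap && newcap < needed {
		newcap += newcap / 4
	}
	if newcap <= 0 {
		newcap = needed
	}
	return newcap
}
```

Because both the length and the capacity of the first argument feed into this computation, the whole-slice and re-sliced-prefix versions enter growslice with very different inputs and so land on different capacities.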