go: sync: Pool example suggests incorrect usage
The operation of sync.Pool assumes that the memory cost of each element is approximately the same in order to be efficient. This property can be seen by the fact that Pool.Get returns you a random element, and not the one that has “the greatest capacity” or what not. In other words, from the perspective of the Pool, all elements are more or less the same.

However, the Pool example stores bytes.Buffer objects, which have an underlying []byte of varying capacity depending on how much of the buffer is actually used.
Dynamically growing unbounded buffers can cause a large amount of memory to be pinned and never be freed in a live-lock situation. Consider the following:
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"sync"
	"time"
)

func main() {
	pool := sync.Pool{New: func() interface{} { return new(bytes.Buffer) }}
	processRequest := func(size int) {
		b := pool.Get().(*bytes.Buffer)
		time.Sleep(500 * time.Millisecond) // Simulate processing time
		b.Grow(size)
		pool.Put(b)
		time.Sleep(1 * time.Millisecond) // Simulate idle time
	}

	// Simulate a set of initial large writes.
	for i := 0; i < 10; i++ {
		go func() {
			processRequest(1 << 28) // 256MiB
		}()
	}
	time.Sleep(time.Second) // Let the initial set finish

	// Simulate an un-ending series of small writes.
	for i := 0; i < 10; i++ {
		go func() {
			for {
				processRequest(1 << 10) // 1KiB
			}
		}()
	}

	// Continually run a GC and track the allocated bytes.
	var stats runtime.MemStats
	for i := 0; ; i++ {
		runtime.ReadMemStats(&stats)
		fmt.Printf("Cycle %d: %dB\n", i, stats.Alloc)
		time.Sleep(time.Second)
		runtime.GC()
	}
}
Depending on timing, the above snippet takes around 35 GC cycles for the initial set of large requests (2.5GiB) to finally be freed, even though each of the subsequent writes only uses around 1KiB. This can happen in a real server handling lots of small requests, where large buffers allocated by some prior request end up being pinned for a long time since they are not in the Pool long enough to be collected.
The example claims to be based on fmt usage, but I’m not convinced that fmt’s usage is correct. It is susceptible to the live-lock problem described above. I suspect this hasn’t been an issue in most real programs since fmt.PrintX is typically not used to write very large strings. However, other applications of sync.Pool may certainly have this issue.
I suggest we fix the example to store elements of fixed size and document this.
Commits related to this issue
- fmt: fix usage of sync.Pool The current usage of sync.Pool is leaky because it stores an arbitrary sized buffer into the pool. However, sync.Pool assumes that all items in the pool are interchangeabl... — committed to golang/go by dsnet 6 years ago
- Reset bytes.Buffer before returning it to sync.Pool Most of bytes.Buffer in stargz-snapshotter is unbounded in terms of size and sync.Pool's Get may choose to ignore the pool. Resizing a buffer befo... — committed to kzys/stargz-snapshotter by kzys 3 years ago
- jar: do not keep large buffers unnecessarily This uses a sync.Pool wrapper (called pool.Dynamic) that prevents the pool from holding on to very large buffers indefinitely, while still amortizing the ... — committed to google/log4jscanner by aktau 2 years ago
- log: reduce lock contention Logger.Log methods are called in a highly concurrent manner in many servers. Acquiring a lock in each method call results in high lock contention, especially since each lo... — committed to golang/go by dsnet a year ago
The solution is of course to put only buffers with small byte slices back into the pool.
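A minimal sketch of that approach, assuming an arbitrary 64KiB cutoff (the constant and helper name below are illustrative): check the buffer’s capacity before putting it back, and drop oversized buffers so the GC can reclaim their backing arrays.

package main

import (
	"bytes"
	"sync"
)

// maxPooledCap is an illustrative cutoff; pick a value suited to the workload.
const maxPooledCap = 64 << 10

var bufPool = sync.Pool{New: func() interface{} { return new(bytes.Buffer) }}

// putBuffer returns b to the pool only if its capacity stayed small;
// oversized buffers are dropped so the GC can reclaim them.
func putBuffer(b *bytes.Buffer) {
	if b.Cap() > maxPooledCap {
		return
	}
	b.Reset()
	bufPool.Put(b)
}

func main() {
	b := bufPool.Get().(*bytes.Buffer)
	b.WriteString("hello")
	putBuffer(b)
}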
Alternatively, you could use an array of sync.Pools to bucketize the items by size: https://github.com/golang/go/blob/7e394a2/src/net/http/h2_bundle.go#L998-L1043

I guess a minimal example of such a usage would be:
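A sketch of such bucketized pools, assuming a few power-of-two size classes (the class boundaries and helper names are illustrative, not taken from the linked file): each pool only ever holds slices of a single capacity, so its elements stay interchangeable.

package main

import "sync"

// Illustrative size classes; each pool holds slices of exactly one capacity.
var sizeClasses = []int{1 << 10, 1 << 12, 1 << 14}

var pools = func() []*sync.Pool {
	ps := make([]*sync.Pool, len(sizeClasses))
	for i, size := range sizeClasses {
		size := size
		ps[i] = &sync.Pool{New: func() interface{} { return make([]byte, size) }}
	}
	return ps
}()

// getBuf returns a zero-length slice with at least n bytes of capacity.
// Requests larger than the biggest class are allocated directly and never pooled.
func getBuf(n int) []byte {
	for i, size := range sizeClasses {
		if n <= size {
			return pools[i].Get().([]byte)[:0]
		}
	}
	return make([]byte, 0, n)
}

// putBuf returns a slice to the pool that matches its capacity; anything
// else is dropped for the GC to collect.
func putBuf(b []byte) {
	for i, size := range sizeClasses {
		if cap(b) == size {
			pools[i].Put(b[:0])
			return
		}
	}
}

func main() {
	b := getBuf(1500) // served from the 4KiB class
	b = append(b, "packet bytes"...)
	putBuf(b)
}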
But of course, whether this is going to be faster than the GC depends on how many packets per second you have to deal with and what exactly you do with the data, etc. In the real world you’d benchmark both with the GC and with sync.Pool and compare the two. At the time we wrote our code, a significant amount of time was spent allocating new buffers, and by using a scheme like the above we managed to increase throughput. Of course, one should re-benchmark this with every update to the GC.
We would certainly like to get to this point, and the GC has improved a lot, but for high-churn allocations with obvious lifetimes and no need for zeroing, sync.Pool can still be a significant optimization. As @RLH likes to say, every use of sync.Pool is a bug report on the GC. But we’re still taking those bug reports. 😃

That’s clearly true, but even right now it’s partly by chance that these examples are eventually dropping the large buffers. And in the more realistic stochastic-mix example, it’s not clear to me that #22950 would make it any better or worse.
I agree with @dsnet’s original point that we should document that sync.Pool treats all objects interchangeably, so they should all have roughly the same “weight”. And it would be good to provide some suggestions for what to do in situations where this isn’t the case, and perhaps some concrete examples of poor sync.Pool usage.

@storozhukBM This is unfortunately the problem with a probabilistic model. I found it to be much less efficient for applications that continually use a large buffer size. Ideally, we want perfect reuse if an application consistently uses a large buffer size.

I believe that returning a variable-length buffer to the pool needs to consider the history of how well the buffer was utilized. With regard to history, you can either go with global history or local history. Global history is more accurate, but then you have to deal with coordination; or you can go with local history and hope that it’s good enough. Did you see my code snippet in https://github.com/golang/go/issues/27735#issuecomment-739169121? It relies only on local history and addresses the problem of discarding large buffers when they are underutilized, while ensuring perfect reuse when the application consistently uses large buffers.
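For concreteness, one possible shape of such a local-history scheme looks like the sketch below (this is illustrative, not the snippet linked above; the smoothing factor and the 2x threshold are assumptions): each pooled buffer carries an exponentially smoothed record of how many bytes its recent uses needed, and it is only returned to the pool while its capacity is justified by that history.

package main

import "sync"

// smartBuf pairs a buffer with a smoothed record of recent usage.
type smartBuf struct {
	buf     []byte
	avgUsed float64 // exponentially smoothed bytes used per request
}

var smartPool = sync.Pool{New: func() interface{} { return new(smartBuf) }}

// putSmart pools sb only while its capacity is justified by recent utilization.
// A consistently large workload keeps reusing its large buffer, while a buffer
// that has grown far beyond current needs is dropped for the GC to reclaim.
func putSmart(sb *smartBuf, usedBytes int) {
	const alpha = 0.5
	if sb.avgUsed == 0 {
		sb.avgUsed = float64(usedBytes) // first observation
	} else {
		sb.avgUsed = alpha*float64(usedBytes) + (1-alpha)*sb.avgUsed
	}
	if float64(cap(sb.buf)) > 2*sb.avgUsed {
		return // underutilized: let the GC reclaim the large backing array
	}
	sb.buf = sb.buf[:0]
	smartPool.Put(sb)
}

func main() {
	sb := smartPool.Get().(*smartBuf)
	sb.buf = append(sb.buf[:0], make([]byte, 1<<20)...) // simulate a 1MiB request
	putSmart(sb, 1<<20)
}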
We’ve used sync.Pool for dealing with network packets and others do too (such as lucas-clemente/quic-go), because for those use cases you gain performance when using it. However, in those cases []byte is used instead of bytes.Buffer, and the slices all have the same capacity and are re-sliced as needed. Before putting them back they are re-sliced to MaxPacketSize, and before reading you Get a new []byte and re-slice it to the size of the network packet.

We did the same even for structs, such as when parsing packets to a struct with func ParsePacket(data []byte) *Packet (which overwrites all fields in the struct anyway), which means that instead of allocating/freeing thousands of new structs each second they can be re-used using sync.Pool, which makes things a little bit faster.
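A sketch of that fixed-capacity pattern (MaxPacketSize and the helper names are illustrative; real code would size this for its own protocol): every pooled slice keeps the same capacity, it is re-sliced down to the packet that was actually read, and re-sliced back to full length before being returned.

package main

import "sync"

const MaxPacketSize = 1500 // illustrative; every pooled slice shares this capacity

var packetPool = sync.Pool{
	New: func() interface{} { return make([]byte, MaxPacketSize) },
}

// getPacketBuf hands out a full-length buffer; the caller re-slices it down
// to the number of bytes actually read.
func getPacketBuf() []byte {
	return packetPool.Get().([]byte)[:MaxPacketSize]
}

// putPacketBuf restores the full length so every element in the pool stays
// interchangeable, then returns the buffer.
func putPacketBuf(b []byte) {
	packetPool.Put(b[:MaxPacketSize])
}

func main() {
	buf := getPacketBuf()
	n := copy(buf, "a small packet") // pretend n bytes were read off the wire
	pkt := buf[:n]                   // work with just the packet bytes
	_ = pkt
	putPacketBuf(buf)
}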
Would changing the example to use an array (fixed size) rather than a slice solve this problem? In Chihaya, this is how we’ve used sync.Pool and our implementation before it was in the standard library.
I legitimately don’t think there ever was a time to generally recommend sync.Pool. I find it a pretty contentious addition to the standard library because of how careful and knowledgeable of the runtime you need to be in order to use it effectively. If you need optimization at this level, you probably know how to implement this best for your own use case.

Sorry to interject randomly, but I saw this thread on Twitter and have strong opinions on this feature.
So I believe capping the length or capacity of a slice does not actually reduce the size of the underlying backing array, so this would not actually save any more memory than stuffing the whole buffer in anyway. The only way to truly shrink the memory usage of a slice is to reallocate it at a smaller size and then copy the data into it… but at that point, in this particular use case, we would be better off just dropping the whole slice on the ground and letting the GC clean it up.
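A tiny illustration of that point (sizes are arbitrary): re-slicing keeps the whole backing array reachable, and only allocating a smaller array and copying releases the original to the GC.

package main

func main() {
	big := make([]byte, 1<<28) // 256MiB backing array
	small := big[:1024]        // re-slice: still pins all 256MiB
	// Copying into a fresh, smaller array is the only way to let the GC
	// reclaim the large one; at that point, pooling the copy has little
	// benefit over simply dropping the whole slice.
	shrunk := append([]byte(nil), small...)
	_ = shrunk
}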
@peterbourgon, wouldn’t using bytes.Buffer still have the same issue unless the user explicitly enforces some maximum bound on the buffer length? If so, presumably with the same technique that someone could use on a raw []byte?

I’ve been trying to find a way to use variable-length buffers with sync.Pool, and the approach that I’ve been taking is something like: https://github.com/golang/go/issues/27735#issuecomment-739169121

Do you have any suggestions for better use cases we could include in the example that are reasonably compact?
Maybe the solution is not to recommend sync.Pool at all anymore? This is my understanding from a comment I read about how the GC makes it more or less useless.