kubernetes: client-go memory leak
What happened:
We use a client-go informer factory to process pod information:

```go
informerFactory.Core().V1().Pods().Informer().AddEventHandler(
	cache.ResourceEventHandlerFuncs{
		AddFunc:    addPodToCache,
		UpdateFunc: updatePodInCache,
		DeleteFunc: deletePodFromCache,
	})
```

The handlers keep the pods in a plain map:

```go
cache := make(map[string]*v1.Pod)

// addPodToCache -> AddPod
func AddPod(pod *v1.Pod) {
	key, _ := framework.GetPodKey(pod)
	cache[key] = pod
}

// deletePodFromCache -> RemovePod
func RemovePod(pod *v1.Pod) error {
	key, _ := framework.GetPodKey(pod)
	delete(cache, key)
	return nil
}
```
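For context, a minimal self-contained sketch of this setup might look like the following. Everything outside the snippet above is an assumption for illustration: the map is renamed `podCache` to avoid clashing with the `cache` package, `cache.MetaNamespaceKeyFunc` stands in for `framework.GetPodKey`, and the kubeconfig and stop-channel wiring are made up.

```go
package main

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

// podCache plays the role of the `cache` map from the snippet above.
var podCache = make(map[string]*v1.Pod)

func addPodToCache(obj interface{}) {
	pod, ok := obj.(*v1.Pod)
	if !ok {
		return
	}
	key, _ := cache.MetaNamespaceKeyFunc(pod) // stand-in for framework.GetPodKey
	podCache[key] = pod
}

func updatePodInCache(oldObj, newObj interface{}) {
	addPodToCache(newObj)
}

func deletePodFromCache(obj interface{}) {
	pod, ok := obj.(*v1.Pod)
	if !ok {
		return
	}
	key, _ := cache.MetaNamespaceKeyFunc(pod)
	delete(podCache, key)
}

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	stopCh := make(chan struct{})
	informerFactory := informers.NewSharedInformerFactory(client, 0)
	informerFactory.Core().V1().Pods().Informer().AddEventHandler(
		cache.ResourceEventHandlerFuncs{
			AddFunc:    addPodToCache,
			UpdateFunc: updatePodInCache,
			DeleteFunc: deletePodFromCache,
		})
	informerFactory.Start(stopCh)
	informerFactory.WaitForCacheSync(stopCh)
	<-stopCh
}
```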
But we ran into a memory leak recently. I added some logging and eventually found the cause.
Our Kubernetes version is 1.17 and the cluster has more than 5500 nodes, but the default-watch-cache-size of kube-apiserver is 100, which is too small for a cluster of this size.
(Recent Kubernetes releases already use a dynamically sized watch cache, see https://github.com/kubernetes/kubernetes/issues/90058.)
The Reflector's List call takes more than 35 seconds, and the Reflector then calls Watch with the last resourceVersion returned by List. That Watch call fails with `too old resource version: 6214379869 (6214383056)`, because by then the resourceVersion from the List has already been evicted from kube-apiserver's watch cache (bounded by default-watch-cache-size).
Reflector.Run therefore keeps cycling: List -> Watch -> `too old resource version` error.
This is where the memory leak comes from. We use the cache map to store pod information, so the map ends up holding pods that belong to many different PodList objects, and those retained pointers prevent the Go garbage collector from freeing the whole PodLists.
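The underlying Go behavior can be shown without any Kubernetes types. The following sketch (names like `keepOne` are made up for illustration) keeps a pointer to a single element of a large slice, which keeps the entire backing array reachable until that pointer is dropped:

```go
package main

import (
	"fmt"
	"runtime"
)

type pod struct {
	name    string
	payload [1024]byte // make each element noticeably large
}

// keepOne returns a pointer to a single element of a freshly built slice.
// The pointer points into the slice's backing array, so the whole array
// (all 100000 elements) stays reachable as long as the pointer does.
func keepOne() *pod {
	items := make([]pod, 100000)
	for i := range items {
		items[i].name = fmt.Sprintf("pod-%d", i)
	}
	return &items[0]
}

func main() {
	var m runtime.MemStats

	p := keepOne()
	runtime.GC()
	runtime.ReadMemStats(&m)
	fmt.Printf("retained pointer %q, heap in use: %d MiB\n", p.name, m.HeapInuse>>20)

	p = nil // drop the last reference; the backing array becomes collectable
	runtime.GC()
	runtime.ReadMemStats(&m)
	fmt.Printf("after release, heap in use: %d MiB\n", m.HeapInuse>>20)
}
```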
The code causing the memory leak is in `meta.EachListItem`:
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apimachinery/pkg/api/meta/help.go#L115
Replacing `found = append(found, item)` with `found = append(found, item.DeepCopyObject())` can fix the problem:
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/tools/cache/reflector.go#L453
What you expected to happen:
Calling DeepCopyObject has some memory overhead, so I wonder whether there is a better solution for this.
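In the meantime, a possible user-side workaround (just a sketch, not the change proposed above; the `interface{}` handler signature and the type assertion are assumptions, while `framework.GetPodKey` and the `cache` map come from the snippet at the top) is to deep-copy each pod before putting it into our own map, so the map never pins the backing array of a relisted PodList. The informer's own store still references the most recent list, but old lists no longer accumulate:

```go
// addPodToCache stores a deep copy, so the map does not retain a pointer
// into the Items backing array of the PodList produced by a relist.
func addPodToCache(obj interface{}) {
	pod, ok := obj.(*v1.Pod)
	if !ok {
		return
	}
	key, _ := framework.GetPodKey(pod)
	cache[key] = pod.DeepCopy()
}
```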
How to reproduce it (as minimally and precisely as possible):
```go
package main

import (
	"fmt"
	goruntime "runtime"
	"time"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
)

func main() {
	leak() // switch to unleak() to compare memory behavior
}

// unleak copies each pod out of the PodList before storing it, so the
// PodList's backing array can be garbage collected: RES stays stable.
func unleak() {
	store := make(map[string]*v1.Pod)
	var index int
	for {
		plist := generatePod(index)
		for _, pod := range plist.Items {
			pod := pod // copy the loop variable; &pod does not point into plist.Items
			store[pod.Name] = &pod
		}
		time.Sleep(time.Second * 2)
		index++
		goruntime.GC()
		fmt.Println("==unleak==", index, len(store))
	}
}

// leak stores the pointers yielded by meta.EachListItem. They point into the
// PodList's Items backing array, so every previously generated PodList stays
// reachable through the store map: RES grows monotonically.
func leak() {
	store := make(map[string]*v1.Pod)
	var index int
	var items []runtime.Object
	for {
		items = items[:0]
		plist := generatePod(index)
		meta.EachListItem(plist, func(obj runtime.Object) error {
			items = append(items, obj)
			return nil
		})
		for _, item := range items {
			pod := item.(*v1.Pod)
			store[pod.Name] = pod
		}
		time.Sleep(time.Second * 2)
		index++
		goruntime.GC()
		fmt.Println("==leak==", index, len(store))
	}
}

// generatePod builds a PodList with (100000 - num) pods named pod-<i>.
func generatePod(num int) *v1.PodList {
	var plist v1.PodList
	for i := 0; i < 100000-num; i++ {
		pod := v1.Pod{
			ObjectMeta: metav1.ObjectMeta{
				Name: fmt.Sprintf("pod-%d", i),
			},
		}
		plist.Items = append(plist.Items, pod)
	}
	return &plist
}
```
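For comparison, a variant of `leak()` that deep-copies every item (mirroring the `item.DeepCopyObject()` suggestion above) behaves like `unleak()`. The function below is not part of the original reproduction; it is a sketch meant to be added to the same program, since it reuses `generatePod` and the same imports:

```go
// leakFixed is a hypothetical variant of leak() that deep-copies every object
// returned by meta.EachListItem before storing it, so no pointer into the
// PodList's backing array is retained and each PodList can be collected.
func leakFixed() {
	store := make(map[string]*v1.Pod)
	var index int
	for {
		plist := generatePod(index)
		meta.EachListItem(plist, func(obj runtime.Object) error {
			pod := obj.(*v1.Pod).DeepCopy()
			store[pod.Name] = pod
			return nil
		})
		time.Sleep(time.Second * 2)
		index++
		goruntime.GC()
		fmt.Println("==leakFixed==", index, len(store))
	}
}
```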

Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`): 1.17
- Cloud provider or hardware configuration:
- OS (e.g: `cat /etc/os-release`):
- Kernel (e.g. `uname -a`):
- Install tools:
- Network plugin and version (if this is a network-related bug):
- Others:
About this issue
- State: closed
- Created 3 years ago
- Comments: 21 (15 by maintainers)
The leak here is caused by the Go slice GC mechanism; there is no memory leak in the `for ... range` mode because of the copy semantics of `v` in `for _, v := range slice`. If the `unleak` function is changed to the following code (equivalent to the reflection-based mode used in `meta.EachListItem`), the RES value increases monotonically as well. I suggest adding `DeepCopy` to `meta.EachListItem` to alleviate this situation. @wojtek-t PTAL
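The referenced code block is not included above. A sketch of what such a modified `unleak` presumably looks like, assuming it stores the address of each slice element directly (which is effectively what the reflection path in `meta.EachListItem` hands back), is shown below; it reuses `generatePod` from the reproduction program:

```go
// unleakModified stores &plist.Items[i] directly, which points into the Items
// backing array of each generated PodList. Every stored pointer therefore keeps
// the whole PodList alive, so RES keeps growing, just as in leak().
func unleakModified() {
	store := make(map[string]*v1.Pod)
	var index int
	for {
		plist := generatePod(index)
		for i := range plist.Items {
			pod := &plist.Items[i]
			store[pod.Name] = pod
		}
		time.Sleep(time.Second * 2)
		index++
		goruntime.GC()
		fmt.Println("==unleakModified==", index, len(store))
	}
}
```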