kubernetes: client-go memory leak
What happened:
We use a client-go informer factory to process pod information:

```go
informerFactory.Core().V1().Pods().Informer().AddEventHandler(
	cache.ResourceEventHandlerFuncs{
		AddFunc:    addPodToCache,
		UpdateFunc: updatePodInCache,
		DeleteFunc: deletePodFromCache,
	})
```

The handlers keep the pods in a plain map:

```go
cache := make(map[string]*v1.Pod)

// addPodToCache -> AddPod
func AddPod(pod *v1.Pod) {
	key, _ := framework.GetPodKey(pod)
	cache[key] = pod
}

// deletePodFromCache -> RemovePod
func RemovePod(pod *v1.Pod) error {
	key, _ := framework.GetPodKey(pod)
	delete(cache, key)
	return nil
}
```
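For context, a minimal self-contained sketch of this setup might look like the following. Everything outside the snippet above is an assumption for illustration: the map is renamed `podCache` to avoid clashing with the `cache` package, `cache.MetaNamespaceKeyFunc` stands in for `framework.GetPodKey`, and the kubeconfig and stop-channel wiring are made up.

```go
package main

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

// podCache plays the role of the `cache` map from the snippet above.
var podCache = make(map[string]*v1.Pod)

func addPodToCache(obj interface{}) {
	pod, ok := obj.(*v1.Pod)
	if !ok {
		return
	}
	key, _ := cache.MetaNamespaceKeyFunc(pod) // stand-in for framework.GetPodKey
	podCache[key] = pod
}

func updatePodInCache(oldObj, newObj interface{}) {
	addPodToCache(newObj)
}

func deletePodFromCache(obj interface{}) {
	pod, ok := obj.(*v1.Pod)
	if !ok {
		return
	}
	key, _ := cache.MetaNamespaceKeyFunc(pod)
	delete(podCache, key)
}

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	stopCh := make(chan struct{})
	informerFactory := informers.NewSharedInformerFactory(client, 0)
	informerFactory.Core().V1().Pods().Informer().AddEventHandler(
		cache.ResourceEventHandlerFuncs{
			AddFunc:    addPodToCache,
			UpdateFunc: updatePodInCache,
			DeleteFunc: deletePodFromCache,
		})
	informerFactory.Start(stopCh)
	informerFactory.WaitForCacheSync(stopCh)
	<-stopCh
}
```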
But we ran into a memory leak recently. I added some logging and eventually found the cause.
Our Kubernetes version is 1.17 and the cluster has more than 5500 nodes, but the default-watch-cache-size of kube-apiserver is 100, which is too small for a cluster of this size.
(Recent Kubernetes releases already use a dynamically sized watch cache, see https://github.com/kubernetes/kubernetes/issues/90058.)
The Reflector's List call takes more than 35 seconds, and the Reflector then calls Watch with the last resourceVersion returned by List. That Watch call fails with `too old resource version: 6214379869 (6214383056)`, because by then the resourceVersion from the List has already been evicted from kube-apiserver's watch cache (bounded by default-watch-cache-size).
Reflector.Run therefore keeps cycling: List -> Watch -> `too old resource version` error.
This is where the memory leak comes from. We use the cache map to store pod information, so the map ends up holding pods that belong to many different PodList objects, and those retained pointers prevent the Go garbage collector from freeing the whole PodLists.
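The underlying Go behavior can be shown without any Kubernetes types. The following sketch (names like `keepOne` are made up for illustration) keeps a pointer to a single element of a large slice, which keeps the entire backing array reachable until that pointer is dropped:

```go
package main

import (
	"fmt"
	"runtime"
)

type pod struct {
	name    string
	payload [1024]byte // make each element noticeably large
}

// keepOne returns a pointer to a single element of a freshly built slice.
// The pointer points into the slice's backing array, so the whole array
// (all 100000 elements) stays reachable as long as the pointer does.
func keepOne() *pod {
	items := make([]pod, 100000)
	for i := range items {
		items[i].name = fmt.Sprintf("pod-%d", i)
	}
	return &items[0]
}

func main() {
	var m runtime.MemStats

	p := keepOne()
	runtime.GC()
	runtime.ReadMemStats(&m)
	fmt.Printf("retained pointer %q, heap in use: %d MiB\n", p.name, m.HeapInuse>>20)

	p = nil // drop the last reference; the backing array becomes collectable
	runtime.GC()
	runtime.ReadMemStats(&m)
	fmt.Printf("after release, heap in use: %d MiB\n", m.HeapInuse>>20)
}
```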
The code causing the memory leak is in `meta.EachListItem`:
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apimachinery/pkg/api/meta/help.go#L115
Replacing `found = append(found, item)` with `found = append(found, item.DeepCopyObject())` can fix the problem:
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/tools/cache/reflector.go#L453
What you expected to happen:
Calling DeepCopyObject has some memory overhead, so I wonder whether there is a better solution for this.
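In the meantime, a possible user-side workaround (just a sketch, not the change proposed above; the `interface{}` handler signature and the type assertion are assumptions, while `framework.GetPodKey` and the `cache` map come from the snippet at the top) is to deep-copy each pod before putting it into our own map, so the map never pins the backing array of a relisted PodList. The informer's own store still references the most recent list, but old lists no longer accumulate:

```go
// addPodToCache stores a deep copy, so the map does not retain a pointer
// into the Items backing array of the PodList produced by a relist.
func addPodToCache(obj interface{}) {
	pod, ok := obj.(*v1.Pod)
	if !ok {
		return
	}
	key, _ := framework.GetPodKey(pod)
	cache[key] = pod.DeepCopy()
}
```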
How to reproduce it (as minimally and precisely as possible):
```go
package main

import (
	"fmt"
	goruntime "runtime"
	"time"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
)

func main() {
	leak() // switch to unleak() to compare memory behavior
}

// unleak copies each pod out of the PodList before storing it, so the
// PodList's backing array can be garbage collected: RES stays stable.
func unleak() {
	store := make(map[string]*v1.Pod)
	var index int
	for {
		plist := generatePod(index)
		for _, pod := range plist.Items {
			pod := pod // copy the loop variable; &pod does not point into plist.Items
			store[pod.Name] = &pod
		}
		time.Sleep(time.Second * 2)
		index++
		goruntime.GC()
		fmt.Println("==unleak==", index, len(store))
	}
}

// leak stores the pointers yielded by meta.EachListItem. They point into the
// PodList's Items backing array, so every previously generated PodList stays
// reachable through the store map: RES grows monotonically.
func leak() {
	store := make(map[string]*v1.Pod)
	var index int
	var items []runtime.Object
	for {
		items = items[:0]
		plist := generatePod(index)
		meta.EachListItem(plist, func(obj runtime.Object) error {
			items = append(items, obj)
			return nil
		})
		for _, item := range items {
			pod := item.(*v1.Pod)
			store[pod.Name] = pod
		}
		time.Sleep(time.Second * 2)
		index++
		goruntime.GC()
		fmt.Println("==leak==", index, len(store))
	}
}

// generatePod builds a PodList with (100000 - num) pods named pod-<i>.
func generatePod(num int) *v1.PodList {
	var plist v1.PodList
	for i := 0; i < 100000-num; i++ {
		pod := v1.Pod{
			ObjectMeta: metav1.ObjectMeta{
				Name: fmt.Sprintf("pod-%d", i),
			},
		}
		plist.Items = append(plist.Items, pod)
	}
	return &plist
}
```
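For comparison, a variant of `leak()` that deep-copies every item (mirroring the `item.DeepCopyObject()` suggestion above) behaves like `unleak()`. The function below is not part of the original reproduction; it is a sketch meant to be added to the same program, since it reuses `generatePod` and the same imports:

```go
// leakFixed is a hypothetical variant of leak() that deep-copies every object
// returned by meta.EachListItem before storing it, so no pointer into the
// PodList's backing array is retained and each PodList can be collected.
func leakFixed() {
	store := make(map[string]*v1.Pod)
	var index int
	for {
		plist := generatePod(index)
		meta.EachListItem(plist, func(obj runtime.Object) error {
			pod := obj.(*v1.Pod).DeepCopy()
			store[pod.Name] = pod
			return nil
		})
		time.Sleep(time.Second * 2)
		index++
		goruntime.GC()
		fmt.Println("==leakFixed==", index, len(store))
	}
}
```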

Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`): 1.17
- Cloud provider or hardware configuration:
- OS (e.g: `cat /etc/os-release`):
- Kernel (e.g. `uname -a`):
- Install tools:
- Network plugin and version (if this is a network-related bug):
- Others:
About this issue
- State: closed
- Created 3 years ago
- Comments: 21 (15 by maintainers)
The leak here is caused by the Go slice GC mechanism; there is no memory leak in the `for ... range` mode because of the copy semantics of `v` in `for _, v := range slice`. If the `unleak` function is changed to the following code (equivalent to the reflection-based mode used in `meta.EachListItem`), the RES value increases monotonically as well. I suggest adding `DeepCopy` to `meta.EachListItem` to alleviate this situation. @wojtek-t PTAL
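The referenced code block is not included above. A sketch of what such a modified `unleak` presumably looks like, assuming it stores the address of each slice element directly (which is effectively what the reflection path in `meta.EachListItem` hands back), is shown below; it reuses `generatePod` from the reproduction program:

```go
// unleakModified stores &plist.Items[i] directly, which points into the Items
// backing array of each generated PodList. Every stored pointer therefore keeps
// the whole PodList alive, so RES keeps growing, just as in leak().
func unleakModified() {
	store := make(map[string]*v1.Pod)
	var index int
	for {
		plist := generatePod(index)
		for i := range plist.Items {
			pod := &plist.Items[i]
			store[pod.Name] = pod
		}
		time.Sleep(time.Second * 2)
		index++
		goruntime.GC()
		fmt.Println("==unleakModified==", index, len(store))
	}
}
```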