kepler: Kubelet update is blocked by network IO?
While adding a benchmark for updateKubeletMetrics, I found what looks like blocking network IO. Here is the tracing stack:
```go
func GetContainerMetrics() (containerCPU, containerMem map[string]float64, nodeCPU, nodeMem float64, retErr error) {
	return podLister.ListMetrics()
}

// ListMetrics accesses Kubelet's metrics endpoint and obtains pod and node metrics.
func (k *KubeletPodLister) ListMetrics() (containerCPU, containerMem map[string]float64, nodeCPU, nodeMem float64, retErr error) {
	resp, err := httpGet(metricsURL)
	...
```
Hence, each time updateKubeletMetrics runs, before it loops over containers/pods, there is network IO to fetch and parse the Kubelet metrics. This IO is blocking and accounts for the major time cost, which makes a container-based benchmark less meaningful.
Suggestion: decouple this network IO from the code path.
And by the way, why do we rebuild these constants for each container?

```go
cpuMetricName := collector_metric.AvailableKubeletMetrics[0]
memMetricName := collector_metric.AvailableKubeletMetrics[1]
```
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 18 (1 by maintainers)
Thanks @rootfs, it makes sense.
So it is also related to the discussion in #605 and #558 about using an external lib to collect cgroup metrics. The external lib should support both v1 and v2; then we will not need the Kubelet metrics anymore.
The following is my mental map: for cgroup v1, the Kubelet metrics are the way to go.