dashboard: high memory usage and slow performance in large clusters

Installation method: Helm chart
Kubernetes version: 1.17.3
Dashboard version: 2.0.1
Metrics provider: metrics-server 0.3.6
Operating system: Flatcar Linux
Steps to reproduce

In large clusters with more than 1000 pods, the dashboard requires an excessive amount of memory and is very slow when looking at namespaces with many pods or at all namespaces.

Observed result

In a 7000-pod cluster, loading the deployment overview page for all namespaces requires more than 24Gi of memory in the dashboard and puts a high load on the metrics pipeline, which can impact autoscaling workloads as the apiserver fails to answer all requests.

The metrics scraper consumed 5Gi of memory before the dashboard was killed due to OOM.

Looking at just a single namespace with 700 pods in it also consumes more than 8Gi of memory, and the page takes 30 minutes to load.

Expected result

It would be nice if there were some form of pagination for the fetched resources, so that when looking at very large namespaces or at all namespaces the dashboard does not DoS the apiserver and does not die due to memory exhaustion.
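For reference, the Kubernetes API already exposes the building block for this: list requests can be chunked with the `limit` and `continue` parameters. The following is only a minimal client-go sketch of that mechanism (the function name and the page size of 500 are illustrative, not anything the dashboard does today):

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// listPodsPaged fetches pods in fixed-size chunks using the API server's
// limit/continue support, so only one page is held in memory at a time.
func listPodsPaged(ctx context.Context, client kubernetes.Interface, namespace string, pageSize int64) error {
	continueToken := ""
	for {
		pods, err := client.CoreV1().Pods(namespace).List(ctx, metav1.ListOptions{
			Limit:    pageSize,
			Continue: continueToken,
		})
		if err != nil {
			return err
		}
		for _, pod := range pods.Items {
			fmt.Println(pod.Name) // render or aggregate one page at a time
		}
		if pods.Continue == "" {
			return nil // last page reached
		}
		continueToken = pods.Continue
	}
}

func main() {
	config, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)
	// An empty namespace means "all namespaces"; 500 is an arbitrary page size.
	if err := listPodsPaged(context.Background(), client, "", 500); err != nil {
		panic(err)
	}
}
```

Something along these lines would let the dashboard render incrementally instead of holding every object in memory at once.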

On pages where pagination may not work (like the CPU usage graphs in the namespace overview), the dashboard should abort when it is not feasible to display the necessary data.

Comments

If that is out of scope, a way to disable the all-namespaces view would help for clusters that are configured to allow authenticated users to see resources like pods and deployments.

Alternatively, a way to restrict the dashboard to certain namespaces would make it possible to deploy multiple dashboard instances with limited view rights in multi-tenant namespaces.
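Until something like that exists, the "limited view rights" part can be approximated with plain RBAC: give each dashboard instance's service account a namespaced Role/RoleBinding instead of a cluster-wide binding. Below is only a sketch using client-go; the namespace, service account name, and resource list are hypothetical and would need to match the actual deployment:

```go
package main

import (
	"context"

	rbacv1 "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// grantNamespacedView gives a dashboard instance's service account read-only
// access to a single namespace via a Role/RoleBinding, instead of binding it
// to a cluster-wide view role.
func grantNamespacedView(ctx context.Context, client kubernetes.Interface, namespace, serviceAccount, dashboardNamespace string) error {
	role := &rbacv1.Role{
		ObjectMeta: metav1.ObjectMeta{Name: "dashboard-view", Namespace: namespace},
		Rules: []rbacv1.PolicyRule{{
			APIGroups: []string{"", "apps"},
			Resources: []string{"pods", "deployments", "replicasets"},
			Verbs:     []string{"get", "list", "watch"},
		}},
	}
	if _, err := client.RbacV1().Roles(namespace).Create(ctx, role, metav1.CreateOptions{}); err != nil {
		return err
	}
	binding := &rbacv1.RoleBinding{
		ObjectMeta: metav1.ObjectMeta{Name: "dashboard-view", Namespace: namespace},
		Subjects: []rbacv1.Subject{{
			Kind:      "ServiceAccount",
			Name:      serviceAccount,
			Namespace: dashboardNamespace,
		}},
		RoleRef: rbacv1.RoleRef{
			APIGroup: "rbac.authorization.k8s.io",
			Kind:     "Role",
			Name:     "dashboard-view",
		},
	}
	_, err := client.RbacV1().RoleBindings(namespace).Create(ctx, binding, metav1.CreateOptions{})
	return err
}

func main() {
	config, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)
	// "team-a" and "dashboard-team-a" are hypothetical tenant namespace and
	// service account names for one dashboard instance.
	if err := grantNamespacedView(context.Background(), client, "team-a", "dashboard-team-a", "kubernetes-dashboard"); err != nil {
		panic(err)
	}
}
```

This does not limit which namespaces the dashboard UI offers, but it does keep each instance from reading resources outside its tenant's namespace.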

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 18 (9 by maintainers)

Most upvoted comments

I have opened #5320 to keep track of this issue. This is definitely a long-term task, as it requires major refactoring of our backend, but it might help with this issue.