rancher: Listing many large resources can be slow on the ember UI
What kind of request is this (question/bug/enhancement/feature request): Bug/ possibly enhancement
Steps to reproduce (least amount of steps as possible): Deploy 30-50 Helm releases to one project, with a few revisions for each. With every release or iteration, Rancher becomes slower and slower, and at some point I am no longer able to access the cluster via the web UI at all; it just shows the loading icon and kicks you back out to the cluster list page.
Result:
Other details that may be helpful: Helm 3 stores its state in Kubernetes secrets, and these secrets appear to be very large. It looks like Rancher tries to pull all Kubernetes secrets when I try to access the UI.
It would be better to exclude Helm release secret types, or at least not to include that sort of secret (the Helm 3 secret type) in cluster polling.
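A minimal sketch of the exclusion suggested above, written against the plain Kubernetes API rather than Rancher's own code. It assumes Helm 3 release secrets carry the type `helm.sh/release.v1` and that the secrets list endpoint accepts a `fieldSelector` on `type`; the function names are hypothetical and nothing here reflects how Rancher actually builds its requests.

```python
from urllib.parse import urlencode

# Secret type used by Helm 3 for release storage.
HELM_RELEASE_TYPE = "helm.sh/release.v1"


def secrets_list_path(namespace, exclude_type=HELM_RELEASE_TYPE):
    """Build a Kubernetes API path that lists secrets while skipping one type.

    Hypothetical helper: the server-side filter is expressed as a field
    selector, so excluded secrets never travel over the wire.
    """
    query = urlencode({"fieldSelector": f"type!={exclude_type}"})
    return f"/api/v1/namespaces/{namespace}/secrets?{query}"


def drop_helm_releases(secrets):
    """Client-side fallback: filter secret objects (plain dicts) already fetched."""
    return [s for s in secrets if s.get("type") != HELM_RELEASE_TYPE]
```

The server-side variant is the one that would help with the slowdown described here, since the client-side filter still downloads every large secret before discarding it.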
Environment information
- Rancher version (`rancher/rancher`/`rancher/server` image tag or shown bottom left in the UI): 2.3.5
- Installation option (single install/HA): HA. Rancher is deployed on bare metal k8s cluster and another deployment is ingke
Cluster information
- Cluster type (Hosted/Infrastructure Provider/Custom/Imported):
- Machine type (cloud/VM/metal) and specifications (CPU/memory):
- Kubernetes version (use `kubectl version`):
(paste the output here)
- Docker version (use `docker version`):
(paste the output here)
gz#11894
gz#11629
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 5
- Comments: 15 (4 by maintainers)
“api response times are reasonable” - I’m not sure that’s a true statement from a customer/user point of view. I see the improvement that has been made, but I don’t understand how anything taking 7 seconds is considered “reasonable”. It seems like there needs to be a fundamental logic change in how those resources are delivered.
Testing this will require ensuring that any page in the ember UI that requires listing resources is functioning properly. Additionally, we should generate ~200 secrets and 200 configmaps with ~125KB of data each to reproduce the excessive response times. Before these fixes are applied, listing these resources should respond in ~30-40 seconds. After these fixes are applied, I would expect to see response times closer to 8-10 seconds.
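One way to produce the test data described above, as a rough sketch rather than the procedure the maintainers used. It writes a single JSON `List` manifest of secrets and configmaps with random payloads; the resource names, the `default` namespace, the `blob` key, and the output filename are all assumptions made for illustration.

```python
import base64
import json
import os


def make_secret(name, size_bytes, namespace="default"):
    """Opaque secret whose single key decodes to size_bytes of random data."""
    payload = base64.b64encode(os.urandom(size_bytes)).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": {"blob": payload},
    }


def make_configmap(name, size_bytes, namespace="default"):
    """ConfigMap holding size_bytes of text (hex digits, so it is valid UTF-8)."""
    text = os.urandom(size_bytes // 2).hex()
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        "data": {"blob": text},
    }


def build_manifest(count=200, size_bytes=125 * 1024):
    """Return a v1 List with `count` secrets followed by `count` configmaps."""
    items = [make_secret(f"large-secret-{i}", size_bytes) for i in range(count)]
    items += [make_configmap(f"large-configmap-{i}", size_bytes) for i in range(count)]
    return {"apiVersion": "v1", "kind": "List", "items": items}


if __name__ == "__main__":
    # Hypothetical output file; load it with `kubectl create -f large-resources.json`.
    with open("large-resources.json", "w") as fh:
        json.dump(build_manifest(), fh)
```

Using `kubectl create` rather than `kubectl apply` avoids copying each large payload into the last-applied-configuration annotation.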
cc: @sgapanovich
The proposed solution would be also useful for ConfigMaps.
We had the same issue. This is what happened:
-> calls to /secrets went from ~20sec to <1sec
Rancher 2.5.5. We are experiencing the same issue on our production cluster. The “secrets?limit=-1&sort=name” call takes >25s to finish. Our cluster is not large: just several projects with a small number of Kubernetes objects inside. Our users are forced to use the Cluster Explorer instead of the Rancher UI, since it is very slow. Any ideas?
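The `limit=-1` in the call above asks for every secret in one response. For comparison, here is a minimal sketch of chunked listing in the style of the Kubernetes API, which accepts `limit` and `continue` query parameters and returns the next token in `metadata.continue`. Whether the Rancher endpoint in question supports the same scheme is not established in this thread; `list_in_chunks` and `fake_fetch` are hypothetical names, and the fetch function is an in-memory stand-in for a real HTTP call.

```python
from typing import Callable, Iterator, List, Optional


def list_in_chunks(
    fetch_page: Callable[[int, Optional[str]], dict],
    page_size: int = 50,
) -> Iterator[dict]:
    """Yield items page by page, following the continue token until it is empty."""
    token = None
    while True:
        page = fetch_page(page_size, token)
        yield from page.get("items", [])
        token = page.get("metadata", {}).get("continue")
        if not token:
            break


def fake_fetch(items: List[dict]) -> Callable[[int, Optional[str]], dict]:
    """In-memory stand-in for an API call; the token is just the next offset."""
    def fetch(limit: int, token: Optional[str]) -> dict:
        start = int(token) if token else 0
        end = start + limit
        meta = {"continue": str(end)} if end < len(items) else {}
        return {"items": items[start:end], "metadata": meta}
    return fetch
```

With chunking, the UI could render the first page while later pages are still loading, instead of waiting for one multi-second response.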
This issue makes the UI almost unusable in any cluster with a large number of secrets. It doesn’t have to be Helm releases; any significant number of large secrets makes the UI unresponsive, because Rancher tries to get all secrets every time you select a Project. Worse, if there are enough pending requests (just 3-4 people clicking is enough in my case), the pod starts to fail its probes, dies, and starts a long re-election during which Rancher is basically completely unavailable.