dashboard: UI becomes unresponsive when there are a large number of completed/running taskruns
Describe the bug
The browser crashes because too many TaskRuns are loaded at the same time.
Expected behaviour
The browser does not crash simply from opening the page
Steps to reproduce the bug
Have ~1500-2000 completed and/or running TaskRuns. The same probably happens for other resource types.
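One way to get a cluster into this state is to generate the TaskRuns in bulk. The following is a minimal sketch, not part of the original report: the `makeTaskRuns` helper, the `load-test-` name prefix, the `busybox` image, and the `default` namespace are all assumptions chosen for illustration. It prints a Kubernetes `List` as JSON.

```typescript
// Illustrative only: helper name, name prefix, image, and namespace are assumptions.
interface TaskRunManifest {
  apiVersion: "tekton.dev/v1beta1";
  kind: "TaskRun";
  metadata: { generateName: string; namespace: string };
  spec: {
    taskSpec: { steps: { name: string; image: string; script: string }[] };
  };
}

// Build `count` identical TaskRuns. Each uses generateName, so the API
// server assigns a unique name when the object is created.
function makeTaskRuns(count: number, namespace = "default"): TaskRunManifest[] {
  return Array.from({ length: count }, (): TaskRunManifest => ({
    apiVersion: "tekton.dev/v1beta1",
    kind: "TaskRun",
    metadata: { generateName: "load-test-", namespace },
    spec: {
      taskSpec: {
        steps: [{ name: "echo", image: "busybox", script: "echo hello" }],
      },
    },
  }));
}

// Wrap the TaskRuns in a List so they can be submitted in one call.
const list = { apiVersion: "v1", kind: "List", items: makeTaskRuns(2000) };
console.log(JSON.stringify(list));
```

The output can be piped to `kubectl create -f -`. `kubectl apply` does not work here because the manifests use `generateName` rather than a fixed `name`.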
Environment details
- Kubernetes Platform: Kubernetes
- Kubernetes or OpenShift version: AKS 1.22.6
- Install mode (if on OpenShift): Helm chart
- Cloud-provider/provisioner: AKS
- Versions:
  - Tekton Dashboard: v0.24.1
  - Tekton Pipelines: v0.33.2
- Install namespaces:
  - Tekton Dashboard: tekton-pipelines
  - Tekton Pipelines: tekton-pipelines
Additional Info
Some pagination would probably be helpful
About this issue
- State: closed
- Created 2 years ago
- Comments: 18 (10 by maintainers)
Client-side pagination is now available on all list pages in the latest nightly release, e.g. https://storage.googleapis.com/tekton-releases-nightly/dashboard/previous/v20220426-14cce744d6/tekton-dashboard-release.yaml
Thanks again @rouke-broersma @maartengo for reporting the issue and helping to validate the change.
This will be included in the next Dashboard release, v0.26.0 due May 5 - 10.
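For readers unfamiliar with the term, client-side pagination generally means the full list is still fetched, but only one page of it is rendered at a time. The sketch below illustrates that idea only. It is not the Dashboard's actual implementation, and the `paginate` helper and `Page` type are hypothetical names.

```typescript
// Hypothetical helper, not taken from the Tekton Dashboard source.
interface Page<T> {
  items: T[];         // the slice to render
  page: number;       // 1-based page index after clamping
  totalPages: number;
  totalItems: number;
}

function paginate<T>(all: readonly T[], page: number, pageSize: number): Page<T> {
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be a positive integer");
  }
  const totalItems = all.length;
  // An empty list still has one (empty) page.
  const totalPages = Math.max(1, Math.ceil(totalItems / pageSize));
  // Clamp so a stale or invalid page number still resolves to a valid page.
  const current = Math.min(Math.max(1, Math.trunc(page) || 1), totalPages);
  const start = (current - 1) * pageSize;
  return {
    items: all.slice(start, start + pageSize),
    page: current,
    totalPages,
    totalItems,
  };
}

// Usage: only 25 of 2000 entries are handed to the rendering layer.
const runs = Array.from({ length: 2000 }, (_, i) => `taskrun-${i}`);
const firstPage = paginate(runs, 1, 25);
console.log(firstPage.items.length, firstPage.totalPages);
```

Because the whole list is still downloaded, this approach mainly reduces rendering cost rather than transfer time.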
Good news! Although the page load time is still ~5 seconds, the page is actually responsive afterwards! This is a huge improvement over the regular freezes we used to have.
I haven’t had time to work on this recently but will pick it up again soon. The existing PR needs a bit of cleanup and some tests, then we should be able to apply the same changes to PipelineRuns. I’ll update the PR by the middle of next week 🤞
I’ve updated the PR with a quick first pass at adding pagination to all list pages using a slightly different approach. I’ll finish cleaning it up early next week and make sure all tests are passing before marking it ready for review.
Initial testing with https://github.com/tektoncd/dashboard/pull/2327 against our dogfooding cluster shaves ~0.5s off the 3s load time for 1300 TaskRuns (<1MB), so at least it’s heading in the right direction… I’ll need to increase the number of resources and have some in progress to get a proper feel for what impact this might have for your use case, but at least it’s not slower 😄
I’ll see if I can publish a test release later today containing the change, otherwise feel free to pull my branch and build it locally. If this works out I’ll need to clean up the change a bit and we’ll likely apply it to all pages (or at least TaskRuns + PipelineRuns to start with).