kubeflow: API calls on centraldashboard become slow with a large number of profiles

/kind bug

What steps did you take and what happened:

We started seeing slow startup times when a user loads the Kubeflow dashboard, and after some tracing we tracked this down to a slow call to /api/workgroup/env-info.

Further investigation led us to the kfam service, where obtaining role bindings takes 3-6 seconds:

GET /kfam/v1/bindings?user=$USER ReadBinding 6.064392368s
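
For anyone who wants to reproduce the measurement outside the dashboard, here is a minimal sketch that times repeated calls to the bindings endpoint. Only the path /kfam/v1/bindings?user=... comes from the log line above. The base URL, the user value, and the helper names (time_call, fetch_bindings, summarize, measure_bindings) are assumptions for illustration, and any auth headers your deployment requires are left out.

```python
import statistics
import time
import urllib.parse
import urllib.request


def time_call(fn, repeats=5):
    """Call fn() `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def fetch_bindings(base_url, user):
    """GET <base_url>/kfam/v1/bindings?user=<user> and return the raw body.

    base_url is an assumption: point it at wherever kfam is reachable
    in your cluster (for example through a port-forward).
    """
    query = urllib.parse.urlencode({"user": user})
    url = f"{base_url}/kfam/v1/bindings?{query}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def summarize(durations):
    """Reduce a list of durations to min / median / max, in seconds."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


def measure_bindings(base_url, user, repeats=5):
    """Time `repeats` bindings lookups for one user and summarize them."""
    return summarize(time_call(lambda: fetch_bindings(base_url, user), repeats))
```

Calling measure_bindings("http://localhost:8081", "user@example.com") (both values are placeholders) before and after adding profiles should show whether the latency tracks the profile count.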

What did you expect to happen:

The call should succeed quickly and not slow down as the profile count increases.

Anything else you would like to add:

We have approximately 123 profiles in our environment for our users. The call seems to have become slower as the profile count increased.

Environment:

  • Kubeflow version: 1.0.1
  • kfctl version: 1.0.1
  • Kubernetes platform: AKS
  • Kubernetes version: 1.15.10
  • OS: Ubuntu 16.04

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 5
  • Comments: 17 (5 by maintainers)

Most upvoted comments

@derekwyatt a PR is in review: https://github.com/kubeflow/kubeflow/pull/5202. You can help by building an image from the PR’s code and testing whether it fixes your issue.