kiali: Multi-cluster discovery leads to time out while fetching server configs, blocking access
I installed Kiali 1.33.1 and it does not load on one cluster in nonprod and prod, but it works on every other cluster (although login is sometimes slow). I had the same experience with 1.33.0.
When I revert back to 1.32.0 it works.
The browser says:
You are logged in, but there was a problem when fetching some required server configurations. Please, try refreshing the page.
All I see in the logs is:
2021-04-26T19:00:50Z INF Not handling OpenId code flow authentication: No nonce code present. Login window timed out.
Kiali server helm values
auth:
  openid:
    client_id: [redacted]
    disable_rbac: true
    issuer_uri: [redacted]
    username_claim: email
  strategy: openid
deployment:
  affinity:
    pod_anti:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - prometheus
          topologyKey: topology.kubernetes.io/zone
        weight: 100
  ingress_enabled: false
  node_selector:
    cloud.google.com/gke-nodepool: monitoring
  pod_anti:
    hack: true
  view_only_mode: true
external_services:
  grafana:
    auth:
      password: [redacted]
      type: basic
      username: admin
    in_cluster_url: http://prometheus-stack-core-grafana:80/
    url: https://core-forbes-development.grafana.forbes.com
  prometheus:
    url: http://core-prometheus:9090/
fullnameOverride: kiali-core
istio_namespace: istio-system
nameOverride: kiali-core
About this issue
- State: closed
- Created 3 years ago
- Comments: 30 (13 by maintainers)
Commits related to this issue
- Reduce number of API calls to discover remote Kialis A "walk" through all remote cluster namespaces was being done to discover remote Kiali instances (that is, one query per namespace, per cluster). ... — committed to israel-hdez/swscore by israel-hdez 3 years ago
- Reduce number of API calls to discover remote Kialis (#3970) * Reduce number of API calls to discover remote Kialis A "walk" through all remote cluster namespaces was being done to discover remote... — committed to kiali/kiali by israel-hdez 3 years ago
- Reduce number of API calls to discover remote Kialis (#3970) * Reduce number of API calls to discover remote Kialis A "walk" through all remote cluster namespaces was being done to discover remote... — committed to israel-hdez/swscore by israel-hdez 3 years ago
Hello @rwong2888, Kiali v1.34.1 has been released. It should contain the fix you need.
Please try it and tell us if it fixes the login issue.
We can’t commit to it being merged today. In any case, if it doesn’t make it into 1.34.0, it will be cherry-picked and made available in 1.34.1.
I think for the release, it’s possible to have a proper fix, rather than just extending the timeout.
I was mentioning a possible workaround in case it was possible to adjust some settings, so that you can use version 1.33 straightaway. But Kiali has these timeouts hard-coded. So, since a release is needed anyway, I think it’s better to do a proper fix rather than a workaround.
Kiali is doing a “walk” over every namespace on each “remote” cluster (skipping the local one) to discover other Kiali instances. I guess the cluster where Kiali works OK is the one with the greatest number of namespaces, and the rest of the clusters have fewer namespaces. Since this Kiali has a smaller number of remote namespaces to walk through, I guess it can finish in time, while the other Kiali instance can’t because of the larger list it needs to “walk”.
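To make the cost of the “walk” concrete, here is a rough simulation rather than Kiali's actual code. `FakeCluster`, `make_cluster`, `discover_by_walk` and `discover_cluster_wide` are all hypothetical names, and the single cluster-wide query is only my assumption of one way the number of calls could be reduced; the real change in #3970 may work differently. The point is just that the walk issues one request per remote namespace, so its duration grows with the namespace count.

```python
from dataclasses import dataclass


@dataclass
class FakeCluster:
    """Hypothetical stand-in for a remote cluster API that counts requests."""
    name: str
    services: dict  # namespace name -> list of service names
    api_calls: int = 0

    def list_namespaces(self):
        self.api_calls += 1
        return list(self.services)

    def list_services(self, namespace):
        self.api_calls += 1
        return self.services[namespace]

    def list_services_all_namespaces(self):
        # One request that returns the services of every namespace at once.
        self.api_calls += 1
        return [(ns, svc) for ns, svcs in self.services.items() for svc in svcs]


def make_cluster(name, app_namespaces):
    """Build a cluster with N app namespaces plus istio-system hosting Kiali."""
    services = {f"ns-{i}": ["app"] for i in range(app_namespaces)}
    services["istio-system"] = ["istiod", "kiali"]
    return FakeCluster(name, services)


def discover_by_walk(cluster):
    """One query per namespace: cost grows with the number of namespaces."""
    found = []
    for ns in cluster.list_namespaces():
        for svc in cluster.list_services(ns):
            if svc == "kiali":
                found.append((ns, svc))
    return found


def discover_cluster_wide(cluster):
    """Assumed alternative: a single query across all namespaces."""
    return [(ns, svc) for ns, svc in cluster.list_services_all_namespaces()
            if svc == "kiali"]


if __name__ == "__main__":
    walked = make_cluster("remote-a", 200)
    discover_by_walk(walked)
    single = make_cluster("remote-a", 200)
    discover_cluster_wide(single)
    print("walk:", walked.api_calls, "calls; cluster-wide:", single.api_calls)
```

With 200 app namespaces, the walk needs one request to list namespaces plus one per namespace, while the cluster-wide variant needs a single request. If every request has some latency, this would explain why only the Kiali instances with many remote namespaces to visit hit the hard-coded timeout.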