netbird: Keycloak idp timeout
We’re running a large Keycloak instance that uses federation with an LDAP directory.
Following your instructions, we created a frontend client and a backend client. Using an empty realm without federation works, but using our regular realm with federation causes timeouts.
Logs show:
infrastructure_files-dashboard-1 | *** - - [13/Dec/2023:16:35:14 +0000] "GET /peers HTTP/1.1" 304 0 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/111.0" "-"
infrastructure_files-dashboard-1 | *** - - [13/Dec/2023:16:35:14 +0000] "GET /static/js/main.643f6421.js HTTP/1.1" 304 0 "https://netbird.***.**/peers" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/111.0" "-"
infrastructure_files-dashboard-1 | *** - - [13/Dec/2023:16:35:14 +0000] "GET /static/media/bars.460b15c2eff2efb309cd0df6df541052.svg HTTP/1.1" 200 356 "https://netbird.***.**/peers" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/111.0" "-"
infrastructure_files-management-1 | 2023-12-13T16:35:14Z INFO management/server/account.go:1518: overriding JWT Domain and DomainCategory claims since single account mode is enabled
infrastructure_files-management-1 | 2023-12-13T16:35:24Z ERRO management/server/http/middleware/access_control.go:46: failed to get user from claims: failed to get account with token claims context deadline exceeded (Client.Timeout or context cancellation while reading body)
infrastructure_files-management-1 | 2023-12-13T16:35:24Z ERRO management/server/http/util/util.go:80: got a handler error: invalid JWT
infrastructure_files-management-1 | 2023-12-13T16:35:24Z ERRO management/server/telemetry/http_api_metrics.go:181: HTTP response 3095261566: GET /api/users status 401
I’m not sure it’s a good idea to fetch all users all the time just to keep the accounts synchronized.
If there’s another way to verify just that single user on login, it could probably solve this issue.
Adding @kbudde for updates.
About this issue
- State: open
- Created 7 months ago
- Reactions: 1
- Comments: 15 (3 by maintainers)
Thanks, @max06; from your logs, it takes around 24s to get 200 users from the Keycloak instance. Assuming 720 entries fetched 200 at a time, this might give us around 96s in total for the 4 requests (4 × 24s).
relevant logs:
We will discuss this option and get back to you. In the meantime, it seems that there has been progress on a PR fixing this on the Keycloak side: https://github.com/keycloak/keycloak/pull/19342
@max06 could you please use this branch to perform some tests? https://github.com/netbirdio/netbird/commits/debug-keycloak-idp/
You need to have the flag --log-level debug set when running the management service. Please share the duration you’ll get in the "Keycloak totalUsersCount took %d ms to handle" debug log message.
Hello @max06, thanks for sharing the results of your test.
Keycloak has an issue open because of a similar case: https://github.com/keycloak/keycloak/issues/10005
The problem seems to be related to Keycloak issuing an LDAP search when we call the /users API.
We can increase the timeout and possibly make the API calls use pagination, but we need to evaluate that.
We don’t have a testing environment at that scale, and you have already built a custom version. Would you run some tests for us?