dashboard: Display failed shoot constraints
What would you like to be added:
In the shoot status, Gardener already publishes the so-called “constraints”, e.g.
status:
  constraints:
    - type: HibernationPossible
      status: 'False'
      lastTransitionTime: '2021-12-08T08:59:44Z'
      lastUpdateTime: '2021-12-08T02:56:17Z'
      reason: ProblematicWebhooks
      message: >-
        ValidatingWebhookConfiguration "opa-validating-webhook" is problematic:
        webhook "validating-webhook.openpolicyagent.org" with failurePolicy
        "Ignore" and 30s timeout might prevent worker nodes from properly
        joining the shoot cluster
    - type: MaintenancePreconditionsSatisfied
      status: 'False'
      lastTransitionTime: '2021-12-08T08:59:44Z'
      lastUpdateTime: '2021-12-08T02:56:17Z'
      reason: ProblematicWebhooks
      message: >-
        ValidatingWebhookConfiguration "opa-validating-webhook" is problematic:
        webhook "validating-webhook.openpolicyagent.org" with failurePolicy
        "Ignore" and 30s timeout might prevent worker nodes from properly
        joining the shoot cluster
Also see documentation.
It would be good to prominently display failed constraints in the shoot’s details page.
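As a rough illustration of what "failed constraints" means for the details page, the sketch below filters the `status.constraints` list for entries whose `status` is `'False'`. The `ShootConstraint` interface and the `failedConstraints` helper are hypothetical names chosen for this example, not existing dashboard code; the field names follow the YAML shown above.

```typescript
// Shape of one entry under status.constraints, following the YAML example above.
// This interface is an assumption for illustration, not the dashboard's actual type.
interface ShootConstraint {
  type: string;
  status: string; // 'False' marks a failed constraint in the example above
  reason?: string;
  message?: string;
  lastTransitionTime?: string;
  lastUpdateTime?: string;
}

// Hypothetical helper: returns the constraints that a details page could highlight.
// A missing constraints list is treated as "nothing failed".
function failedConstraints(constraints: ShootConstraint[] = []): ShootConstraint[] {
  return constraints.filter((constraint) => constraint.status === 'False');
}

// Example data: one failed constraint taken from the issue, one satisfied constraint.
const exampleConstraints: ShootConstraint[] = [
  {
    type: 'HibernationPossible',
    status: 'False',
    reason: 'ProblematicWebhooks',
    message: 'ValidatingWebhookConfiguration "opa-validating-webhook" is problematic',
  },
  { type: 'MaintenancePreconditionsSatisfied', status: 'True' },
];

for (const constraint of failedConstraints(exampleConstraints)) {
  console.log(`${constraint.type}: ${constraint.reason ?? 'unknown reason'}`);
}
```

Filtering on the literal `'False'` keeps constraints with any other status out of the failed list; whether those should be surfaced as well is a separate design decision.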
Why is this needed:
Problematic webhook configurations and similar issues are often the cause of other problems in the cluster (e.g. worker nodes not joining the cluster) that are visible in the dashboard, e.g. in the health checks.
- When operators start investigating such issues, it would be helpful to make them aware of the failed constraints early on, because this might speed up the investigation.
- When users notice such issues, they might already be able to help themselves by looking at the failed constraints’ messages.
/kind enhancement
/area ops-productivity
About this issue
- State: closed
- Created 3 years ago
- Comments: 28 (28 by maintainers)
Can you add the link to the best practices (https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#best-practices-and-warnings) so that the end-users have a chance to check what might be wrong?
As for the error code, please open an issue at g/g.
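One way the best-practices link requested above could be attached to a failed constraint is a small lookup keyed by the constraint's `reason`. This is only a sketch: the `constraintHelpLinks` map and the `helpLinkForReason` function are hypothetical names, and the only entry is the `ProblematicWebhooks` reason and the URL quoted in this thread.

```typescript
// Hypothetical lookup from a constraint reason to a help link.
// Only the reason and URL mentioned in this issue are included.
const constraintHelpLinks = new Map<string, string>([
  [
    'ProblematicWebhooks',
    'https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#best-practices-and-warnings',
  ],
]);

// Returns the help link for a reason, or undefined if none is known.
function helpLinkForReason(reason?: string): string | undefined {
  if (!reason) {
    return undefined;
  }
  return constraintHelpLinks.get(reason);
}

const link = helpLinkForReason('ProblematicWebhooks');
if (link) {
  console.log(`See ${link} for details.`);
}
```

A `Map` is used instead of a plain object so that reasons such as `toString` cannot accidentally resolve to inherited properties.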
We could do something like this
Is this prominent enough? IDK… users already ignored the warning but maybe a red error with user error icon will help to make them aware. If we want this (or something similar) we need to talk about the texts as well as the implementation. But first let’s clarify if this is the direction we want to go.
Don’t get confused by the error message (Shoot cluster has been hibernated.) - I had no cluster with this error and I faked it.