kubernetes: [Flaky test] Advanced Audit tests are flaky in 1.14-blocking
Which jobs are flaking: https://testgrid.k8s.io/sig-release-1.14-blocking#gce-cos-1.14-default
Which test(s) are flaking:
- [sig-auth] Advanced Audit [DisabledForLargeClusters] should audit API calls to create, get, update, patch, delete, list, watch secrets.
- [sig-auth] Advanced Audit [DisabledForLargeClusters] should audit API calls to create, get, update, patch, delete, list, watch configmaps.
Reasons for flaking: These seem to be mostly timeouts, for example in runs 9538, 9543, and 9544.
Since when has it been flaking: Since 2/20.
Testgrid link: https://testgrid.k8s.io/sig-release-1.14-blocking#gce-cos-1.14-default
Anything else we need to know: Latest failure logs can be found in https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-cos-k8sbeta-default/9544.
cc @mortent @kacole2 @mariantalla @alejandrox1
/kind flaky-test /priority important-soon /sig auth
About this issue
- State: closed
- Created 5 years ago
- Comments: 19 (18 by maintainers)
The methodology the test uses to scan for audit events is inherently flaky in the face of server-side log rotation.
Spoke with @tallclair about this, and I think we should do the following:
- Mark the tests [flaky] while we resolve issues.
- Once DynamicAudit goes to beta, we should be able to eliminate the reliance on the log files by using a webhook to verify the audit stream instead.
Thanks for investigating @pbarker!
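To make the webhook idea concrete, here is a minimal sketch of a receiver that collects audit events as they are delivered, so verification does not depend on log files that may be rotated. It is not the e2e test's actual implementation. The names `collector`, `expectation`, and the `audit-test` namespace are hypothetical, the event struct covers only the `verb` and `objectRef` fields of an audit `EventList`, and an `httptest` server stands in for the endpoint the API server would post to.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// objectRef and auditEvent hold only the audit event fields this sketch inspects.
type objectRef struct {
	Resource  string `json:"resource"`
	Namespace string `json:"namespace"`
}

type auditEvent struct {
	Verb      string     `json:"verb"`
	ObjectRef *objectRef `json:"objectRef"`
}

type auditEventList struct {
	Items []auditEvent `json:"items"`
}

// expectation is a hypothetical key describing one API call the test expects to be audited.
type expectation struct {
	Verb, Resource, Namespace string
}

// collector remembers every (verb, resource, namespace) tuple it has received.
type collector struct {
	mu   sync.Mutex
	seen map[expectation]bool
}

func newCollector() *collector {
	return &collector{seen: map[expectation]bool{}}
}

// record stores the events of one batch; events without an objectRef are skipped.
func (c *collector) record(list auditEventList) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for _, ev := range list.Items {
		if ev.ObjectRef == nil {
			continue
		}
		key := expectation{Verb: ev.Verb, Resource: ev.ObjectRef.Resource, Namespace: ev.ObjectRef.Namespace}
		c.seen[key] = true
	}
}

// ServeHTTP decodes a posted event list and records it.
func (c *collector) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	var list auditEventList
	if err := json.NewDecoder(r.Body).Decode(&list); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	c.record(list)
	w.WriteHeader(http.StatusOK)
}

// missing returns the expectations that have not been observed yet.
func (c *collector) missing(want []expectation) []expectation {
	c.mu.Lock()
	defer c.mu.Unlock()
	var out []expectation
	for _, e := range want {
		if !c.seen[e] {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	c := newCollector()
	srv := httptest.NewServer(c)
	defer srv.Close()

	// Stand-in for the API server delivering one batch of audit events.
	batch := []byte(`{"kind":"EventList","apiVersion":"audit.k8s.io/v1","items":[
		{"verb":"create","objectRef":{"resource":"secrets","namespace":"audit-test"}},
		{"verb":"get","objectRef":{"resource":"secrets","namespace":"audit-test"}}]}`)
	resp, err := http.Post(srv.URL, "application/json", bytes.NewReader(batch))
	if err != nil {
		panic(err)
	}
	resp.Body.Close()

	want := []expectation{
		{Verb: "create", Resource: "secrets", Namespace: "audit-test"},
		{Verb: "get", Resource: "secrets", Namespace: "audit-test"},
		{Verb: "delete", Resource: "secrets", Namespace: "audit-test"},
	}
	fmt.Println("missing:", c.missing(want))
}
```

In this run only the `delete` expectation is reported as missing, since the sample batch contains just `create` and `get`. A real test would presumably poll `missing` until it is empty or a deadline passes, which sidesteps rotation because events are captured when they arrive instead of being re-read from disk.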