gatekeeper: Violations not added to Constraint status

What steps did you take and what happened:

I’m not seeing violations in the Constraint status. Violations are logged by the audit Pods, but status.violations is missing from the Constraints.

$ kubectl get k8spspallowprivilegeescalationcontainer.constraints.gatekeeper.sh/gatekeeper-constraints-privilege-escalation -oyaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPAllowPrivilegeEscalationContainer
metadata:
  annotations:
    meta.helm.sh/release-name: gatekeeper-constraints
    meta.helm.sh/release-namespace: gatekeeper-system
  creationTimestamp: "2022-05-10T08:49:52Z"
  generation: 4
  labels:
    app.kubernetes.io/instance: gatekeeper-constraints
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: gatekeeper-constraints
    helm.sh/chart: gatekeeper-constraints-0.0.3
    helm.toolkit.fluxcd.io/name: gatekeeper-constraints
    helm.toolkit.fluxcd.io/namespace: gatekeeper-system
  name: gatekeeper-constraints-privilege-escalation
  resourceVersion: "197352789"
  uid: f1f16822-66c3-4908-957e-9ce089d15524
spec:
  enforcementAction: warn
  match:
    excludedNamespaces:
    - kube-system
    kinds:
    - apiGroups:
      - ""
      kinds:
      - Pod
status:
  byPod:
  - constraintUID: f1f16822-66c3-4908-957e-9ce089d15524
    enforced: true
    id: gatekeeper-audit-779d96b494-9m278
    observedGeneration: 4
    operations:
    - audit
    - mutation-status
    - status
...
$ kubectl logs -l app=gatekeeper,chart=gatekeeper,control-plane=audit-controller,gatekeeper.sh/operation=audit | grep violation_audited | grep warn
...
2022-07-15T13:39:28.703Z        info    controller      Privilege escalation container is not allowed: nginx    {"process": "audit", "audit_id": "2022-07-15T13:33:25Z", "details": {}, "event_type": "violation_audited", "constraint_group": "constraints.gatekeeper.sh", "constraint_api_version": "v1beta1", "constraint_kind": "K8sPSPAllowPrivilegeEscalationContainer", "constraint_name": "gatekeeper-constraints-privilege-escalation", "constraint_namespace": "", "constraint_action": "warn", "resource_group": "", "resource_api_version": "v1", "resource_kind": "Pod", "resource_namespace": "ingress-nginx", "resource_name": "nginx-privileged-disallowed"}
$ kubectl get constraint -o json | jq '.items[].status.violations'
null
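
A quick way to see whether the audit Pod is writing anything at all to the Constraints is to dump the other status fields audit is expected to set (a sketch, assuming the usual auditTimestamp/totalViolations fields):

$ kubectl get constraint -o json \
    | jq '.items[] | {name: .metadata.name, auditTimestamp: .status.auditTimestamp, totalViolations: .status.totalViolations}'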

What did you expect to happen:

Violations listed in the Constraint status, i.e.:

...
status:
  auditTimestamp: "2019-05-11T01:46:13Z"
  enforced: true
  violations:
  - enforcementAction: deny
    kind: Namespace
    message: 'you must provide labels: {"gatekeeper"}'
    name: default

Anything else you would like to add:

--constraint-violations-limit=200
--audit-from-cache=false

Otherwise installed with the Helm chart's default values.
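
The flags actually set on the audit Deployment can be double-checked with something like this (a sketch, assuming the chart's default namespace and deployment name):

$ kubectl -n gatekeeper-system get deploy gatekeeper-audit \
    -o jsonpath='{.spec.template.spec.containers[0].args}' | tr ',' '\n'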

Environment:

  • Gatekeeper version: 3.8.1
  • Kubernetes version: (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.1", GitCommit:"3ddd0f45aa91e2f30c70734b175631bec5b5825a", GitTreeState:"clean", BuildDate:"2022-05-24T12:26:19Z", GoVersion:"go1.18.2", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.8-gke.202", GitCommit:"88deae00580af268497b9656f216cb092b630563", GitTreeState:"clean", BuildDate:"2022-06-03T03:27:52Z", GoVersion:"go1.16.14b7", Compiler:"gc", Platform:"linux/amd64"}

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 23 (10 by maintainers)

Most upvoted comments

One note about this debug session: webhook events do not show up in the audit results reported on the Constraint status, nor do they show up in the audit log (they should show up in the Prometheus metrics for the webhook server).
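
A minimal way to eyeball those webhook metrics is to scrape the controller-manager directly (a sketch, assuming the chart's default deployment name and the default metrics port of 8888):

$ kubectl -n gatekeeper-system port-forward deploy/gatekeeper-controller-manager 8888:8888 &
$ curl -s localhost:8888/metrics | grep '^gatekeeper_'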

The eval_cancel_error is coming from the webhook and should have no bearing on audit results. What this error denotes is that the context for the inbound request was canceled, which means either the API server canceled the query or the query timed out. That may be worth digging into, but it is not the question you are asking.

If you look at the logs for the audit pod (has audit in the name), do you see violations recorded there?

I have a similar issue: I created a resource that violates a constraint, but I can't see any violations on my Constraint resource.

$ kubectl apply -f /tmp/kafkauser.yaml
Warning: [kafkatopiccontrolacls] operation: Describe not allowed
kafkauser.kafka.strimzi.io/msk-av-test created

$ kubectl get kafkatopiccontrolacls.constraints.gatekeeper.sh/kafkatopiccontrolacls -o yaml 
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: KafkaTopicControlACLs
status:
  auditTimestamp: '2023-02-20T10:18:24Z'
...
  totalViolations: 0

I can see that events for this resource are being emitted, so the violation does appear to be caught, but the Constraint status isn't being updated, nor is it reflected in the gatekeeper_violations metric.

$ kubectl get events

7m40s       Warning   WarningAdmission    kafkauser/msk-av-test                                 Admission webhook "validation.gatekeeper.sh" raised a warning for this request, Resource Namespace: strimzi, Constraint: kafkatopiccontrolacls, Message: operation: Describe not allowed
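
For reference, the gatekeeper_violations metric mentioned above can be scraped straight from the audit Pod (a sketch, assuming the default metrics port of 8888):

$ kubectl -n gatekeeper-system port-forward deploy/gatekeeper-audit 8888:8888 &
$ curl -s localhost:8888/metrics | grep gatekeeper_violations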

Furthermore, I do not see any evidence in the audit logs that the existing violations are being picked up, even though the Kubernetes namespace events show that violations exist.

gatekeeper-audit-65bcd78db6-ptrbd manager {"level":"info","ts":1677149995.6172078,"logger":"controller","msg":"Auditing from cache","process":"audit","audit_id":"2023-02-23T10:59:53Z"}
gatekeeper-audit-65bcd78db6-ptrbd manager {"level":"info","ts":1677150099.3739018,"logger":"controller","msg":"Audit opa.Audit() results","process":"audit","audit_id":"2023-02-23T10:59:53Z","violations":0}
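
One thing that may be relevant here: the "Auditing from cache" line suggests the audit Pod is running with --audit-from-cache=true, and in that mode (as far as I understand it) only kinds that are replicated into the cache via Gatekeeper's Config resource get audited. A CRD-backed kind like KafkaUser would then need a syncOnly entry roughly like this (the Strimzi group/version shown are illustrative):

apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: gatekeeper-system
spec:
  sync:
    syncOnly:
      # replicate the kinds the constraint matches so audit-from-cache can see them
      - group: kafka.strimzi.io
        version: v1beta2
        kind: KafkaUser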