trivy-operator: No warnings or failed CIS benchmarks

What steps did you take and what happened: I get different results for CIS benchmarks when I use kube-bench and the trivy operator.

When I run the trivy-operator’s built-in CIS benchmark scanner, I get no failed checks. All 116 checks pass, although I know that at least some of them should fail or produce a warning.

Prometheus output via ServiceMonitor:

trivy_cluster_compliance{container="trivy-operator", description="CIS Kubernetes Benchmarks", endpoint="metrics",  job="trivy-operator", service="trivy-operator", status="Pass", title="CIS Kubernetes Benchmarks v1.23"} 116
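
To sanity-check what the operator actually exposes, the metrics endpoint can also be queried directly. A quick sketch, assuming the chart's default metrics port of 8080 and the security namespace used later in this thread:

# Port-forward to the operator pod and filter for the compliance metric
kubectl -n security port-forward deploy/trivy-operator 8080:8080 &
curl -s localhost:8080/metrics | grep trivy_cluster_compliance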

Also in the clustercompliance resource I can see that no checks fail:

kubectl get clustercompliancereports cis -o yaml
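
The pass/fail counts can also be pulled out directly with JSONPath. A sketch, based on the status.summary structure shown further down in this thread:

# Print only the summary, e.g. {"failCount":36,"passCount":80}
kubectl get clustercompliancereport cis -o jsonpath='{.status.summary}'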

When I run the kube-bench job in the same cluster and context, I get the expected warnings and errors for the cis benchmarks.

kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml
kubectl logs kube-bench-<id> -f
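
Since the job needs a moment to finish, it can help to wait for completion before reading the report. A sketch, assuming the Job in that manifest is named kube-bench:

# Block until the benchmark job completes, then dump its report
kubectl wait --for=condition=complete job/kube-bench --timeout=120s
kubectl logs job/kube-bench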

What did you expect to happen: I expect to get the same warnings and errors for the CIS benchmarks, regardless of which scanner I use.

Helm values:

resources:
  limits:
    cpu: 150m
    memory: 100M
trivy:
  ignoreUnfixed: true
  mode: ClientServer
operator:
  configAuditScannerEnabled: false
  vulnerabilityScannerReportTTL: "86400s"
  vulnerabilityScannerScanOnlyCurrentRevisions: true
  rbacAssessmentScannerEnabled: false
  infraAssessmentScannerEnabled: false
  exposedSecretScannerEnabled: false
compliance:
  reportType: all
  # for test purposes currently set every minute 
  cron: "* * * * *"
serviceMonitor:
  enabled: true
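
With the ServiceMonitor enabled, failed checks could also be alerted on. A minimal sketch, assuming the prometheus-operator PrometheusRule CRD is available and that the metric exposes a status="Fail" series alongside the "Pass" series shown above:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: trivy-compliance  # hypothetical name
spec:
  groups:
    - name: trivy-compliance
      rules:
        - alert: CISBenchmarkChecksFailing
          # fires when any compliance check reports failures
          expr: trivy_cluster_compliance{status="Fail"} > 0
          for: 10m
          labels:
            severity: warning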

Environment:

  • Trivy-Operator (Helm chart version): 0.14.1
  • Trivy-Operator (Image version): ghcr.io/aquasecurity/trivy-operator:0.14.1
  • Kubernetes version: v1.23.7
  • OS: Linux

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 18

Most upvoted comments

@tibuntu thanks for your update, I’ll make sure docs are updated accordingly

That was a good idea it seems 😃

  1. Deleted the current trivy-operator release via ArgoCD
  2. Removed all CRDs
customresourcedefinition.apiextensions.k8s.io "vulnerabilityreports.aquasecurity.github.io" deleted
customresourcedefinition.apiextensions.k8s.io "exposedsecretreports.aquasecurity.github.io" deleted
customresourcedefinition.apiextensions.k8s.io "configauditreports.aquasecurity.github.io" deleted
customresourcedefinition.apiextensions.k8s.io "clusterconfigauditreports.aquasecurity.github.io" deleted
customresourcedefinition.apiextensions.k8s.io "rbacassessmentreports.aquasecurity.github.io" deleted
customresourcedefinition.apiextensions.k8s.io "infraassessmentreports.aquasecurity.github.io" deleted
customresourcedefinition.apiextensions.k8s.io "clusterrbacassessmentreports.aquasecurity.github.io" deleted
customresourcedefinition.apiextensions.k8s.io "clustercompliancereports.aquasecurity.github.io" deleted
customresourcedefinition.apiextensions.k8s.io "clusterinfraassessmentreports.aquasecurity.github.io" deleted
Error from server (NotFound): customresourcedefinitions.apiextensions.k8s.io "clusterconfigauditreports.aquasecurity.github.io" not found
  3. Installed the operator manually via Helm
helm install trivy-operator --namespace security aqua/trivy-operator --version 0.14.1

Now several node-collector-* pods showed up, e.g.: node-collector-8b6cf59b6-zx46z
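
A quick way to confirm the collectors are actually running (a sketch; they may land in whatever namespace the operator schedules them in):

# List collector pods wherever they were scheduled
kubectl get pods -A | grep node-collector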

I see lots of infraAssessmentReports:

> kubectl get infraassessmentreport -A
NAMESPACE     NAME                                                         SCANNER   AGE
kube-system   cronjob-backup-etcd                                          Trivy     91s
kube-system   pod-kube-apiserver-k8s-cp-01.<redact>            Trivy     105s
kube-system   pod-kube-apiserver-k8s-cp-02.<redact>            Trivy     2m14s
kube-system   pod-kube-apiserver-k8s-cp-03.<redact>            Trivy     2m6s
kube-system   pod-kube-controller-manager-k8s-cp-01.<redact>   Trivy     95s
kube-system   pod-kube-controller-manager-k8s-cp-02.<redact>   Trivy     101s
kube-system   pod-kube-controller-manager-k8s-cp-03.<redact>   Trivy     2m20s
kube-system   pod-kube-scheduler-k8s-cp-01.<redact>            Trivy     91s
kube-system   pod-kube-scheduler-k8s-cp-02.<redact>            Trivy     115s
kube-system   pod-kube-scheduler-k8s-cp-03.<redact>            Trivy     110s
kube-system   service-kube-prometheus-stack-kube-scheduler                 Trivy     95s

Same for clusterinfraassessmentreports:

> k get clusterinfraassessmentreports.aquasecurity.github.io -A
NAME                                      SCANNER   AGE
node-k8s-cp-01.<redact>       Trivy     86s
node-k8s-cp-02.<redact>       Trivy     2m33s
node-k8s-cp-03.<redact>       Trivy     2m9s
node-k8s-worker-01.<redact>   Trivy     102s
node-k8s-worker-02.<redact>   Trivy     2m27s
node-k8s-worker-03.<redact>   Trivy     119s

And the best thing is, the CIS scan now has proper results:

status:
  summary:
    failCount: 36
    passCount: 80

I am going to check whether there is any difference in the trivy-operator Deployment / ConfigMaps / Secrets that explains the different behaviour.
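
A simple way to capture that state for diffing (a sketch; the object names are assumptions based on the chart defaults):

# Dump the operator's config from the working install for comparison
kubectl -n security get deployment trivy-operator -o yaml > deploy.yaml
kubectl -n security get configmaps -o yaml > configmaps.yaml
kubectl -n security get secrets -o yaml > secrets.yaml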

@yanehi can this issue be closed?

Ok so here is the final result.

I kept deleting the Helm release via ArgoCD (always deleting the CRDs as well) and then re-deployed with variously adapted values.

Got it working now with the following values:

resources:
  limits:
    cpu: 500m
    memory: 1Gi
excludeNamespaces: "kube-system,kube-public,kube-node-lease,ingress-nginx"
trivy:
  ignoreUnfixed: true
  serverURL: https://some-url
  mode: ClientServer
operator:
  configAuditScannerEnabled: true
  vulnerabilityScannerReportTTL: "86400s"
  vulnerabilityScannerScanOnlyCurrentRevisions: true
  rbacAssessmentScannerEnabled: false
  infraAssessmentScannerEnabled: true
  exposedSecretScannerEnabled: false
compliance:
  reportType: all
  cron: "0 12 * * 1"
serviceMonitor:
  enabled: true

Setting configAuditScannerEnabled: true was the key factor here.
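
To verify that the flag actually reached the operator, the corresponding environment variable can be checked on the deployment. A sketch, assuming the variable follows the OPERATOR_* naming pattern visible later in this thread:

# Show the env var (name and value) on the rendered deployment
kubectl -n security get deployment trivy-operator -o yaml \
  | grep -A1 OPERATOR_CONFIG_AUDIT_SCANNER_ENABLED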

I am unsure whether the current documentation reflects that requirement for proper clustercompliancereports.

Anyways, big thanks for your help and fast replies!

For some reason it simply doesn’t exist:

> kubectl get clusterinfraassessmentreport -o wide
No resources found
> kubectl get clusterinfraassessmentreport
No resources found

Same for the infraassessmentreport:

> kubectl get infraassessmentreport -A
No resources found

You should have something there; that explains why the compliance report does not show infra data.

btw: are you running on a managed cluster (cloud) or on-prem?

I suggest doing another simple test. Perform the following to clean up the environment and start fresh:

  1. Uninstall trivy-operator via Helm
  2. Delete all CRDs (individually as listed below, or with the one-liner after this list)
kubectl delete crd vulnerabilityreports.aquasecurity.github.io
kubectl delete crd exposedsecretreports.aquasecurity.github.io
kubectl delete crd configauditreports.aquasecurity.github.io
kubectl delete crd clusterconfigauditreports.aquasecurity.github.io
kubectl delete crd rbacassessmentreports.aquasecurity.github.io
kubectl delete crd infraassessmentreports.aquasecurity.github.io
kubectl delete crd clusterrbacassessmentreports.aquasecurity.github.io
kubectl delete crd clustercompliancereports.aquasecurity.github.io
kubectl delete crd clusterinfraassessmentreports.aquasecurity.github.io
  3. Install the operator with default values
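
As an alternative to deleting each CRD one by one, all trivy-operator CRDs share the aquasecurity.github.io group, so a one-liner can clean them up (a sketch):

# Delete every CRD in the aquasecurity.github.io group in one pass
kubectl get crd -o name | grep aquasecurity.github.io | xargs kubectl delete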

@chen-keinan

This is the summary of the kube-bench job:

== Summary policies ==
0 checks PASS
0 checks FAIL
30 checks WARN
0 checks INFO

== Summary total ==
21 checks PASS
4 checks FAIL
35 checks WARN
0 checks INFO

One of the failing checks is:

[FAIL] 2.2 Ensure that the --client-cert-auth argument is set to true (Automated)

The same check passes in the report generated by trivy-operator:

    - id: "2.2"
      name: Ensure that the --client-cert-auth argument is set to true
      severity: CRITICAL
      totalFail: 0

This is the CIS scan summary from the trivy-operator:

status:
  summary:
    passCount: 116

If I am not mistaken, we had infraAssessmentScannerEnabled set to true at some point, but it made no difference.

Edit:

I checked the environment variables set in the trivy-operator deployment.

        - name: OPERATOR_INFRA_ASSESSMENT_SCANNER_ENABLED
          value: "true"

IIRC I saw another issue in this repo where someone stated that setting the Helm value to false has no effect on the env vars set on the deployment.
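
If so, comparing the values Helm actually rendered against the env vars on the deployment should expose the drift. A sketch, assuming the security namespace and release name used above:

# What Helm thinks was applied vs. what the container actually gets
helm -n security get values trivy-operator
kubectl -n security get deployment trivy-operator \
  -o jsonpath='{.spec.template.spec.containers[0].env}'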