falco: Falco 0.30.0 High CPU Usage when Meeting Tons of System Calls in Kubernetes

Describe the bug

Hi,

I am running the Falco 0.30.0 image in k8s with the eBPF probe enabled (built by falco-driver-loader). The Falco main process shows 90~100% CPU usage on a single core, and events are being dropped. The server also runs a video-processing workload.

I have checked the configurations against #1403; only the gRPC output is enabled.
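
For reference, here is a rough falco.yaml sketch of that output setup (the Unix-socket bind address and the exact list of other outputs being disabled are assumptions, not an exact copy of my config):

# falco.yaml (sketch): gRPC server and gRPC output on, other outputs off
grpc:
  enabled: true
  bind_address: "unix:///var/run/falco/falco.sock"   # assumed socket path
grpc_output:
  enabled: true
stdout_output:
  enabled: false
file_output:
  enabled: false
syslog_output:
  enabled: false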

I tried to collect statistics with -s usage.log --stats-interval 1000:

{"sample": 15, "cur": {"events": 35376590, "drops": 43392815, "preemptions": 0}, "delta": {"events": 1995998, "drops": 2494822, "preemptions": 0}, "drop_pct": 124.991},
{"sample": 16, "cur": {"events": 37481723, "drops": 45293543, "preemptions": 0}, "delta": {"events": 2105133, "drops": 1900728, "preemptions": 0}, "drop_pct": 90.2902},
{"sample": 17, "cur": {"events": 39556166, "drops": 46829396, "preemptions": 0}, "delta": {"events": 2074443, "drops": 1535853, "preemptions": 0}, "drop_pct": 74.0369},
{"sample": 18, "cur": {"events": 41569944, "drops": 49346024, "preemptions": 0}, "delta": {"events": 2013778, "drops": 2516628, "preemptions": 0}, "drop_pct": 124.97},
{"sample": 19, "cur": {"events": 43607914, "drops": 51849986, "preemptions": 0}, "delta": {"events": 2037970, "drops": 2503962, "preemptions": 0}, "drop_pct": 122.865},
{"sample": 20, "cur": {"events": 45649297, "drops": 54128675, "preemptions": 0}, "delta": {"events": 2041383, "drops": 2278689, "preemptions": 0}, "drop_pct": 111.625},
{"sample": 21, "cur": {"events": 47966987, "drops": 56916074, "preemptions": 0}, "delta": {"events": 2317690, "drops": 2787399, "preemptions": 0}, "drop_pct": 120.266},
{"sample": 22, "cur": {"events": 50015244, "drops": 58743901, "preemptions": 0}, "delta": {"events": 2048257, "drops": 1827827, "preemptions": 0}, "drop_pct": 89.2382},
{"sample": 23, "cur": {"events": 52102132, "drops": 61160832, "preemptions": 0}, "delta": {"events": 2086888, "drops": 2416931, "preemptions": 0}, "drop_pct": 115.815},
{"sample": 24, "cur": {"events": 54161066, "drops": 64385477, "preemptions": 0}, "delta": {"events": 2058934, "drops": 3224645, "preemptions": 0}, "drop_pct": 156.617},

Any suggestions on how to further investigate this performance issue?

If Falco is configured correctly and running normally here, can other workloads on the node cause such high CPU usage? The drop percentage looks quite abnormal.

Resource:

...
limits:
  memory: "1024Mi"
  cpu: "1000m"
requests:
  memory: "512Mi"
  cpu: "100m"
...

Environment

  • Falco version: 0.30.0, Driver version: 3aa7a83bf7b9e6229a3824e3fd1f4452d1e95cb4
  • System info: { "machine": "x86_64", "nodename": "falco-pzrxf", "release": "5.4.0-80.90.2.el7.x86_64", "sysname": "Linux", "version": "#1 SMP Fri Aug 20 09:35:43 UTC 2021" }
  • Cloud provider or hardware configuration:
  • HOST OS: NAME="CentOS Linux" VERSION="7 (Core)" ID="centos" ID_LIKE="rhel fedora" VERSION_ID="7" PRETTY_NAME="CentOS Linux 7 (Core)" ANSI_COLOR="0;31" CPE_NAME="cpe:/o:centos:centos:7" HOME_URL="https://www.centos.org/" BUG_REPORT_URL="https://bugs.centos.org/" CENTOS_MANTISBT_PROJECT="CentOS-7" CENTOS_MANTISBT_PROJECT_VERSION="7" REDHAT_SUPPORT_PRODUCT="centos" REDHAT_SUPPORT_PRODUCT_VERSION="7"

  • Kernel: Linux localdomain 5.4.0-80.90.2.el7.x86_64 #1 SMP Fri Aug 20 09:35:43 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
  • Installation method: K8s 1.20.7 on CentOS

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Reactions: 2
  • Comments: 22 (9 by maintainers)

Most upvoted comments

Ok, thank you. As said here: https://github.com/falcosecurity/falco/issues/1403#issuecomment-1370826670

Basic k8s context info, such as the k8s.ns.name and k8s.pod.name fields included by default with -pk, doesn't even strictly require k8s API server communication, to my knowledge. That info is optimized to be collected from the local container runtime running the pod, and so should be available even when -k is not enabled. This lets many users have "some" k8s context info while avoiding the extra performance stress of the API server communication.

If you don't need extra information from the k8s enrichment and only need k8s.ns.name and k8s.pod.name, you could also disable the -k option; that way the drops should decrease, and so should the CPU usage. A sketch of that change is below.
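
A minimal sketch of what that looks like in the Falco DaemonSet container args (the manifest layout and image tag are assumptions; -pk and -k are the CLI flags discussed above):

# DaemonSet container spec (sketch): keep -pk, drop -k
containers:
  - name: falco
    image: falcosecurity/falco:0.30.0
    args:
      - /usr/bin/falco
      - -pk             # k8s.ns.name / k8s.pod.name still populated from the local container runtime
      # no "-k https://<apiserver>" argument: the k8s API server enrichment stays disabled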

/remove-lifecycle rotten

Any updates?