falco: Falco 0.30.0 High CPU Usage when Meeting Tons of System Calls in Kubernetes

Describe the bug

Hi,

I am running the Falco 0.30.0 image in k8s with the eBPF probe enabled (built by falco-driver-loader). The Falco main process shows 90~100% CPU usage on a single core, and events are being dropped. The server also runs a video-processing workload.

I have checked the configurations against #1403; only the gRPC output is enabled.
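
For reference, here is a rough falco.yaml sketch of that output setup (the Unix-socket bind address and the exact list of other outputs being disabled are assumptions, not an exact copy of my config):

# falco.yaml (sketch): gRPC server and gRPC output on, other outputs off
grpc:
  enabled: true
  bind_address: "unix:///var/run/falco/falco.sock"   # assumed socket path
grpc_output:
  enabled: true
stdout_output:
  enabled: false
file_output:
  enabled: false
syslog_output:
  enabled: false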

I tried to collect statistics with -s usage.log --stats-interval 1000:

{"sample": 15, "cur": {"events": 35376590, "drops": 43392815, "preemptions": 0}, "delta": {"events": 1995998, "drops": 2494822, "preemptions": 0}, "drop_pct": 124.991},
{"sample": 16, "cur": {"events": 37481723, "drops": 45293543, "preemptions": 0}, "delta": {"events": 2105133, "drops": 1900728, "preemptions": 0}, "drop_pct": 90.2902},
{"sample": 17, "cur": {"events": 39556166, "drops": 46829396, "preemptions": 0}, "delta": {"events": 2074443, "drops": 1535853, "preemptions": 0}, "drop_pct": 74.0369},
{"sample": 18, "cur": {"events": 41569944, "drops": 49346024, "preemptions": 0}, "delta": {"events": 2013778, "drops": 2516628, "preemptions": 0}, "drop_pct": 124.97},
{"sample": 19, "cur": {"events": 43607914, "drops": 51849986, "preemptions": 0}, "delta": {"events": 2037970, "drops": 2503962, "preemptions": 0}, "drop_pct": 122.865},
{"sample": 20, "cur": {"events": 45649297, "drops": 54128675, "preemptions": 0}, "delta": {"events": 2041383, "drops": 2278689, "preemptions": 0}, "drop_pct": 111.625},
{"sample": 21, "cur": {"events": 47966987, "drops": 56916074, "preemptions": 0}, "delta": {"events": 2317690, "drops": 2787399, "preemptions": 0}, "drop_pct": 120.266},
{"sample": 22, "cur": {"events": 50015244, "drops": 58743901, "preemptions": 0}, "delta": {"events": 2048257, "drops": 1827827, "preemptions": 0}, "drop_pct": 89.2382},
{"sample": 23, "cur": {"events": 52102132, "drops": 61160832, "preemptions": 0}, "delta": {"events": 2086888, "drops": 2416931, "preemptions": 0}, "drop_pct": 115.815},
{"sample": 24, "cur": {"events": 54161066, "drops": 64385477, "preemptions": 0}, "delta": {"events": 2058934, "drops": 3224645, "preemptions": 0}, "drop_pct": 156.617},

Any suggestions on how to further investigate this performance issue?

If Falco is configured correctly and running normally here, can other workloads on the node cause such high CPU usage? The drop percentage looks quite abnormal.

Resource:

...
limits:
  memory: "1024Mi"
  cpu: "1000m"
requests:
  memory: "512Mi"
  cpu: "100m"
...

Environment

  • Falco version: 0.30.0, Driver version: 3aa7a83bf7b9e6229a3824e3fd1f4452d1e95cb4
  • System info: { "machine": "x86_64", "nodename": "falco-pzrxf", "release": "5.4.0-80.90.2.el7.x86_64", "sysname": "Linux", "version": "#1 SMP Fri Aug 20 09:35:43 UTC 2021" }
  • Cloud provider or hardware configuration:
  • HOST OS: NAME="CentOS Linux" VERSION="7 (Core)" ID="centos" ID_LIKE="rhel fedora" VERSION_ID="7" PRETTY_NAME="CentOS Linux 7 (Core)" ANSI_COLOR="0;31" CPE_NAME="cpe:/o:centos:centos:7" HOME_URL="https://www.centos.org/" BUG_REPORT_URL="https://bugs.centos.org/" CENTOS_MANTISBT_PROJECT="CentOS-7" CENTOS_MANTISBT_PROJECT_VERSION="7" REDHAT_SUPPORT_PRODUCT="centos" REDHAT_SUPPORT_PRODUCT_VERSION="7"

  • Kernel: Linux localdomain 5.4.0-80.90.2.el7.x86_64 #1 SMP Fri Aug 20 09:35:43 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
  • Installation method: K8s 1.20.7 on CentOS

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Reactions: 2
  • Comments: 22 (9 by maintainers)

Most upvoted comments

Ok, thank you. As said here: https://github.com/falcosecurity/falco/issues/1403#issuecomment-1370826670

Basic k8s context info, such as the k8s.ns.name and k8s.pod.name fields included by default with -pk, doesn't even strictly require k8s API server communication, to my knowledge. That info is optimized to be collected from the local container runtime running the pod, and so should be available even when -k is not enabled. This lets many users have "some" k8s context info while avoiding the extra performance stress of the API server communication.

If you don't need extra information from the k8s enrichment and only need k8s.ns.name and k8s.pod.name, you could also disable the -k option; that way the drops should decrease, and so should the CPU usage. A sketch of that change is below.
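
A minimal sketch of what that looks like in the Falco DaemonSet container args (the manifest layout and image tag are assumptions; -pk and -k are the CLI flags discussed above):

# DaemonSet container spec (sketch): keep -pk, drop -k
containers:
  - name: falco
    image: falcosecurity/falco:0.30.0
    args:
      - /usr/bin/falco
      - -pk             # k8s.ns.name / k8s.pod.name still populated from the local container runtime
      # no "-k https://<apiserver>" argument: the k8s API server enrichment stays disabled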

/remove-lifecycle rotten

Any updates?