falco: Runtime error: Socket handler (k8s_pod_handler_state): read more than 30MB of data

What happened: We observed the following error in the Falco logs during a large-scale load test.

Wed Aug 21 02:21:40 2019: Runtime error: Socket handler (k8s_pod_handler_state): read more than 30MB of data from https://10.40.48.1/api/v1/pods?fieldSelector=status.phase!=Failed,status.phase!=Unknown,status.phase!=Succeeded&pretty=false (31463287 bytes, 2162 reads). Giving up. Exiting.
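The failure mode in this log line can be sketched as a reader that keeps a running byte count and gives up once a fixed budget is exceeded. This is only an illustration of the behaviour the message describes, not Falco's actual implementation; the names `read_with_limit` and `ReadLimitExceeded` and the chunk size are hypothetical.

```python
import io


class ReadLimitExceeded(RuntimeError):
    """Raised when a response grows past the configured byte budget."""


def read_with_limit(stream, max_bytes, chunk_size=16 * 1024):
    """Read `stream` to EOF, giving up once more than `max_bytes` have arrived.

    Hypothetical helper: it mirrors the "read more than N of data ... Giving up"
    behaviour in the log above, nothing more.
    """
    chunks = []
    total = 0
    reads = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # EOF: the whole response fit within the budget
            break
        reads += 1
        total += len(chunk)
        if total > max_bytes:
            raise ReadLimitExceeded(
                f"read more than {max_bytes} bytes ({total} bytes, {reads} reads)"
            )
        chunks.append(chunk)
    return b"".join(chunks)


# A small response is returned whole; a larger one would raise ReadLimitExceeded.
body = read_with_limit(io.BytesIO(b"x" * 1000), max_bytes=2048)
```

With a hard-coded budget, the only outcomes are "fits" or "abort", which is why a pod listing that grows with cluster size eventually hits the second case.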

What you expected to happen: Falco should work in a large scale Kubernetes cluster.

How to reproduce it (as minimally and precisely as possible): Run Falco on a large Kubernetes cluster with more than 1k nodes and more than 10k pods.

Anything else we need to know?:

Environment:

  • Falco version (use falco --version): falco version 0.1.2736dev

  • System info (Falco has a built-in support command you can use: "falco --support | jq .system_info"):
{
  "machine": "x86_64",
  "nodename": "falco-hgq5k",
  "release": "4.15.0-1034-gke",
  "sysname": "Linux",
  "version": "#36-Ubuntu SMP Thu Jun 6 11:16:45 UTC 2019"
}
Linux falco-hgq5k 4.15.0-1034-gke #36-Ubuntu SMP Thu Jun 6 11:16:45 UTC 2019 x86_64 GNU/Linux
  • Install tools (e.g. in kubernetes, rpm, deb, from source): Kubernetes

  • Others:

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 4
  • Comments: 38 (13 by maintainers)

Most upvoted comments

A fix is in the works. See https://github.com/falcosecurity/libs/pull/40

In the meantime, I’m going to prepare a PR here on Falco to let you configure the limits.

So, are these two patches already merged?

Both have been included in libs since commit https://github.com/falcosecurity/libs/commit/f7029e2522cc4c81841817abeeeaa515ed944b6c

On the Falco side, there are two PRs, not yet merged, that integrate these patches:

The first (#1667) includes only https://github.com/falcosecurity/libs/pull/40 and lets the user configure the limit. The second (#1671) includes both patches, since it uses the latest libs commit. So if you try #1671, you will get both the limit raised to 100MB (the new default value) and the node filtering.
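To illustrate why node filtering shrinks the response, here is a minimal sketch that rebuilds the pod-list URL from the error message and optionally narrows it to one node. `spec.nodeName` is a field selector the Kubernetes API supports for pods; whether the libs patch builds its request exactly this way is an assumption, and `pod_list_url` and the node name are hypothetical.

```python
from urllib.parse import urlencode


def pod_list_url(api_server, node_name=None):
    """Build the pod-list URL, optionally restricted to a single node.

    Hypothetical helper for illustration; not taken from Falco or libs.
    """
    selectors = [
        "status.phase!=Failed",
        "status.phase!=Unknown",
        "status.phase!=Succeeded",
    ]
    if node_name:
        # Assumption: filtering is expressed as an extra field selector.
        selectors.append(f"spec.nodeName={node_name}")
    query = urlencode(
        {"fieldSelector": ",".join(selectors), "pretty": "false"},
        safe="!=,",  # keep the selector readable, as in the logged URL
    )
    return f"{api_server}/api/v1/pods?{query}"


cluster_wide = pod_list_url("https://10.40.48.1")
one_node = pod_list_url("https://10.40.48.1", node_name="example-node")
```

The cluster-wide URL returns every matching pod to every Falco instance, so the payload grows with the cluster. The per-node variant only returns pods scheduled on that node, so its size depends on pod density per node instead.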

PS: I apologize for all these complications and delays, but it was necessary to attack the problem from different angles, which required more PRs. If you need any support testing these patches, feel free to contact me on Slack. I will be happy to help.

I run a decently sized k8s cluster, around 2600 pods. When I installed Falco and its DaemonSet pods came online for the first time, they all began reporting this socket error due to the fixed 30MB limit. I’m unable to proceed any further with Falco and I’m forced to consider alternatives such as Wazuh.

I feel like I’m in Bizarro World, when I see a project like Falco incubated by the CNCF, marketed as the best choice for k8s security analysis, and yet there’s a hard-coded 30MB socket limit which gets breached by a relatively small cluster of 2600 pods?!

This tells me Falco hasn’t been proven in any sort of decently-sized cluster. Unless there’s some other variable here that I’m missing, in which case I would love for a maintainer or community contributor to politely whack me upside the head and tell me why I’m wrong.

I’ll bring this up for discussion in the community call. This needs a proposal to find a solution. @lorenzo-biava @djsly you’re more than welcome to join to share your findings.

Another update: I’m working on https://github.com/falcosecurity/libs/pull/49, which should definitively address this issue (at least, I hope 😸 ) in the way described by https://github.com/falcosecurity/libs/issues/43.

@leodido Hello, I updated the information to follow the issue template format.