libs: Unexpected number of processors (on musl build)

On a system with one or more processors disabled, Falco 0.33.1 fails to detect and enumerate the correct number of online processors if any processor other than the last one is disabled. This causes Falco to exit with a fatal error.

How to reproduce it

  1. Stop Falco.
  2. On a system with multiple processors, disable any processor except the last. For example, on a 4-processor system, disable cpu1 or cpu2; on an 8-processor system, disable any one of cpu1 through cpu6. Disable the processor with a command such as echo 0 > /sys/devices/system/cpu/cpu3/online.
  3. Start Falco. Observe that it fails to start with the message Error: processors online: 6, expected: 7.
  4. Stop Falco. Repeat with another CPU.
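
To confirm what the kernel itself reports after step 2, the CPU list in /sys/devices/system/cpu/online can be read and counted. The following is a minimal sketch, not part of Falco or libs; the helper names count_cpus_in_list and count_cpus_in_file are hypothetical, and it assumes the usual sysfs list format of comma-separated numbers and ranges (for example 0-2,4-7).

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: count the CPUs named in a kernel CPU-list string
 * such as "0-2,4-7\n". Returns -1 if the string is malformed. */
int count_cpus_in_list(const char *list)
{
	int total = 0;
	const char *p = list;

	while (*p != '\0' && *p != '\n') {
		char *end;
		long first, last;

		if (!isdigit((unsigned char)*p))
			return -1;
		first = strtol(p, &end, 10);
		last = first;
		p = end;

		/* An optional "-N" turns a single CPU into a range. */
		if (*p == '-') {
			p++;
			if (!isdigit((unsigned char)*p))
				return -1;
			last = strtol(p, &end, 10);
			p = end;
		}
		if (last < first)
			return -1;
		total += (int)(last - first + 1);

		if (*p == ',')
			p++;
		else if (*p != '\0' && *p != '\n')
			return -1;
	}
	return total;
}

/* Hypothetical helper: read the first line of a sysfs CPU-list file
 * (e.g. /sys/devices/system/cpu/online) and count its CPUs.
 * Returns -1 if the file cannot be read or parsed. */
int count_cpus_in_file(const char *path)
{
	char buf[4096];
	FILE *f = fopen(path, "r");
	int count = -1;

	if (f == NULL)
		return -1;
	if (fgets(buf, sizeof(buf), f) != NULL)
		count = count_cpus_in_list(buf);
	fclose(f);
	return count;
}
```

Calling count_cpus_in_file("/sys/devices/system/cpu/online") before and after disabling a processor should show the count drop by one, independently of what the C library reports.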

Expected behaviour

Libs successfully detects and enumerates all processors and Falco starts successfully.

Screenshots

N/A

Environment

  • Falco version:

Falco version: 0.33.1
Libs version: 0.9.2
Plugin API: 2.0.0
Driver:
  API version: 2.0.0
  Schema version: 2.0.0
  Default driver: 3.0.1+driver

  • System info:

Wed Feb 1 14:45:50 2023: Falco version: 0.33.1 (x86_64)
Wed Feb 1 14:45:50 2023: Falco initialized with configuration file: /etc/falco/falco.yaml
Wed Feb 1 14:45:50 2023: Loading rules from file /etc/falco/rules.d/common.yaml
{
  "machine": "x86_64",
  "nodename": "pod-170255",
  "release": "5.4.0-1086-gcp",
  "sysname": "Linux",
  "version": "#94~18.04.1-Ubuntu SMP Fri Aug 5 18:26:39 UTC 2022"
}

  • Cloud provider or hardware configuration: Google Compute Engine
  • Kernel: Linux pod-170255 5.4.0-1086-gcp #94~18.04.1-Ubuntu SMP Fri Aug 5 18:26:39 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  • Installation method: DEB (custom)

About this issue

  • State: closed
  • Created a year ago
  • Comments: 45 (32 by maintainers)

Most upvoted comments

Agree with Leo! Thank you very much, Ian. You dug into this relentlessly and provided lots of information to everyone else! Great job!

I agree that, first of all, we have to document this. Even if it’s an edge case, I would write a note in the official documentation.

Also, I want to thank you all for the awesome job in debugging this issue. It’s really impressive.

I’ve got a path forward now. I’ll leave it to you to determine what, if anything, you want to do here. My suggestion is that, at a minimum, a known issue is documented for this so that anyone else who runs into it in the future doesn’t waste their time. It’s an edge case to be sure, but one we know about now.

I fully agree. /cc @leogr any idea?

OK, so this is wild, check this out.

I found this article about a potential bug in musl where the code path for _SC_NPROCESSORS_CONF actually pulls the data for _SC_NPROCESSORS_ONLN instead. http://www.landley.net/notes-2022.html#26-07-2022

I created this sample code to test:

#include <stdlib.h>
#include <stdio.h>
#include <sys/sysinfo.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	printf("get_nprocs_conf(): %d\nget_nprocs(): %d\nsysconf(_SC_NPROCESSORS_CONF): %ld\nsysconf(_SC_NPROCESSORS_ONLN): %ld\n",
	get_nprocs_conf(), get_nprocs(), sysconf(_SC_NPROCESSORS_CONF), sysconf(_SC_NPROCESSORS_ONLN));
	exit(EXIT_SUCCESS);
}

When I use gcc to compile and run, I get this:

get_nprocs_conf(): 8
get_nprocs(): 6
sysconf(_SC_NPROCESSORS_CONF): 8
sysconf(_SC_NPROCESSORS_ONLN): 6

When I use musl-gcc to compile and run, I get this:

get_nprocs_conf(): 6
get_nprocs(): 6
sysconf(_SC_NPROCESSORS_CONF): 6
sysconf(_SC_NPROCESSORS_ONLN): 6

You can see the issue here, where the case for JT_NPROCESSORS_CONF simply flows into the case for JT_NPROCESSORS_ONLN: https://git.musl-libc.org/cgit/musl/tree/src/conf/sysconf.c#n202

Are you using musl? If so, then I think the solution here would be to gather this information by looking at /sys/devices/system/cpu/cpuX.
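
As a rough illustration of that suggestion, here is a minimal sketch that counts the cpuN directories under /sys/devices/system/cpu instead of asking the C library. It is not the fix that went into libs; the helper names is_cpu_entry and count_cpu_dirs are hypothetical, and it assumes a processor that has been taken offline still keeps its cpuN directory in sysfs.

```c
#include <ctype.h>
#include <dirent.h>
#include <string.h>

/* Hypothetical helper: return 1 if name looks like "cpu<digits>"
 * (e.g. "cpu3"), else 0. This skips sibling entries such as
 * "cpufreq" and "cpuidle". */
int is_cpu_entry(const char *name)
{
	const char *p;

	if (strncmp(name, "cpu", 3) != 0 || name[3] == '\0')
		return 0;
	for (p = name + 3; *p != '\0'; p++) {
		if (!isdigit((unsigned char)*p))
			return 0;
	}
	return 1;
}

/* Hypothetical helper: count cpuN entries in the given directory,
 * normally /sys/devices/system/cpu. Returns -1 if the directory
 * cannot be opened. */
int count_cpu_dirs(const char *sysfs_cpu_dir)
{
	DIR *dir = opendir(sysfs_cpu_dir);
	struct dirent *entry;
	int count = 0;

	if (dir == NULL)
		return -1;
	while ((entry = readdir(dir)) != NULL) {
		if (is_cpu_entry(entry->d_name))
			count++;
	}
	closedir(dir);
	return count;
}
```

Because this reads sysfs directly, count_cpu_dirs("/sys/devices/system/cpu") does not go through sysconf() at all, so it should give the same answer whether the program is built against glibc or musl.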

That #721 was rolled into 0.33.1. Or at least it was supposed to have been. https://falco.org/blog/falco-0-33-1/

I haven’t tried it with kmod, just bpf. I’m using only bpf in my environment.