bpftrace: Print warning when kretprobes fail to fire
Sometimes kretprobes do not fire, depending on how many threads are currently “inside” the traced function and on the maxactive setting. The kernel tracks how many firings are missed. It would be helpful if bpftrace printed some sort of warning when these events are dropped.
https://www.kernel.org/doc/Documentation/kprobes.txt :
While the probed function is executing, its return address is stored in an object of type kretprobe_instance. Before calling register_kretprobe(), the user sets the maxactive field of the kretprobe struct to specify how many instances of the specified function can be probed simultaneously. register_kretprobe() pre-allocates the indicated number of kretprobe_instance objects.
For example, if the function is non-recursive and is called with a spinlock held, maxactive = 1 should be enough. If the function is non-recursive and can never relinquish the CPU (e.g., via a semaphore or preemption), NR_CPUS should be enough. If maxactive <= 0, it is set to a default value. If CONFIG_PREEMPT is enabled, the default is max(10, 2*NR_CPUS). Otherwise, the default is NR_CPUS.
It’s not a disaster if you set maxactive too low; you’ll just miss some probes. In the kretprobe struct, the nmissed field is set to zero when the return probe is registered, and is incremented every time the probed function is entered but there is no kretprobe_instance object available for establishing the return probe.
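The quoted rules can be sketched as a small model. This is a toy illustration in Python, not kernel code: `default_maxactive` and `KretprobeModel` are hypothetical names, and the model assumes every caller is inside the probed function at the same time.

```python
def default_maxactive(nr_cpus: int, preempt: bool) -> int:
    """Default used when maxactive <= 0, per the quoted kprobes.txt text."""
    return max(10, 2 * nr_cpus) if preempt else nr_cpus


class KretprobeModel:
    """Toy model of the pre-allocated kretprobe_instance pool (hypothetical)."""

    def __init__(self, maxactive: int, nr_cpus: int, preempt: bool):
        if maxactive <= 0:
            maxactive = default_maxactive(nr_cpus, preempt)
        self.free = maxactive  # instances still available
        self.nmissed = 0       # set to zero at registration time
        self.fired = 0         # return probes that actually ran

    def enter(self) -> bool:
        """Function entry: claim an instance, or count a miss."""
        if self.free == 0:
            self.nmissed += 1
            return False
        self.free -= 1
        return True

    def leave(self, armed: bool) -> None:
        """Function return: the probe fires only if an instance was claimed."""
        if armed:
            self.fired += 1
            self.free += 1


# Assumed scenario: 4 CPUs, no CONFIG_PREEMPT, 16 threads inside the function.
model = KretprobeModel(maxactive=0, nr_cpus=4, preempt=False)
armed = [model.enter() for _ in range(16)]
for was_armed in armed:
    model.leave(was_armed)
print(model.fired, model.nmissed)  # prints "4 12"
```

In this model, four times as many concurrent callers as instances means only a quarter of the returns are probed, and the rest show up in nmissed.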
About this issue
- State: open
- Created 5 years ago
- Reactions: 1
- Comments: 21 (7 by maintainers)
Commits related to this issue
- perf/kprobe: maxactive for fd-based kprobe Enable specifying maxactive for fd based kretprobe. This will be useful for tracing tools like bcc and bpftrace (see for example discussion [1]). Use highes... — committed to intel-lab-lkp/linux by erthalion 2 years ago
- perf/kprobe: maxactive for fd-based kprobe Enable specifying maxactive for fd based kretprobe. This will be useful for tracing tools like bcc and bpftrace (see for example discussion [1]). Use highes... — committed to intel-lab-lkp/linux by liu-song-6 2 years ago
- perf/kprobe: maxactive for fd-based kprobe Enable specifying maxactive for fd based kretprobe. This will be useful for tracing tools like bcc and bpftrace (see for example discussion [1]). Use highes... — committed to kernel-patches/bpf by liu-song-6 2 years ago
- perf/kprobe: maxactive for fd-based kprobe Enable specifying maxactive for fd based kretprobe. This will be useful for tracing tools like bcc and bpftrace (see for example discussion [1]). Use highes... — committed to kernel-patches/bpf-rc by liu-song-6 2 years ago
Example output today. The workload is random reads from 4x as many threads as there are CPUs. The kretprobe fires 1/4 as many times as the kprobe.
I’d like for bpftrace to additionally output something like a warning about the missed kretprobe firings.
Ah sorry, notification must have fallen into the void.
I sent a couple of versions but then lost track of what PeterZ was talking about. Unfortunately, I ran out of time to figure it out. I believe this is the most up to date version here: https://github.com/danobi/linux/commits/perf_read_format_missed if anyone has time to pick it up.
I’ve posted the second attempt: https://lore.kernel.org/bpf/20190917133056.5545-1-dxu@dxuuu.xyz/T/
We have 31 bits left in perf_event_attr.config, so that’s not a problem.
The change should be straightforward. I will get something for test soon.
Relatedly, is there a way to specify maxactive? If not, that would be a nice feature to have. It’s important to set this correctly if you are using bpftrace to measure performance of heavily multi-threaded workloads.
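For context, the kernel's tracefs kprobe_events interface (not bpftrace itself) accepts a maxactive value as part of a return-probe definition, in the form r[MAXACTIVE][:EVENT] SYM, and the tracefs kprobe_profile file lists hits and miss-hits per event. The sketch below only builds and parses those strings. The helper names, the event name `myret`, the symbol `vfs_read`, and the sample profile text are all made up for illustration, and whether bpftrace exposes any of this is not settled in this thread.

```python
def kretprobe_event_line(symbol: str, event: str, maxactive: int = 0) -> str:
    """Build a kprobe_events definition for a return probe.

    A maxactive of 0 (or less) omits the number, leaving the kernel default.
    """
    prefix = f"r{maxactive}" if maxactive > 0 else "r"
    return f"{prefix}:{event} {symbol}"


def parse_kprobe_profile(text: str) -> dict:
    """Parse kprobe_profile text: event name, probe hits, probe miss-hits."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        name, hits, missed = fields
        stats[name] = (int(hits), int(missed))
    return stats


# Hypothetical event name, symbol, and profile contents, for illustration only.
line = kretprobe_event_line("vfs_read", "myret", maxactive=64)
sample_profile = "  myret    1000    250\n"
for name, (hits, missed) in parse_kprobe_profile(sample_profile).items():
    if missed:
        print(f"warning: {name} missed {missed} firings ({hits} hits)")
```

Writing the generated line to kprobe_events requires root and a mounted tracefs, so that step is left out here.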