bpftrace: Print warning when kretprobes fail to fire

Sometimes kretprobes do not fire, depending on how many threads are currently “inside” the traced function and on the maxactive setting. The kernel tracks how many firings are missed. It would be helpful if bpftrace printed a warning when these events are dropped.

From https://www.kernel.org/doc/Documentation/kprobes.txt:

While the probed function is executing, its return address is stored in an object of type kretprobe_instance. Before calling register_kretprobe(), the user sets the maxactive field of the kretprobe struct to specify how many instances of the specified function can be probed simultaneously. register_kretprobe() pre-allocates the indicated number of kretprobe_instance objects.

For example, if the function is non-recursive and is called with a spinlock held, maxactive = 1 should be enough. If the function is non-recursive and can never relinquish the CPU (e.g., via a semaphore or preemption), NR_CPUS should be enough. If maxactive <= 0, it is set to a default value. If CONFIG_PREEMPT is enabled, the default is max(10, 2*NR_CPUS). Otherwise, the default is NR_CPUS.

It’s not a disaster if you set maxactive too low; you’ll just miss some probes. In the kretprobe struct, the nmissed field is set to zero when the return probe is registered, and is incremented every time the probed function is entered but there is no kretprobe_instance object available for establishing the return probe.
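The default rule quoted above can be sketched as a small shell function. This is only an illustration of the documented formula: the function name `default_maxactive` and its two parameters (a CPU count standing in for NR_CPUS, and a 0/1 flag standing in for CONFIG_PREEMPT) are assumptions made for this example, not a kernel or bpftrace interface.

```shell
# Sketch of the documented default: max(10, 2*NR_CPUS) with CONFIG_PREEMPT,
# otherwise NR_CPUS. Both arguments are supplied by the caller (hypothetical helper).
default_maxactive() {
    ncpus=$1      # stands in for NR_CPUS
    preempt=$2    # 1 if CONFIG_PREEMPT is enabled, 0 otherwise
    if [ "$preempt" -eq 1 ]; then
        doubled=$((2 * ncpus))
        if [ "$doubled" -gt 10 ]; then
            echo "$doubled"
        else
            echo 10
        fi
    else
        echo "$ncpus"
    fi
}

default_maxactive 4 1    # prints 10
default_maxactive 16 1   # prints 32
default_maxactive 16 0   # prints 16
```

With a default this small, a workload that keeps more threads inside the probed function than there are pre-allocated instances will miss return probes.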

About this issue

  • State: open
  • Created 5 years ago
  • Reactions: 1
  • Comments: 21 (7 by maintainers)

Most upvoted comments

Example output today. The workload is random reads from 4x as many threads as there are CPUs. The kretprobe fires about 1/4 as many times as the kprobe.

$ sudo bpftrace -e 'kretprobe:zpl_iter_read,kprobe:zpl_iter_read{@[probe]=count();}' -c 'sleep 10'
Attaching 2 probes...


@[kretprobe:zpl_iter_read]: 263207
@[kprobe:zpl_iter_read]: 1037071

I’d like for bpftrace to additionally output something like

kretprobe:zpl_iter_read failed to fire 750,000 times due to insufficient "maxactive".
The "maxactive" setting can be increased by <insert instructions here>.
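Until such a warning exists, the shortfall can be estimated by hand from the two counters in the example output. The sketch below is a hypothetical helper (`missed_warning` is not part of bpftrace). It assumes that every kprobe hit without a matching kretprobe hit was a missed return probe, which is only an approximation: calls still in flight when tracing starts or stops are counted too, and the kernel's own nmissed counter is not consulted.

```shell
# Estimate missed kretprobe firings from entry/return counts and print a
# warning in roughly the requested format (hypothetical helper).
missed_warning() {
    probe=$1      # probe name to report
    entries=$2    # kprobe count
    returns=$3    # kretprobe count
    missed=$((entries - returns))
    if [ "$missed" -gt 0 ]; then
        printf '%s failed to fire %d times (%d%% of calls), possibly due to insufficient "maxactive".\n' \
            "$probe" "$missed" $((missed * 100 / entries))
    fi
}

# Counts taken from the example output above.
missed_warning kretprobe:zpl_iter_read 1037071 263207
```

For the counts above this reports 773864 missed firings, about 74% of calls.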

Ah sorry, notification must have fallen into the void.

I sent a couple of versions but then lost track of what PeterZ was talking about. Unfortunately, I ran out of time to figure it out. I believe the most up-to-date version is here: https://github.com/danobi/linux/commits/perf_read_format_missed if anyone has time to pick it up.

We have 31 bits left in perf_event_attr.config, so that’s not a problem.

The change should be straightforward. I will have something to test soon.

Relatedly, is there a way to specify maxactive? If not, that would be a nice feature to have. It’s important to set this correctly if you are using bpftrace to measure performance of heavily multi-threaded workloads.
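Outside of bpftrace, the kernel's tracefs kprobe event interface accepts an explicit maxactive in the probe definition, using the form r[MAXACTIVE][:[GRP/][EVENT]] SYM. The sketch below only builds such a definition string. The helper name `kretprobe_def`, the event name `zpl_ret`, and the value 1024 are illustrative assumptions, and whether bpftrace itself exposes this setting is not established in this thread.

```shell
# Build a tracefs kretprobe definition with an explicit maxactive
# (hypothetical helper; event name and value are examples only).
kretprobe_def() {
    maxactive=$1   # number of kretprobe_instance objects to pre-allocate
    event=$2       # name for the new trace event
    symbol=$3      # function to probe
    printf 'r%s:%s %s\n' "$maxactive" "$event" "$symbol"
}

kretprobe_def 1024 zpl_ret zpl_iter_read   # prints: r1024:zpl_ret zpl_iter_read

# Registering it needs root and a mounted tracefs. The path below is an
# assumption and may be /sys/kernel/debug/tracing on some systems:
#   kretprobe_def 1024 zpl_ret zpl_iter_read >> /sys/kernel/tracing/kprobe_events
#   cat /sys/kernel/tracing/kprobe_profile   # event name, hits, misses
```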