bpftrace: Can't use subprograms from multiple probes

The BPF verifier requires that all defined subprograms are used by the main program.

Our current approach involves appending all subprograms into .text, whether they are used by a given probe or not. This means that if a script contains multiple probes, each with their own subprograms, the verifier will reject it.

What reproduces the bug?

BEGIN
{
  @map[0] = 1;
  print(len(@map))
}
END
{
  print(len(@map))
}
Attaching 2 probes...

Error log: 
number of funcs in func_info doesn't match number of subprogs
processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

ERROR: Error loading program: BEGIN

The easy fix for len() would be to simply merge duplicate subprograms into a single len() subprogram. However, we will also need to implement a proper fix for #2955.
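The merge idea can be sketched with a toy model. This is only an illustration: the data layout, the function name `merge_duplicate_subprogs`, and the placeholder bodies are all hypothetical and do not reflect bpftrace's actual codegen.

```python
# Toy model: each probe lists the subprograms it emitted as (name, body) pairs.
# The names and bodies are hypothetical placeholders, not real bpftrace output.

def merge_duplicate_subprogs(per_probe_subprogs):
    """Keep one copy of each distinct subprogram.

    Returns (merged, refs): merged maps a subprogram name to its single body,
    refs maps each probe to the names of the subprograms it calls.
    """
    merged = {}
    refs = {}
    for probe, subprogs in per_probe_subprogs.items():
        refs[probe] = []
        for name, body in subprogs:
            # setdefault stores the body the first time and returns the stored one after that.
            existing = merged.setdefault(name, body)
            if existing != body:
                raise ValueError(f"subprogram {name!r} has conflicting bodies")
            refs[probe].append(name)
    return merged, refs


per_probe = {
    "BEGIN": [("len", "<body of len helper>")],
    "END": [("len", "<body of len helper>")],
}
merged, refs = merge_duplicate_subprogs(per_probe)
```

In this model both probes end up referencing the same single `len` entry. A probe that calls no subprogram at all would still see `len` in the shared .text, which is presumably why merging alone is not the proper fix.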

libbpf encountered this same problem previously: https://lore.kernel.org/netdev/5f4f311d37ba8_5f3982088@john-XPS-13-9370.notmuch/T/

About this issue

  • State: open
  • Created 4 months ago
  • Comments: 18 (16 by maintainers)

Most upvoted comments

@viktormalik Yes, let me give it a shot this week. I wanna get better at this part of the codebase. I’ll call for help as needed 📞

if you are using libbpf to load ELF file

This is currently WIP, but it will take some time to land, so we wanted to work around this issue in the meantime.

Can someone summarize what’s being done by bpftrace vs libbpf, so we can pin down where things go wrong?

Sure, the current (simplified) bpftrace workflow is the following:

  1. Generate ELF in which probes go into custom sections and subprograms go into .text.
  2. Use bpf_object__open_mem for the ELF. This is currently used only to discover maps in the program.
  3. For each probe, take the corresponding custom section and the entire .text and pass it to bpf_prog_load. This is where the error from the issue description happens as .text may contain subprogs for other probes, too.

The plan is to replace point 3 by calling bpf_object__load, which should resolve the problem, but as I mentioned, it’s not a straightforward thing to do and will take some time. In the meantime, my idea was to store subprogs in per-probe sections (e.g. .text.<probename>) and transform them into .text for each probe in point 3, but bpf_object__open_mem doesn’t seem to like such sections.
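The per-probe section idea can be sketched as a selection step over a dictionary of sections. This is a toy model under stated assumptions: the section names (`s_BEGIN`, `s_END`), the byte contents, and the helper `sections_for_probe` are hypothetical, and the sketch does not address the bpf_object__open_mem problem mentioned above.

```python
# Toy model of an ELF as {section name: bytes}. All names and contents are
# hypothetical; real section names come from bpftrace's codegen.

def sections_for_probe(elf_sections, probe_section):
    """Select what would be loaded for one probe.

    Returns the probe's own section plus its private subprogram section,
    exposed under the plain ".text" name. Subprograms belonging to other
    probes are left out.
    """
    selected = {probe_section: elf_sections[probe_section]}
    private_text = elf_sections.get(".text." + probe_section)
    if private_text is not None:
        selected[".text"] = private_text
    return selected


elf = {
    "s_BEGIN": b"probe-begin",
    ".text.s_BEGIN": b"len-for-begin",
    "s_END": b"probe-end",
    ".text.s_END": b"len-for-end",
}
begin_image = sections_for_probe(elf, "s_BEGIN")
```

Here `begin_image` contains only the BEGIN probe and its own subprograms, so nothing defined for END reaches the verifier when BEGIN is loaded.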

Is there anything I can do to help unblock libbpf-based loading?

Not at the moment. It’s a lot of work since we need to significantly rework the way we do probe expansion. I believe I’m finally on the right track so some preparation PRs should appear next week. Then, your reviewing help will be much appreciated 😃

Damn, yeah, I forgot that we’re loading the created ELF with libbpf now: https://github.com/bpftrace/bpftrace/blob/master/src/bpfbytecode.cpp#L15

This may be quite tricky to do. I’m wondering if it is possible to tell libbpf to ignore certain sections. Maybe we could mark them as not containing instructions? This is still a temporary workaround until full libbpf-based loading lands.

e.g. .text.<probename> and then transform the relevant section into .text for the given probe in BpfProgram.

QQ: should it be the probe ID? I think you can have multiple probes with the same name (with different predicates), right?

I can try to take a look at fixing this issue if folks point me in the right direction 😃

I’d use the section name for the given probe (with the .text prefix), have a look at the get_section_name_for_probe function.

The BPF verifier requires that all defined subprograms are used by the main program.

Our current approach involves appending all subprograms into .text, whether they are used by a given probe or not. This means that if a script contains multiple probes, each with their own subprograms, the verifier will reject it.

Ah, so we would need to copy just the required symbols, not the entire .text. That sounds tricky as there may be relative offsets which would have to be fixed.
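To illustrate why the offsets matter, here is a toy sketch of copying only the reachable subprograms and patching call offsets. It assumes a simplified encoding in which each instruction is an `(opcode, imm)` tuple and a call's `imm` is relative to the next instruction (target index = call index + 1 + imm). The function `extract_subprogs`, the symbol table layout, and the example program are all hypothetical, not bpftrace code.

```python
# Toy model: .text is a list of (opcode, imm) tuples, symbols maps a function
# name to (start index, length). Hypothetical encoding for illustration only.

def extract_subprogs(text, symbols, roots):
    """Copy only the functions reachable from roots and fix call offsets."""
    start_to_name = {start: name for name, (start, _) in symbols.items()}

    def callees(name):
        start, length = symbols[name]
        for idx in range(start, start + length):
            op, imm = text[idx]
            if op == "call":
                yield start_to_name[idx + 1 + imm]

    # Collect reachable functions in discovery order.
    order, stack = [], list(reversed(roots))
    while stack:
        name = stack.pop()
        if name in order:
            continue
        order.append(name)
        stack.extend(callees(name))

    # Lay the kept functions out contiguously.
    new_symbols, cursor = {}, 0
    for name in order:
        length = symbols[name][1]
        new_symbols[name] = (cursor, length)
        cursor += length

    # Copy instructions, recomputing each call's relative offset.
    new_text = []
    for name in order:
        old_start, length = symbols[name]
        new_start = new_symbols[name][0]
        for off in range(length):
            op, imm = text[old_start + off]
            if op == "call":
                target = start_to_name[old_start + off + 1 + imm]
                imm = new_symbols[target][0] - (new_start + off + 1)
            new_text.append((op, imm))
    return new_text, new_symbols


text = [
    ("op", 0), ("exit", 0),    # unused_a
    ("call", 3), ("exit", 0),  # len_outer, calls len_inner at index 6
    ("op", 0), ("exit", 0),    # unused_b
    ("op", 0), ("exit", 0),    # len_inner
]
symbols = {"unused_a": (0, 2), "len_outer": (2, 2),
           "unused_b": (4, 2), "len_inner": (6, 2)}
new_text, new_symbols = extract_subprogs(text, symbols, ["len_outer"])
```

After extraction the call in `len_outer` has to change from offset 3 to offset 1, because `len_inner` now sits directly after it. Real programs would also need their relocation entries and BTF func_info kept in sync, which this sketch ignores.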

@viktormalik How much of the BpfProgram logic will remain when we complete the libbpf migration? Do you think libbpf will be able to solve this issue for us?

Yes, it will solve the issue completely, as all of the relocation code will go away. I’ll put a higher priority on that work.