inspektor-gadget: trace exec: `cannot create bpf perf link: permission denied` on RHEL 9.3
Description
The trace exec gadget fails on RHEL 9.3 with the default kernel:
$ uname -a
Linux localhost.localdomain 5.14.0-362.18.1.el9_3.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Jan 3 15:54:45 EST 2024 x86_64 x86_64 x86_64 GNU/Linux
Impact
Our customers cannot use our solution based on IG on RHEL 9 nodes.
Environment and steps to reproduce
- Set-up: download and install RHEL 9.3 server (https://developers.redhat.com/products/rhel/download - requires a FREE developer subscription).
- Task: try to run `sudo ig trace exec`
- Action(s): it is also possible to simply run exec.go
a. clone IG repo
b. compile trace/exec
c. run with `sudo`
- Error:
[matthias@localhost ~]$ git clone https://github.com/inspektor-gadget/inspektor-gadget.git
Cloning into 'inspektor-gadget'...
remote: Enumerating objects: 40924, done.
remote: Counting objects: 100% (1875/1875), done.
remote: Compressing objects: 100% (939/939), done.
remote: Total 40924 (delta 1130), reused 1467 (delta 881), pack-reused 39049
Receiving objects: 100% (40924/40924), 83.34 MiB | 15.11 MiB/s, done.
Resolving deltas: 100% (25637/25637), done.
[matthias@localhost ~]$ cd inspektor-gadget/examples/gadgets/basic/trace/exec/
[matthias@localhost exec]$ ls
exec.go README.md
[matthias@localhost exec]$ go build .
go: downloading github.com/cilium/ebpf v0.12.3
go: downloading golang.org/x/exp v0.0.0-20231108232855-2478ac86f678
go: downloading golang.org/x/text v0.14.0
go: downloading go.opentelemetry.io/otel v1.22.0
go: downloading golang.org/x/sys v0.16.0
go: downloading github.com/hashicorp/go-multierror v1.1.1
go: downloading github.com/sirupsen/logrus v1.9.3
go: downloading golang.org/x/term v0.16.0
go: downloading github.com/hashicorp/errwrap v1.1.0
go: downloading github.com/coreos/go-systemd/v22 v22.5.0
go: downloading github.com/godbus/dbus/v5 v5.1.0
go: downloading github.com/google/uuid v1.6.0
go: downloading github.com/spf13/cobra v1.8.0
go: downloading github.com/syndtr/gocapability v0.0.0-20200815063812-42c35b437635
go: downloading github.com/spf13/pflag v1.0.5
[matthias@localhost exec]$ sudo ./exec
[sudo] password for matthias:
error creating tracer: attaching exit tracepoint: cannot create bpf perf link: permission denied
Expected behavior
Trace execs without an error.
Additional information
Slack thread: https://kubernetes.slack.com/archives/CSYL75LF6/p1706873942198409
About this issue
- State: closed
- Created 5 months ago
- Reactions: 1
- Comments: 20 (18 by maintainers)
Commits related to this issue
- gadgets: Replace context structs for tracepoint/syscall programs Tracepoint syscall programs should use `struct syscall_trace_(enter|exit)` instead of `struct trace_event_raw_sys_(enter|exit)`. See... — committed to inspektor-gadget/inspektor-gadget by mauriciovasquezbernal 4 months ago
I found the root cause and opened a merge request on their gitlab: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3717
@mauriciovasquezbernal @eiffel-fl your summaries are correct.
9.3, being an odd minor release, has a short lifecycle and is reaching its end of life soon, so unfortunately we are not likely to backport the fix there.
Starting with 9.3, Red Hat no longer maintains a separate tree for rt kernels. That means rt patches are applied to the main tree, and both rt and non-rt kernels are built from the same sources, differing only in their configs.
Hi there,
I worked on this problem a while ago and to fix this properly some changes will be needed on inspektor-gadget’s side.
The upstream kernel fix can be found here: https://lore.kernel.org/lkml/20231005123413.GA488417@alecto.usersys.redhat.com/t/#u. This fix is included in rhel9.4, but inspektor-gadget would still fail even with newer kernels. When using syscall tracepoints in bpf programs, what the program is really getting is `struct syscall_trace_(enter|exit)`, which happened to have the same offset for the `args` member as `struct trace_event_raw_sys_(enter|exit)`, so either of these structs could be used. With `struct trace_entry` changed, this is no longer the case.
I’ve looked at the bpf programs inspektor-gadget uses, and it seems like all of them use `trace_event_raw_sys_*` structs, which will need to be replaced by their `syscall_trace_*` counterparts.
I’ve tested the `exec` example mentioned in the report with the following patch on rhel9.4 kernels and it works as expected.
I also think there is something off regarding the format of the tracepoint and the btf information:
Up to the `preempt_lazy_count` field everything is fine: both sources indicate its offset is 8 (64 bits). From the `id` / `__syscall_nr` field on, the offsets disagree: the tracepoint format reports `field:int __syscall_nr; offset:12; size:4; signed:1;` (12), while the btf reports `'id' type_id=32 bits_offset=128` (128/8=16).
Because of this, bpftrace is failing too (in this case, it reports wrong information instead of failing):
59 is the syscall id in the architecture I’m running.
So attaching the enter tracepoint works fine but attaching the exit tracepoint fails.
Does the exit tracepoint actually exist on your kernel?
I wonder if attaching on the exit tracepoint of another syscall would work or if it is specific to execve.
The following built-in gadgets use an exit tracepoint. Do they work on the RHEL 9.3 kernel?