tracee: Initial BPFMaps population freezes tracee-ebpf
While developing a fix for https://github.com/aquasecurity/tracee/issues/862 and loading the embedded CO-RE eBPF object, I noticed that the logic worked only intermittently.
From time to time tracee-ebpf ended up in a state where it could not receive (or display) any events:
BTF enabled, attempting to unpack CORE bpf object
unpacked CO:RE bpf object file into memory
TIME UTS_NAME CONTAINER_ID UID COMM PID/host TID/host RET EVENT ARGS
<nothing>
and nothing happened.
The logic being added to PopulateMap is:
// Initialize pid_to_cont_id_map if tracing containers
c := Containers{}
err := c.Populate()
if err != nil {
	return err
}
bpfPidToContIdMap, _ := t.bpfModule.GetMap("pid_to_cont_id_map")
for _, contId := range c.GetContainers() {
	for _, pidstr := range c.GetPids(contId) {
		if t.config.Debug {
			fmt.Println("Running container =", contId, "pid =", pidstr)
		}
		var pid uint32
		_, err = fmt.Sscanf(pidstr, "%d", &pid)
		err = bpfPidToContIdMap.Update(pid, []byte(contId))
		if err != nil {
			return err
		}
	}
}
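One detail in the snippet above: the error returned by fmt.Sscanf is overwritten by the Update call before it is checked, so a malformed pid string would go unnoticed. The following is a minimal sketch of the same loop with every error checked. It is not the tracee or libbpfgo code: the pidToContIDMap interface, the fakeMap type and the populate function are hypothetical names introduced here so the example runs without a kernel, and nothing in this sketch is claimed to explain or fix the freeze.

```go
package main

import (
	"fmt"
	"strconv"
)

// pidToContIDMap is a hypothetical stand-in for the BPF map handle.
// It is NOT the libbpfgo API; it only mirrors the Update call used above.
type pidToContIDMap interface {
	Update(pid uint32, contID []byte) error
}

// fakeMap is an in-memory implementation so the sketch runs without a kernel.
type fakeMap struct {
	entries map[uint32]string
}

// Update stores the container ID under the given pid.
func (m *fakeMap) Update(pid uint32, contID []byte) error {
	m.entries[pid] = string(contID)
	return nil
}

// populate parses each pid string and stores pid -> container ID.
// It returns the first parse or update error instead of discarding it.
func populate(m pidToContIDMap, contID string, pids []string) error {
	for _, pidStr := range pids {
		pid, err := strconv.ParseUint(pidStr, 10, 32)
		if err != nil {
			return fmt.Errorf("parsing pid %q for container %s: %w", pidStr, contID, err)
		}
		if err := m.Update(uint32(pid), []byte(contID)); err != nil {
			return fmt.Errorf("updating map for pid %d: %w", pid, err)
		}
	}
	return nil
}

func main() {
	m := &fakeMap{entries: make(map[uint32]string)}
	if err := populate(m, "0a829b3bc00d", []string{"1721925", "1726069"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("entries:", len(m.entries))
}
```

With the interface in place, the in-memory map can be swapped for the real map handle through a thin adapter, and a bad pid string surfaces as an error rather than being silently written as pid 0.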
Initially I thought the problem was in my logic, but the debug output always showed that the container ID and pid slices were correct:
$ sudo ./dist/tracee-ebpf --debug --trace container --trace event=execve
BTF enabled, attempting to unpack CORE bpf object
unpacked CO:RE bpf object file into memory
Running container = 0a829b3bc00d7f1b393d070c0f3e5d1929a186df1b41b4cd2bf95525f495aa55 pid = 1721925
Running container = 0a829b3bc00d7f1b393d070c0f3e5d1929a186df1b41b4cd2bf95525f495aa55 pid = 1726069
...
and there were no errors adding the pids to the map:
err = bpfPidToContIdMap.Update(pid, []byte(contId))
I’m also able to reproduce this behavior using the versioned eBPF object file:
$ sudo TRACEE_BPF_FILE="$(pwd)/dist/tracee.bpf.5_11_0-24-generic.v0_6_0-11-g49503a2.o" ./dist/tracee-ebpf --debug --trace container --trace event=execve
BPF object file specified by TRACEE_BPF_FILE found: /home/rafaeldtinoco/work/sources/ebpf/aquasec-tracee/tracee-ebpf/dist/tracee.bpf.5_11_0-24-generic.v0_6_0-11-g49503a2.o
Running container = 0a829b3bc00d7f1b393d070c0f3e5d1929a186df1b41b4cd2bf95525f495aa55 pid = 1721925
Running container = 0a829b3bc00d7f1b393d070c0f3e5d1929a186df1b41b4cd2bf95525f495aa55 pid = 1726069
Running container = 9521c39d6d3d54f5c5f4760c2e3dbde4fdcfd4fba99c2d46c49a4edd63864ae3 pid = 1236570
Running container = 9521c39d6d3d54f5c5f4760c2e3dbde4fdcfd4fba99c2d46c49a4edd63864ae3 pid = 1236611
...
TIME UTS_NAME CONTAINER_ID UID COMM PID/host TID/host RET EVENT ARGS
<nothing>
I’m running Ubuntu Hirsute: 5.11.0-24-generic #25-Ubuntu with BTF enabled.
About this issue
- State: closed
- Created 3 years ago
- Comments: 18
Commits related to this issue
- tracee-ebpf: update to libbpfgo v0.2.0-libbpf_0.4.0 Fixes: #894 The fix for #862 ("containerid doesn't work for existing containers") exposed the issue #894 ("Initial BPFMaps population freezes trac... — committed to rafaeldtinoco/tracee by rafaeldtinoco 3 years ago
OK, so we had better hold off on merging #864 until we figure this out.
Also, here is an interesting thing to note:
Which means that libbpf populates a kconfig map when BTF is enabled. This might be related to what we are investigating in #851
When you reproduce this freeze, I suggest that you do the following: