cilium: CI: K8sVerifier Runs the kernel verifier against Cilium's BPF datapath: libbpf: Error in bpf_object__probe_loading():Operation not permitted(1)

Test Name

K8sVerifier Runs the kernel verifier against Cilium's BPF datapath

Failure Output

FAIL: Failed to load BPF program bpf_lxc with datapath configuration:
-DSKIP_DEBUG=1 -DENABLE_IPV4=1 -DENABLE_IPV6=1 -DENABLE_HOST_SERVICES_TCP=1 -DENABLE_HOST_SERVICES_UDP=1 -DENABLE_HOST_REDIRECT=1 -DENABLE_ROUTING=1 -DNO_REDIRECT=1 -DPOLICY_VERDICT_NOTIFY=1 -DALLOW_ICMP_FRAG_NEEDED=1 -DENABLE_IDENTITY_MARK=1 -DMONITOR_AGGREGATION=3 -DCT_REPORT_FLAGS=0x0002 -DENABLE_HOST_FIREWALL=1 -DENABLE_ICMP_RULE=1 -DENABLE_CUSTOM_CALLS=1 -DENABLE_IPSEC=1 -DIP_POOLS=1 -DENCAP_IFINDEX=1 -DTUNNEL_MODE=1

Stack Trace

Expected command: kubectl exec -n default test-verifier -- env TC_PROGS="" XDP_PROGS="" CG_PROGS="" TC_PROGS="bpf_lxc" ./test/bpf/verifier-test.sh 
To succeed, but it failed:
Exitcode: 1 
Err: exit status 1

Standard Output

=> Loading bpf_lxc.c:from-container...

Standard Error

[..]
	 libbpf: prog '__send_drop_notify': unrecognized ELF section name '2/1'
	 libbpf: prog 'tail_icmp6_send_echo_reply': unrecognized ELF section name '2/3'
	 libbpf: prog 'tail_icmp6_send_time_exceeded': unrecognized ELF section name '2/5'
	 libbpf: prog 'tail_icmp6_handle_ns': unrecognized ELF section name '2/4'
	 libbpf: prog 'tail_handle_ipv6_cont': unrecognized ELF section name '2/26'
	 libbpf: prog 'tail_ipv6_ct_egress': unrecognized ELF section name '2/32'
	 libbpf: prog 'tail_handle_ipv6': unrecognized ELF section name '2/10'
	 libbpf: prog 'tail_handle_ipv4_cont': unrecognized ELF section name '2/25'
	 libbpf: prog 'tail_ipv4_ct_egress': unrecognized ELF section name '2/29'
	 libbpf: prog 'tail_handle_ipv4': unrecognized ELF section name '2/7'
	 libbpf: prog 'tail_handle_arp': unrecognized ELF section name '2/6'
	 libbpf: prog 'handle_xgress': unrecognized ELF section name 'from-container'
	 libbpf: prog 'tail_ipv6_policy': unrecognized ELF section name '2/12'
	 libbpf: prog 'tail_ipv6_to_endpoint': unrecognized ELF section name '2/14'
	 libbpf: prog 'tail_ipv6_ct_ingress_policy_only': unrecognized ELF section name '2/31'
	 libbpf: prog 'tail_ipv6_ct_ingress': unrecognized ELF section name '2/30'
	 libbpf: prog 'tail_ipv4_policy': unrecognized ELF section name '2/11'
	 libbpf: prog 'tail_ipv4_to_endpoint': unrecognized ELF section name '2/13'
	 libbpf: prog 'tail_ipv4_ct_ingress_policy_only': unrecognized ELF section name '2/28'
	 libbpf: prog 'tail_ipv4_ct_ingress': unrecognized ELF section name '2/27'
	 libbpf: prog 'handle_policy': unrecognized ELF section name '1/0xffff'
	 libbpf: prog 'handle_to_container': unrecognized ELF section name 'to-container'
	 libbpf: Error in bpf_object__probe_loading():Operation not permitted(1). Couldn't load trivial BPF program. Make sure your kernel supports BPF (CONFIG_BPF_SYSCALL=y) and/or that RLIMIT_MEMLOCK is set to big enough value.
	 libbpf: failed to load object './test/bpf/../../bpf/bpf_lxc.o'
	 Unable to load program
	 command terminated with exit code 1

Resources

Anything else?

Only observed on the 4.9 kernel so far.

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 23 (23 by maintainers)

Most upvoted comments

Got it!!

This is indeed an underflow of the rlimit counter. Adding printk()s to the kernel, I can observe four program loads, each charged for 6 pages but then uncharged for 7. After uncharging those, the rlimit counter (user->locked_vm) is at 18446744073709551612 instead of 0.
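To make the arithmetic concrete, here is a minimal C sketch of the accounting mismatch. The `struct user_acct` type and the `charge`, `uncharge` and `simulate` helpers are hypothetical stand-ins and not kernel code; the counter is modelled as a 64-bit unsigned value, which is an assumption consistent with the number printed above.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-user counter (user->locked_vm). */
struct user_acct {
	uint64_t locked_vm; /* pages currently charged to the user */
};

static void charge(struct user_acct *u, uint64_t pages)
{
	u->locked_vm += pages;
}

static void uncharge(struct user_acct *u, uint64_t pages)
{
	/* No check that pages <= locked_vm: unsigned arithmetic wraps. */
	u->locked_vm -= pages;
}

/*
 * Load `loads` programs that are charged for `pages_at_charge` pages each,
 * then unload them after the page count has changed to `pages_at_uncharge`.
 * Returns the final value of the counter.
 */
static uint64_t simulate(unsigned int loads, uint64_t pages_at_charge,
			 uint64_t pages_at_uncharge)
{
	struct user_acct u = { 0 };
	unsigned int i;

	for (i = 0; i < loads; i++)
		charge(&u, pages_at_charge);
	for (i = 0; i < loads; i++)
		uncharge(&u, pages_at_uncharge);
	return u.locked_vm;
}
```

With the observed values, `simulate(4, 6, 7)` wraps around to 18446744073709551612 (that is, 2^64 - 4), whereas `simulate(4, 6, 6)` returns 0.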

Looking at the kernel code, prog->pages (the value changing from 6 to 7) can indeed be modified if the BPF program is reallocated. This happens for example if we add new instructions and they don’t fit on the last page used by the program. Looking further, this has been fixed (hi Daniel!) on newer versions:

bpf: fix overflow in prog accounting

Commit aaac3ba95e4c ("bpf: charge user for creation of BPF maps and
programs") made a wrong assumption of charging against prog->pages.
Unlike map->pages, prog->pages are still subject to change when we
need to expand the program through bpf_prog_realloc(). [...]

It is likely that the change in the compilation options from https://github.com/cilium/cilium/pull/19938/commits brought the size of a program to just under a multiple of PAGE_SIZE, so that it goes over the threshold when the verifier adds the prologue or patches the context accesses.
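The threshold effect can be sketched in a few lines of C. This is an illustration only: the 4096-byte page size, the `header_bytes` parameter and the instruction counts in the usage note are assumptions chosen for the example, not values measured on bpf_lxc. The one fixed quantity is the 8-byte size of a BPF instruction.

```c
#include <stddef.h>

#define ASSUMED_PAGE_SIZE 4096u /* assumption: 4 KiB pages */
#define BPF_INSN_BYTES    8u    /* a BPF instruction is 8 bytes */

/*
 * Number of pages needed for a program made of `header_bytes` of
 * bookkeeping (hypothetical parameter) plus `insn_count` instructions,
 * rounded up to whole pages.
 */
static size_t pages_for(size_t header_bytes, size_t insn_count)
{
	size_t bytes = header_bytes + insn_count * BPF_INSN_BYTES;

	return (bytes + ASSUMED_PAGE_SIZE - 1) / ASSUMED_PAGE_SIZE;
}
```

For instance, ignoring the header, `pages_for(0, 3070)` is 6, and only three extra instructions (`pages_for(0, 3073)`) are enough to bring it to 7, which is the kind of 6-to-7 jump described above.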

Now for the bad news: Daniel’s fix is in 4.10, but apparently it was never backported to 4.9. I suppose the cleanest way to fix this would be to send a backport, and to update the image to 4.9.y after the patch has been merged.

Greg took the patch, it’s currently in the queue for 4.9.

Edit 2022-08-17: Made it to the queue/4.9 of linux-stable-rc. Should be in v4.9.326.

It seems that the rlimit bump is not happening.

I expected either libbpf or tc to do it directly. It turns out that the rlimit bump in libbpf is more recent than I thought, and our libbpf fork doesn’t have it. iproute2’s tc does have an rlimit bump, but only in lib/bpf_legacy.c.

So what probably happened is that at some point our iproute2 version switched to libbpf and stopped raising the rlimit itself, but the libbpf version it uses is ~4 months too old to have libbpf’s rlimit bump. So probably no component raises the rlimit, and sometimes the default limit is not enough.
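For reference, an rlimit bump of this kind is a single `setrlimit()` call made before the first BPF load. The sketch below is a minimal version, not the code from libbpf or iproute2; the function name `bump_memlock_rlimit` is hypothetical, and lifting the hard limit normally requires privileges (CAP_SYS_RESOURCE), so the call can fail with EPERM.

```c
#include <errno.h>
#include <sys/resource.h>

/*
 * Try to remove the RLIMIT_MEMLOCK cap for the current process.
 * Returns 0 on success, or a negative errno value on failure
 * (typically -EPERM when the process may not raise its hard limit).
 */
static int bump_memlock_rlimit(void)
{
	struct rlimit unlimited = { RLIM_INFINITY, RLIM_INFINITY };

	if (setrlimit(RLIMIT_MEMLOCK, &unlimited) != 0)
		return -errno;
	return 0;
}
```

A loader would call `bump_memlock_rlimit()` once at startup and report the error if it returns a negative value.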

The failure seems limited to 4.9

I’m not aware of any difference in rlimit handling between kernel versions (other than the switch to cgroup-based accounting, obviously), although we could imagine that the delay for reclaiming the memory when programs/maps are unloaded is slightly longer on old kernels due to implementation details. It looks like a race anyway, since it doesn’t trigger all the time.

Some potential workarounds:

  • Raise the limit with ulimit -l in the terminal before launching the tests?
  • Upgrade the libbpf fork, then the iproute2 image