parca-agent: [BUG] fail to unwind stack trace, lose data in one thread
Describe the bug
Run a simple cpp demo, but failing to parse symbols and failing to get profiling info from one thread.
I am build the agent and parca server from the latest main branch.
To Reproduce
#include <iostream>
#include <thread>
#include <chrono>
#include <cstring>
#include <unistd.h>
#include <sys/epoll.h>
void epollf() {
int epollFd = epoll_create1(0);
if (epollFd == -1) {
std::cerr << "Failed to create epoll instance: " << std::strerror(errno) << std::endl;
return;
}
struct epoll_event event;
event.events = EPOLLIN;
while (true) {
struct epoll_event events[10];
int numEvents = epoll_wait(epollFd, events, 1000, 0);
if (numEvents == -1) {
std::cerr << "epoll_wait error: " << std::strerror(errno) << std::endl;
break;
}
}
close(epollFd);
}
int main() {
std::thread epollThread(epollf);
while (true) {
}
epollThread.join();
return 0;
}
clang++ -g -o jjjttt main.cpp -lpthread ./jjjttt
Expected behavior
Profiling with all symobls.
Screenshots or Profiles (using https://pprof.me)
Only one thread is proflied…
The other thread has
Logs
sudo cat /sys/kernel/debug/tracing/trace_pipe
jjjttt-2177205 [005] d.h.. 5107453.799553: bpf_trace_printk: ~about to check shards found=1
jjjttt-2177205 [005] d.h.. 5107453.799553: bpf_trace_printk: ~checking shards now
jjjttt-2177205 [005] d.h.. 5107453.799554: bpf_trace_printk: [info] found chunk
jjjttt-2177205 [005] d.h.. 5107453.799554: bpf_trace_printk: pid 2177205 tgid 2177205
jjjttt-2177205 [005] d.h.. 5107453.799555: bpf_trace_printk: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
jjjttt-2177205 [005] d.h.. 5107453.799555: bpf_trace_printk: traversing stack using .eh_frame information!!
jjjttt-2177205 [005] d.h.. 5107453.799556: bpf_trace_printk: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
jjjttt-2177205 [005] d.h.. 5107453.799556: bpf_trace_printk: [debug] Within unwinding machinery loop
jjjttt-2177205 [005] d.h.. 5107453.799557: bpf_trace_printk: ## frame: 0
jjjttt-2177205 [005] d.h.. 5107453.799557: bpf_trace_printk: current pc: 559071098394
jjjttt-2177205 [005] d.h.. 5107453.799557: bpf_trace_printk: current sp: 7fff9cabaa10
jjjttt-2177205 [005] d.h.. 5107453.799558: bpf_trace_printk: current bp: 7fff9cabaa20
jjjttt-2177205 [005] d.h.. 5107453.799558: bpf_trace_printk: ~about to check shards found=1
jjjttt-2177205 [005] d.h.. 5107453.799558: bpf_trace_printk: ~checking shards now
jjjttt-2177205 [005] d.h.. 5107453.799559: bpf_trace_printk: [info] found chunk
jjjttt-2177205 [005] d.h.. 5107453.799559: bpf_trace_printk: le offset: 559071097000
jjjttt-2177205 [005] d.h.. 5107453.799560: bpf_trace_printk: ========== left 74625 right 74883
jjjttt-2177205 [005] d.h.. 5107453.799560: bpf_trace_printk: .done
jjjttt-2177205 [005] d.h.. 5107453.799560: bpf_trace_printk: => table_index: 74651
jjjttt-2177205 [005] d.h.. 5107453.799560: bpf_trace_printk: => adjusted pc: 1394
jjjttt-2177205 [005] d.h.. 5107453.799561: bpf_trace_printk: cfa type: 1, offset: 16 (row pc: 1374)
jjjttt-2177205 [005] d.h.. 5107453.799563: bpf_trace_printk: [error] previous_rip should not be zero. This can mean that the read failed, ret=-14 while reading @ 7fff9cabaa28.
agent log no obvious error.
parca server log(some symbols are missing as well)
Has storage symbolization request failed" err=“fetching metadata: debuginfo metadata not found” and ErrUnknownPC
level=debug name=parca ts=2023-09-25T08:57:31.519863966Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=69389d485a9793dbe873f0ea2c93e02efaa9aa3d pc_range_start=0x28700 pc_range_end=0x1bc4c1
level=debug name=parca ts=2023-09-25T08:57:31.519933345Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=3146e5a5e66e1fd6ab59a39e486a9d246bc675c0 pc_range_start=0x2090 pc_range_end=0x2b375
level=debug name=parca ts=2023-09-25T08:57:31.519963794Z caller=symbolizer.go:299 component=symbolizer buildid=unknown msg="storage symbolization request failed" err="fetching metadata: debuginfo metadata not found"
level=debug name=parca ts=2023-09-25T08:57:31.520031807Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=e22ba7829a55a0dec2201a0b6dac7ba236118561 pc_range_start=0xe030 pc_range_end=0x1e19b
level=debug name=parca ts=2023-09-25T08:57:31.520112609Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=f90a926c28da963c0888cca1180ef28bb88f059b pc_range_start=0x2030 pc_range_end=0x2488
level=debug name=parca ts=2023-09-25T08:57:31.520184515Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=d7435f9680bf6c38f11856638bbbc8a2b755e054 pc_range_start=0x2030 pc_range_end=0x554d0
level=debug name=parca ts=2023-09-25T08:57:31.520219354Z caller=symbolizer.go:299 component=symbolizer buildid=unknown msg="storage symbolization request failed" err="fetching metadata: debuginfo metadata not found"
level=debug name=parca ts=2023-09-25T08:57:31.520289037Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=927b5ce7c32980646eaf7527a8b3d68c3da5ffea pc_range_start=0x10a0 pc_range_end=0x12d8
level=debug name=parca ts=2023-09-25T08:57:31.520358214Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=5ea13eaa57911db0 pc_range_start=0xafb0 pc_range_end=0x4c023
level=debug name=parca ts=2023-09-25T08:57:31.520386623Z caller=symbolizer.go:299 component=symbolizer buildid=unknown msg="storage symbolization request failed" err="fetching metadata: debuginfo metadata not found"
level=debug name=parca ts=2023-09-25T08:57:31.520453982Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=4b60568bc25641b07e93f84f98c77093f3b279f7 pc_range_start=0x6030 pc_range_end=0x1833e
level=debug name=parca ts=2023-09-25T08:57:31.520527563Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=7ab150049702ea494c7c9b681d86dc2d59c89ee1 pc_range_start=0xb030 pc_range_end=0x4ebcd
level=debug name=parca ts=2023-09-25T08:57:31.520600582Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=0665411b8b4fc787710290aacecac4b069bf89fe pc_range_start=0x4030 pc_range_end=0x4718
level=debug name=parca ts=2023-09-25T08:57:31.520681385Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=55447a37514c4a317439786251326b5f762d31392f6f6b7835704d635279724779346e4d6e78414a6a2f6d647a5247354a536e6e38616e6c31636c5954612f38496435624e72466c744770475332794967704c pc_range_start=0x46aaa0 pc_range_end=0xd53aa8
level=debug name=parca ts=2023-09-25T08:57:31.520760407Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=7bf66e3d4dd23edad286982c17e7df08d1ab6fb2 pc_range_start=0x10a0 pc_range_end=0x12e2
level=debug name=parca ts=2023-09-25T08:57:31.520839323Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=7d78d330de33842118d405b4afe4d58878d71b41 pc_range_start=0xb030 pc_range_end=0x73617
**level=debug name=parca ts=2023-09-25T08:57:31.520871855Z caller=symbolizer.go:299 component=symbolizer buildid=unknown msg="storage symbolization request failed" err="fetching metadata: debuginfo metadata not found"**
level=debug name=parca ts=2023-09-25T08:57:31.520939133Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=cc278076d1c0ecf82b5248dfffe7e2d43feafbb7 pc_range_start=0x7030 pc_range_end=0x7ee8
level=debug name=parca ts=2023-09-25T08:57:31.521048469Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=cf6dd909194b0a9dd65eb30f7ec27c4002d0329d pc_range_start=0x3f030 pc_range_end=0x1fdb34
level=debug name=parca ts=2023-09-25T08:57:31.521126984Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=3192c14e4fe2624ff4675ae5a5edfa8a599aa9db pc_range_start=0x2030 pc_range_end=0x2578
level=debug name=parca ts=2023-09-25T08:57:31.521195289Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=d11886324c0b0501c46b595089d667e6894943af pc_range_start=0x3030 pc_range_end=0xa8e7
level=debug name=parca ts=2023-09-25T08:57:31.521273608Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=6673306d4d734c383466637842373044627642442f5a4b76785f546b716f435f2d506957696b447a492f4270345f73557834666d616445414b796b64555f2f4e524d525f67667a4c365a554361577041657232 pc_range_start=0x401e40 pc_range_end=0x49b0a85
level=debug name=parca ts=2023-09-25T08:57:31.521462171Z caller=symbolizer.go:411 component=symbolizer msg="no locations to symbolize" build_id=229b7dc509053fe4df5e29e8629911f0c3bc66dd pc_range_start=0x28700 pc_range_end=0x1bc501
level=debug name=parca ts=2023-09-25T08:57:31.521507517Z caller=symbolizer.go:299 component=symbolizer buildid=036dde3b7b7d0159 msg="storage symbolization request failed" err="fetching metadata: debuginfo metadata not found"
level=debug name=parca ts=2023-09-25T08:57:31.521520533Z caller=symbolizer.go:289 component=symbolizer msg="mapping of location is empty, skipping"
level=debug name=parca ts=2023-09-25T08:57:31.521590103Z caller=symbolizer.go:516 component=symbolizer msg="symbolizing locations" build_id=e37fe1a879783838de78cbc8c80621fa685d58a2 count=2
level=debug name=parca ts=2023-09-25T08:57:31.522622451Z caller=symbolizer.go:590 component=symbolizer **msg="symbolized location" build_id=e37fe1a879783838de78cbc8c80621fa685d58a2 address=651472 lines_count=0 err="seek to PC: ErrUnknownPC" liner_type=*addr2line.DwarfLiner
level=debug name=parca ts=2023-09-25T08:57:31.523523449Z caller=symbolizer.go:590 component=symbolizer msg="symbolized location" build_id=e37fe1a879783838de78cbc8c80621fa685d58a2 address=654128 lines_count=0 err="seek to PC: ErrUnknownPC" liner_type=*addr2line.DwarfLiner**
level=debug name=parca ts=2023-09-25T08:57:31.52356687Z caller=symbolizer.go:191 component=symbolizer msg="symbolization loop completed"
level=debug name=parca ts=2023-09-25T08:57:37.245414226Z caller=badger.go:452 msg=GetOrCreateStacktraces stacktrace_keys_len=49
level=debug name=parca ts=2023-09-25T08:57:37.246795774Z caller=badger.go:58 msg="writeRequests called. Writing to value log"
level=debug name=parca ts=2023-09-25T08:57:37.246872034Z caller=badger.go:58 msg="Writing to memtable"
level=debug name=parca ts=2023-09-25T08:57:37.246937884Z caller=badger.go:58 msg="Sending updates to subscribers"
level=debug name=parca ts=2023-09-25T08:57:37.246969165Z caller=badger.go:58 msg="2 entries written"
level=debug name=parca ts=2023-09-25T08:57:37.266391517Z caller=server.go:314 msg="finished call" protocol=grpc grpc.component=server grpc.service=parca.profilestore.v1alpha1.ProfileStoreService grpc.method=WriteRaw grpc.method_type=unary peer.address=127.0.0.1:59764 grpc.start_time=2023-09-25T08:57:37Z grpc.request.deadline=2023-09-25T08:59:37Z grpc.code=OK grpc.time_ms=24.529
Software (please complete the following information):
- Parca Agent Version: [
parca-agent --version
] - Parca Server Version (if applicable): [
parca --version
]
Workload (please complete the following information):
- Runtime (if applicable):
- Compiler (if applicable):
Environment (please complete the following information):
- Linux Distribution: [
cat /etc/*-release
] DISTRIB_ID=Ubuntu DISTRIB_RELEASE=22.04 DISTRIB_CODENAME=jammy DISTRIB_DESCRIPTION=“Ubuntu 22.04.3 LTS” PRETTY_NAME=“Ubuntu 22.04.3 LTS” NAME=“Ubuntu” VERSION_ID=“22.04” VERSION=“22.04.3 LTS (Jammy Jellyfish)” VERSION_CODENAME=jammy ID=ubuntu ID_LIKE=debian HOME_URL=“https://www.ubuntu.com/” SUPPORT_URL=“https://help.ubuntu.com/” BUG_REPORT_URL=“https://bugs.launchpad.net/ubuntu/” PRIVACY_POLICY_URL=“https://www.ubuntu.com/legal/terms-and-policies/privacy-policy” UBUNTU_CODENAME=jammy - Linux Version: [
uname -r
] 5.15.0-78-generic - Arch: [
uname -m
] x86_64 - Kubernetes Version (if applicable): [
kubectl --version
] no - Container Runtime (if applicable): no
Additional context
About this issue
- Original URL
- State: open
- Created 9 months ago
- Comments: 17 (13 by maintainers)
I meet a similar problem in ubuntu 22.04 as well. I try to disable swap for a process with cgroup v2 like the following:
more symbols are resolved after disabling swap, but I can still see errors for the above process
@javierhonduco .
The latest update is this issue seems to be related to some security enhancements we patched on ubuntu. Still trying to identify the specific change.
update: with another environment, the symbol resolution looks fine here and no “rip should not be zero” for the simple example.
and the bookmark looks like this(without disabling swap):
environment:
it’s weird that the symbol resolution becomes as poor as it used to be with another a few runs…