firecracker: Cannot obtain Guest OS call graph
To profile the whole processing pipeline from the host to the guest and back for my toy TCP server, I am trying to obtain the call graph of the guest and host from the host to generate a per-syscall/kernel-function-call/guest-kernel-function-call breakdown of the latency per packet. I would like to get the following info (copied from here):
38770.00 20.4% __ticket_spin_lock [guest.kernel.kallsyms]
22560.00 11.9% ftrace_likely_update [kernel.kallsyms]
9208.00 4.8% __lock_acquire [kernel.kallsyms]
However, perf produces no samples in perf.data for the guest OS.
sudo perf kvm --guest --guestmount=/tmp/guestmount record -a sleep 10
sudo perf kvm --guest --guestmount=/tmp/guestmount record -a -e cpu-clock:GH -g -o perf.data
^C[ perf record: Woken up 10 times to write data ]
[ perf record: Captured and wrote 3.720 MB perf.data (25457 samples) ]
sudo perf kvm --guest --guestmount=/tmp/guestmount report -i perf.data --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 25K of event 'cpu-clock:GH'
# Event count (approx.): 6364250000
# Children Self Command Shared Object Symbol
# ........ ........ ....... ............. ......
#
I recompiled the guest kernel with all the appropriate parameters. I am able to get the stack for the host (no guest kallsyms can be found in the output):
sudo perf kvm --host --guest --guestmount=/tmp/guestmount record -a -F 99 -g -o perf.data
sudo perf kvm --host --guest --guestmount=/tmp/guestmount report -i perf.data --stdio
# Samples: 40K of event 'cycles:ppp'
# Event count (approx.): 20557903667
#
# Overhead Command Shared Object Symbol
# ........ ............... ....................... ..........................................................................................................
#
38.37% fc_vcpu0 [kernel.kallsyms] [k] vmx_vcpu_run
7.03% fc_vcpu0 [kernel.kallsyms] [k] kvm_lapic_find_highest_irr
5.21% fc_vcpu0 [kernel.kallsyms] [k] kvm_arch_vcpu_runnable
4.19% fc_vcpu0 [kernel.kallsyms] [k] __apic_update_ppr
4.06% fc_vcpu0 [kernel.kallsyms] [k] ktime_get
2.20% fc_vcpu0 [kernel.kallsyms] [k] kvm_vcpu_check_block
and from inside the guest:
perf record -F 99 -p 2936 -g sleep 10
# Samples: 218 of event 'cpu-clock'
# Event count (approx.): 2202020180
#
# Children Self Command Shared Object Symbol
# ........ ........ ............... ................... ..................................
#
43.58% 5.96% tcp_server ld-musl-x86_64.so.1 [.] __clock_gettime
|
|--40.37%--__clock_gettime
| |
| |--36.70%--__vdso_clock_gettime
| | |
| | |--15.60%--0x7ffc6e0d9991
But unfortunately not both host and guest call graph combined… I would really appreciate any advice or suggestions on how I could fix or troubleshoot this problem.
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 1
- Comments: 15 (15 by maintainers)
I could answer my own request 😃 I managed to collect guest records:
Here are the commands to reproduce:
I am going to continue playing with the output. If I find any problems, I will report it but I think now we can close the issue. Thank you for your help!