bcc: memleak tool misreports a memory leak because of hardware interrupts

Hi all,

We are using the memleak tool and have found a problem in an environment with hardware interrupts.

Command

memleak --combined-only -z 64 -Z 64

It reported memory leaks in the USB driver:

640 bytes in 10 allocations from stack
__kmalloc+0x178 [kernel]
__kmalloc+0x178 [kernel]
xhci_urb_enqueue+0x140 [kernel]
usb_hcd_submit_urb+0x5e0 [kernel]
usb_submit_urb+0x3e4 [kernel]
usb_start_wait_urb+0x54 [kernel]
usb_control_msg+0xb4 [kernel]

After debugging, however, we found that this is a false positive from the memleak tool. Debug patch:

diff --git a/tools/memleak.py b/tools/memleak.py
index 5399019..1aa45fd 100755
--- a/tools/memleak.py
+++ b/tools/memleak.py
@@ -226,6 +226,8 @@ static inline int gen_alloc_exit(struct pt_regs *ctx) {
 static inline int gen_free_enter(struct pt_regs *ctx, void *address) {
         u64 addr = (u64)address;
         struct alloc_info_t *info = allocs.lookup(&addr);
+
+       bpf_trace_printk("Edward:free address:%x\\n", address);
         if (info == 0)
                 return 0;

Reported leak: allocation of memory e2379a00

  RTW_CMD_THREAD-1984  [000] ....   681.302707: kmalloc: call_site=c0779cdc ptr=e2379a00 bytes_req=40 bytes_alloc=64 gfp_flags=GFP_NOIO|__GFP_ZERO

Freeing of memory e2379a00

 HTTPREQUEST_MAN-9090  [002] ...2   681.302707: tracing_mark_write: Edward:free address:e94ff480
 HTTPREQUEST_MAN-9090  [002] ...2   681.302714: tracing_mark_write: Edward:free address:e956a600
 HTTPREQUEST_MAN-9090  [002] ...2   681.302721: tracing_mark_write: Edward:free address:ecac50c0
 HTTPREQUEST_MAN-9090  [002] ...2   681.302727: tracing_mark_write: Edward:free address:edb9b6c0
 HTTPREQUEST_MAN-9090  [002] d.h6   681.302727: xhci_urb_free_priv <-xhci_irq
 HTTPREQUEST_MAN-9090  [002] d.h5   681.302727: xhci_urb_free_priv: Edward:xhci_urb_free_priv:1906 free priv
 HTTPREQUEST_MAN-9090  [002] d.h5   681.302727: kfree: call_site=c077da78 ptr=e2379a00
 HTTPREQUEST_MAN-9090  [002] d.h5   681.302727: kfree: call_site=c077da80 ptr=e29f6f40
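
The kmalloc/kfree trace events above can be cross-checked mechanically. Here is a minimal sketch, assuming the ftrace line format shown in this post; the helper name outstanding_pointers is made up for illustration and is not part of bcc:

```python
import re

# Matches ftrace kmalloc/kfree event lines in the format shown in this post.
EVENT_RE = re.compile(r"\b(kmalloc|kfree): .*?\bptr=([0-9a-f]+)")


def outstanding_pointers(trace_lines):
    """Return pointers that were kmalloc'ed but never kfree'd in the trace."""
    live = {}
    for line in trace_lines:
        match = EVENT_RE.search(line)
        if not match:
            continue  # not a kmalloc/kfree event line
        event, ptr = match.groups()
        if event == "kmalloc":
            live[ptr] = line
        else:
            live.pop(ptr, None)  # ignore frees of pointers we never saw
    return sorted(live)


trace = [
    "RTW_CMD_THREAD-1984  [000] ....   681.302707: kmalloc: call_site=c0779cdc "
    "ptr=e2379a00 bytes_req=40 bytes_alloc=64 gfp_flags=GFP_NOIO|__GFP_ZERO",
    "HTTPREQUEST_MAN-9090  [002] d.h5   681.302727: kfree: call_site=c077da78 ptr=e2379a00",
    "HTTPREQUEST_MAN-9090  [002] d.h5   681.302727: kfree: call_site=c077da80 ptr=e29f6f40",
]
print(outstanding_pointers(trace))
```

According to the kernel's own trace events, e2379a00 has a matching kfree, so the kernel did free it even though memleak still lists it.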

As you can see, there is no “tracing_mark_write” entry after the kfree events, which means the attached function gen_free_enter was not called for them. At timestamp 681.302727, gen_free_enter had not finished yet when it was interrupted by xhci_irq, so the kfree calls made during xhci_irq did not trigger gen_free_enter().

This causes the incorrect result: e2379a00 is never removed from the allocs map, so it is reported as a leak.
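
To illustrate the sequence, here is a toy model in Python. It is only a sketch of the behavior we observed, not of how the kernel implements it: the ToyTracker class and its per-CPU busy flag are hypothetical, and the assumption is that an event arriving while a handler is already running on the same CPU is skipped.

```python
class ToyTracker:
    """Toy model of memleak's allocs map plus a per-CPU 'handler busy' flag."""

    def __init__(self, num_cpus):
        self.allocs = {}                 # address -> size
        self.busy = [False] * num_cpus   # True while a handler runs on that CPU

    def on_alloc(self, addr, size):
        self.allocs[addr] = size

    def on_free(self, cpu, addr, irq_frees=()):
        # Assumption: an event that arrives while a handler is already
        # running on the same CPU is skipped instead of being handled.
        if self.busy[cpu]:
            return False
        self.busy[cpu] = True
        try:
            # A hardware interrupt fires in the middle of the handler and
            # frees more memory on this CPU.
            for nested_addr in irq_frees:
                self.on_free(cpu, nested_addr)
            self.allocs.pop(addr, None)
        finally:
            self.busy[cpu] = False
        return True


tracker = ToyTracker(num_cpus=4)
tracker.on_alloc(0xE2379A00, 64)
tracker.on_alloc(0xEDB9B6C0, 64)
# CPU 2 is handling the free of edb9b6c0 when xhci_irq frees e2379a00.
tracker.on_free(2, 0xEDB9B6C0, irq_frees=[0xE2379A00])
print([hex(addr) for addr in tracker.allocs])
```

In this model the free of e2379a00 is dropped, so the address stays in the map even though the memory was really freed.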

Please let me know if I have misunderstood anything.

Thanks!

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 19 (13 by maintainers)

Most upvoted comments

Thanks @willfindlay. However, I don't think a per-CPU array can work for the memleak bcc tool, because different CPUs can handle the same address. For example: CPU 0 allocates address 0x200, then an ISR on CPU 1 frees address 0x200.
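
A small Python sketch of that example, with a plain dictionary per CPU standing in for a per-CPU BPF map (the function names are hypothetical and only for illustration):

```python
from collections import defaultdict

# One private table per CPU, standing in for a per-CPU BPF map.
per_cpu_allocs = defaultdict(dict)


def record_alloc(cpu, addr, size):
    per_cpu_allocs[cpu][addr] = size


def record_free(cpu, addr):
    """Remove addr from this CPU's table; return True if it was found."""
    return per_cpu_allocs[cpu].pop(addr, None) is not None


record_alloc(0, 0x200, 64)     # CPU 0 allocates address 0x200
found = record_free(1, 0x200)  # ISR on CPU 1 frees address 0x200
print(found, per_cpu_allocs[0])
```

The free on CPU 1 cannot see the entry that CPU 0 recorded, so the entry for 0x200 is never removed and would be reported as a leak.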

And

*		* Tracing programs and socket filter programs cannot use
*		  **bpf_spin_lock**\ () due to insufficient preemption checks
*		  (but this may change in the future).

This limitation has been in place since v5.1, so I suspect there is a serious underlying problem. Perhaps there are too many different cases to handle, which makes it hard to implement in tracing programs?

I have an idea that might solve this problem in a simple way. I will get back to you after I have tried it.