rbspy: bug: Early exits, duration parameter not respected in some cases

In some cases, we see that rbspy exits early, both with and without the --duration parameter.

Without the duration parameter, it can sometimes exit with an exit code of 0. The target process has not exited, and there is no apparent error in the output.

With the duration parameter, we can sometimes see it exit well before the duration has elapsed:

$ ./rbspy-x86_64-unknown-linux-gnu record -p 908389 -d 120

Time since start: 31s. Press Ctrl+C to stop.
Summary of profiling data so far:
% self  % total  name
 81.04    81.14  select [c function] - (unknown)
  9.38     9.38  pop [c function] - (unknown)
  2.42     2.42  sleep [c function] - (unknown)
  0.77     0.77  wait_readable [c function] - (unknown)
  0.77     0.77  initialize [c function] - (unknown)
  0.58     0.58  _query [c function] - (unknown)
  0.45     2.00  new [c function] - (unknown)
  0.19     0.52  initialize - /artifacts/bundle/ruby/2.7.0/gems/graphql-1.12.13/lib/grap
  0.19     0.19  tcp_listener_stats [c function] - (unknown)
  0.13     0.13  split [c function] - (unknown)
  0.10     0.10  match? [c function] - (unknown)
  0.06     5.58  public_send [c function] - (unknown)
  0.06     3.93  block in run - /app/app/models/engine.rb
  0.06     1.39  block in coerce_into_values - /artifacts/bundle/ruby/2.7.0/gems/graphql
  0.06     0.68  synchronize [c function] - (unknown)
  0.06     0.48  block in rules_from_db - /app/app/models/ordered_ruleset.rb
  0.06     0.06  legacy_connection_handling - /artifacts/bundle/ruby/2.7.0/gems/activere
  0.06     0.06  inheritance_column - /artifacts/bundle/ruby/2.7.0/gems/activerecord-6.1
  0.06     0.06  eval [c function] - (unknown)
  0.06     0.06  clock_gettime [c function] - (unknown)
1.8% (56/3109) of stack traces were sampled late because we couldn't sample at expected rate, results may be inaccurate. Current rate: 100. Try sampling at a lower rate with `--rate`.
Wrote raw data to /root/.cache/rbspy/records/rbspy-2021-07-22-1cem3j2wTP.raw.gz
Writing formatted output to /root/.cache/rbspy/records/rbspy-2021-07-22-yrFdFN7bCV.flamegraph.svg

In this example the target pid was still alive after rbspy exited.

I would expect that if no duration parameter is given, that rbspy waits until it is interrupted. If a duration parameter is given, it should wait until the duration has elapsed before exiting.

Any ideas @acj ?

About this issue

  • Original URL
  • State: open
  • Created 3 years ago
  • Comments: 27 (25 by maintainers)

Commits related to this issue

Most upvoted comments

The upstream fix landed, and I just released 0.8.1. Please give it a try when you can. If the EPERM issue still appears, I think we can simplify #337 to specifically handle that case.

I updated #337 and included the test program @acj posted above. It seems this also exposed another case of spurious exits, but from EPERM, which it also appears is being swallowed in the existing subprocess test.

I’m having a pretty similar issue, rbspy will detach thinking the process has ended. This is confirmed by setting RUST_LOG=debug, which gives me the following:

$ sudo RUST_LOG=debug ./rbspy record --raw-file rbspy-XXX-$(date +"%FT%H%M%S").tar --format speedscope --file rbspy-XXX-$(date +"%FT%H%M%S").json --pid 25770
...
[2021-07-23T20:31:17Z DEBUG remoteprocess::linux] detached from thread 25799
[2021-07-23T20:31:17Z DEBUG remoteprocess::linux] detached from thread 25800
[2021-07-23T20:31:17Z DEBUG remoteprocess::linux] detached from thread 25801
[2021-07-23T20:31:17Z DEBUG remoteprocess::linux] detached from thread 39337
[2021-07-23T20:31:17Z DEBUG remoteprocess::linux] detached from thread 39351
[2021-07-23T20:31:17Z DEBUG remoteprocess::linux] detached from thread 39385
[2021-07-23T20:31:17Z DEBUG remoteprocess::linux] detached from thread 39387
[2021-07-23T20:31:17Z DEBUG remoteprocess::linux] detached from thread 39388
[2021-07-23T20:31:17Z DEBUG remoteprocess::linux] detached from thread 5186
[2021-07-23T20:31:17Z DEBUG remoteprocess::linux] detached from thread 14455
[2021-07-23T20:31:17Z DEBUG remoteprocess::linux] detached from thread 25568
[2021-07-23T20:31:17Z DEBUG rbspy] Process 25770 ended
Wrote raw data to rbspy-XXX-2021-07-23T203052.tar
Writing formatted output to rbspy-XXX-2021-07-23T203052.json

I’ll try to add some additional logging to a local build to get a better idea of what’s happening.