symbolic: Project Metabug: Kill Breakpad
High-level goal: improve rust-minidump (and related libraries) to the point that it can replace all the uses of breakpad in Sentry and Firefox.
For now this should be restricted to the scope of:
- x86, x64, ARM, ARM64
- Windows, Android, MacOS, Linux, (iOS?)
Subtasks
NOTE: Larger tasks are checked off even if they have incomplete subtasks to indicate that they are complete for the purposes of the current milestone.
- Ensure rust-minidump can parse and expose all the minidump details we rely on
- Replace the derelict breakpad-symbols subcrate with a new symbolication implementation
- Integrate symbolic into rust-minidump (https://github.com/luser/rust-minidump/issues/159)
- Support grabbing pre-processed symbols from Mozilla’s tecken servers
- Support grabbing symbols for local system libraries (drivers) on crashing clients? (minidump-analyzer)
- Complete dump_syms support of the various unwinding info formats (via symbolic):
- Potentially create a new binary intermediate representation to replace breakpad’s text format
- Implement Microsoft PDB Exception Tables
- Implement DWARF CFI
- Handle the Hard Cases? (may be inexpressible by breakpad symbols)
- Implement Apple Compact Unwind Info
- land base impl in symbolic (https://github.com/getsentry/symbolic/pull/372/)
- upstream into goblin? (https://github.com/m4b/goblin/pull/271)
- ARM64 opcodes
- x86/x64 Stack-Indirect opcodes(?) (don’t seem to occur in firefox)
- Implement an (offline) unwinder
- Potentially make a new independent crate for this, instead of being in minidump-processor
- ARM stack walker
- Scanning
- Frame-pointer-based
- CFI-based
- x86/x64 stack walker
- Scanning (https://github.com/luser/rust-minidump/pull/145)
- Frame-pointer-based (https://github.com/luser/rust-minidump/pull/145)
- CFI-based
- Can handle native unwinding tables (minidump-analyzer)
- PDB
- DWARF CFI
- Compact Unwind Info
- Can run online (moz-stackwalk)
- Can be a backtracer (moz-stackwalk)
- Can implement the panic! usecase (handling personality/lsdas to run dtors, catch_panic) (no use, just cool)
- nostd compatible (don’t allocate!) (no use, just cool)
- Implement client-side minidump generation (Bugzilla#1588530) (moz-breakpad-client, sentry-breakpad-client)
- Can invoke native windows minidump APIs
- Can generate fake minidump on Linux
- Integrate minidump_writer_linux
- Can generate fake minidump on MacOS
- Can generate fake minidump on Android
- Can generate fake minidump on iOS? (only Sentry would need this?)
The Context
Minidumps are a Microsoft-designed format for more compact dumps of a process’s state when it crashes, notably including full memory dumps of every thread’s stacks/registers and mapped code modules (libraries that are linked in and what addresses they were mapped to).
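To make the format a little more concrete: a minidump file begins with a fixed header whose first four 32-bit fields are a signature (the ASCII bytes "MDMP"), a version, the number of streams, and the file offset of the stream directory. The sketch below reads just those fields using only the standard library. It assumes a little-endian dump, and `MinidumpHeader`, `read_u32_le`, and `parse_header` are hypothetical names made up for illustration; this is not rust-minidump's API.

```rust
/// The first four fields of the minidump header (illustrative type).
#[derive(Debug, PartialEq)]
struct MinidumpHeader {
    version: u32,
    stream_count: u32,
    stream_directory_rva: u32,
}

/// The bytes "MDMP" read as a little-endian u32.
const MINIDUMP_SIGNATURE: u32 = 0x504d_444d;

/// Reads a little-endian u32 at `offset`, or None if out of bounds.
fn read_u32_le(bytes: &[u8], offset: usize) -> Option<u32> {
    let raw = bytes.get(offset..offset + 4)?;
    Some(u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]))
}

/// Parses the start of a minidump, rejecting anything without the signature.
fn parse_header(bytes: &[u8]) -> Option<MinidumpHeader> {
    if read_u32_le(bytes, 0)? != MINIDUMP_SIGNATURE {
        return None;
    }
    Some(MinidumpHeader {
        version: read_u32_le(bytes, 4)?,
        stream_count: read_u32_le(bytes, 8)?,
        stream_directory_rva: read_u32_le(bytes, 12)?,
    })
}
```

Everything else in the file (thread lists, stack memory, module lists) is reached through the stream directory that this header points to.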
Windows has native APIs for generating minidumps, but this is a feature that’s desirable on other platforms, so google-breakpad was created to generate “fake” minidumps on other platforms and process them all uniformly. The most important output of this process is backtraces for every thread, but additional context stored in minidumps may be useful for debugging weird stuff like “the user’s antivirus DLL-injected itself into our process and messed everything up” or “oh look the last syscall failed right before we crashed”.
Both Firefox and Sentry rely on breakpad for minidump generation and handling. Unfortunately, breakpad is written in dangerous C++ and basically abandoned by Google. Mozilla doesn’t bother upstreaming our patches anymore, and it’s too much work to maintain it.
Usecases
Here’s the places where we use breakpad now that should work with a replacement. Each has a codename so that tasks/milestones can reference them.
Mozilla Usecases
- minidump-stackwalk: On the server-side, Mozilla uses breakpad in minidump-stackwalk to process minidump-based crash reports for socorro.
- moz-breakpad-client: On the client-side, Mozilla uses breakpad in our crash-reporter to generate minidumps. For content-process (~tab) crashes, the main process does this work out-of-crashing-process. For main-process (full browser) crashes, the main-process does this work in-crashing-process. Ideally we would have a separate crash-reporting process on the side that monitors the others so that all our handling can be out-of-crashing-process.
- minidump-analyzer: On the client-side, Mozilla uses breakpad in our minidump-analyzer to try to analyze the contents of the minidump using the client machine’s knowledge of its own system libraries and any local debuginfo we ship with firefox. This allows us to get more accurate symbolication/unwinding. (This also includes some of our own adhoc symbolication/unwinding code, which is buggy and ideally would be replaced.)
- moz-stackwalk: As a stretch-goal, this work would ideally also replace the need for moz-stackwalk (our own runtime backtracer for debug build backtraces and profiler probing) and fix-stacks (cleans up moz-stackwalk’s output using native symbols).
Sentry Usecases
- symbolicator: On the server-side, Sentry uses breakpad inside of symbolicator to process minidumps and extract a meaningful stack trace.
- sentry-breakpad-client: On the client-side, sentry-native uses crashpad or alternatively breakpad to create minidumps of the crashing process to send over for server-side post-processing.
Microsoft Usecases?
TBD!
Current Roadmap
Milestone 1 - minidump-stackwalk
Metabug: https://github.com/luser/rust-minidump/issues/153
Mozilla would like to first get the minidump-stackwalk usecase working, as it’s the simplest. It is also very high traffic (performance matters) and processes user-provided data on our servers (security matters).
minidump-stackwalk only needs to handle pre-processed symbols from our symbol servers (i.e. the breakpad text format), and is operating completely offline from where the minidump was generated.
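For a feel of what consuming the breakpad text format involves, here is a minimal sketch that parses `FUNC` records (`FUNC [m] address size parameter_size name`, with hexadecimal numbers) and looks up which function covers a module-relative address. `FuncRecord`, `parse_func`, and `lookup` are hypothetical names for illustration only. A real symbolizer would also need the `MODULE`, `FILE`, `PUBLIC`, line, and `STACK` records, and a faster lookup than a linear scan.

```rust
/// One `FUNC` record from a breakpad text symbol file (illustrative type).
#[derive(Debug, PartialEq)]
struct FuncRecord {
    address: u64, // module-relative start address
    size: u64,    // length of the function in bytes
    name: String,
}

/// Parses `FUNC [m] address size parameter_size name`; numbers are hex.
fn parse_func(line: &str) -> Option<FuncRecord> {
    let rest = line.strip_prefix("FUNC ")?;
    // The optional `m` marks code that multiple symbols refer to.
    let rest = rest.strip_prefix("m ").unwrap_or(rest);
    // Names may contain spaces, so split off only the three numeric fields.
    let mut fields = rest.splitn(4, ' ');
    let address = u64::from_str_radix(fields.next()?, 16).ok()?;
    let size = u64::from_str_radix(fields.next()?, 16).ok()?;
    let _parameter_size = u64::from_str_radix(fields.next()?, 16).ok()?;
    let name = fields.next()?.to_string();
    Some(FuncRecord { address, size, name })
}

/// Finds the function whose address range contains `addr`.
fn lookup(funcs: &[FuncRecord], addr: u64) -> Option<&FuncRecord> {
    funcs
        .iter()
        .find(|f| addr >= f.address && addr - f.address < f.size)
}
```

Feeding every line of a symbol file through `parse_func` and keeping the `Some` results yields the table that `lookup` searches; lines of any other record type are simply skipped.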
The hardest part will be generating backtraces for all the threads, which requires a complete offline unwinder.
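As a rough illustration of what "offline" unwinding means, the sketch below walks an x64 stack using only the frame-pointer technique: it never touches a live process, just a copy of the stack bytes and the starting registers as a minidump would provide them. It assumes every function keeps the conventional `rbp` chain (saved `rbp` at `[rbp]`, return address at `[rbp + 8]`), which is exactly the assumption that fails in practice and makes CFI and scanning necessary. All type and function names here are hypothetical and not taken from rust-minidump.

```rust
use std::convert::TryFrom;

/// A captured stack: raw bytes plus the address of the first byte.
struct StackMemory<'a> {
    base: u64,
    bytes: &'a [u8],
}

impl StackMemory<'_> {
    /// Reads a little-endian u64 at an absolute address, if it is in range.
    fn read_u64(&self, addr: u64) -> Option<u64> {
        let start = usize::try_from(addr.checked_sub(self.base)?).ok()?;
        let raw = self.bytes.get(start..start.checked_add(8)?)?;
        let mut buf = [0u8; 8];
        buf.copy_from_slice(raw);
        Some(u64::from_le_bytes(buf))
    }
}

#[derive(Debug, PartialEq, Clone, Copy)]
struct Frame {
    instruction: u64,   // rip (for callers: the return address)
    frame_pointer: u64, // rbp
}

/// Recovers the caller of `callee` from the saved rbp and return address.
fn caller_of(stack: &StackMemory<'_>, callee: Frame) -> Option<Frame> {
    let caller_fp = stack.read_u64(callee.frame_pointer)?;
    let return_addr = stack.read_u64(callee.frame_pointer.checked_add(8)?)?;
    if return_addr == 0 {
        return None;
    }
    Some(Frame { instruction: return_addr, frame_pointer: caller_fp })
}

/// Follows the rbp chain, returning at most `max_frames` frames
/// (always including `first`).
fn walk_frame_pointers(stack: &StackMemory<'_>, first: Frame, max_frames: usize) -> Vec<Frame> {
    let mut frames = vec![first];
    let mut current = first;
    while frames.len() < max_frames {
        let caller = match caller_of(stack, current) {
            Some(frame) => frame,
            None => break,
        };
        frames.push(caller);
        // Stacks grow down, so a caller's rbp must be above its callee's;
        // otherwise the chain is finished or corrupt.
        if caller.frame_pointer <= current.frame_pointer {
            break;
        }
        current = caller;
    }
    frames
}
```

A fuller unwinder would try CFI first, fall back to frame pointers like this, and only then resort to scanning the stack for plausible return addresses.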
Milestone 2 - minidump-analyzer
TBD, may choose different goal based on how Milestone 1 goes
Milestone 3 - symbolicator
TBD, may choose different goal based on how Milestone 2 goes
About this issue
- State: closed
- Created 3 years ago
- Reactions: 5
- Comments: 20 (18 by maintainers)
We removed breakpad from our own symbolicator service 🎉, and the code in this repo is now behind a feature flag and will be removed completely in the next major release.