esp-idf: Guru Meditation (LoadProhibited) in _Unwind_RaiseException (IDFGH-3388)
Environment
- Development Kit: none
- Module or chip used: ESP32-WROVER
- IDF version (run git describe --tags to find it): v4.0-386-gb0f053d82
- Build System: CMake, idf.py
- Compiler version (run xtensa-esp32-elf-gcc --version to find it): 8.2.0, both 2019r2 and 2020r1
- Operating System: Linux
- Using an IDE?: No
- Power Supply: external 3.3V
Problem Description
Sometimes, our ESP32 code randomly panics with a LoadProhibited error on our custom ESP32 board:
Guru Meditation Error: Core 0 panic'ed (LoadProhibited). Exception was unhandled.
Core 0 register dump:
PC : 0x402d05f0 PS : 0x00060b30 A0 : 0x80296b66 A1 : 0x3ffebcd0
A2 : 0x00000007 A3 : 0x3fff7470 A4 : 0x3fff7490 A5 : 0x3ffdb490
A6 : 0x00000003 A7 : 0x00060023 A8 : 0x802d05f0 A9 : 0x3ffebcb0
A10 : 0x00000000 A11 : 0x3ffebcd0 A12 : 0x00000001 A13 : 0x00000004
A14 : 0x00000003 A15 : 0x00000004 SAR : 0x00000020 EXCCAUSE: 0x0000001c
EXCVADDR: 0x0006003b LBEG : 0x40098a2c LEND : 0x40098a48 LCOUNT : 0x00000000
ELF file SHA256: fbfbb5f5b950a2d7
Backtrace: 0x402d05ed:0x3ffebcd0 0x40296b63:0x3ffebdd0 0x4010ec05:0x3ffebfd0 0x40101bb6:0x3ffec050 0x402d65cd:0x3ffec0f0 0x4023cacd:0x3ffec120 0x4023cb89:0x3ffec160 0x40093ddd:0x3ffec180
Rebooting...
addr2line matches the backtrace addresses we gathered from the logs above with the following call trace:
0x402d05ed: _Unwind_RaiseException at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/gcc/libgcc/unwind.inc:140
0x40296b63: __cxa_throw at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:90
0x4010ec05: hapbridge::set_response_ble_change(unsigned short, unsigned int, blemesh_val_t, std::optional<blemesh_val_t>) at /home/marco/Workspace/our_project/build/../components/proj_homekit/hapbridge_callbacks_notify_ble.cpp:438
0x40101bb6: blemesh_set_report_handler at /home/marco/Workspace/our_project/build/../components/proj_gweventhandlers/blemesh_event_handler.cpp:102 (discriminator 3)
0x402d65cd: handler_execute at /home/marco/Workspace/esp-homekit-sdk/esp-idf/components/esp_event/esp_event.c:147
0x4023cacd: esp_event_loop_run at /home/marco/Workspace/esp-homekit-sdk/esp-idf/components/esp_event/esp_event.c:553 (discriminator 3)
0x4023cb89: esp_event_loop_run_task at /home/marco/Workspace/esp-homekit-sdk/esp-idf/components/esp_event/esp_event.c:115
0x40093ddd: vPortTaskWrapper at /home/marco/Workspace/esp-homekit-sdk/esp-idf/components/freertos/port.c:143
As you can see above, the reported crash site is deep inside GCC’s stack unwinding code, so C++ exceptions seem to be involved somehow. We are almost sure this is not caused by an exception escaping, because there is no mention of std::terminate() or std::abort() being invoked.
We have already run into issues similar to this one several times, and we’ve never been able to pinpoint the exact reason why they happen. We’ve had a hard time reproducing the issue reliably, and we have seen it pop up in several parts of our code; we also noticed that shuffling the code around a bit (e.g. changing the order in which functions are invoked, where exceptions are caught, etc.) helped reduce, but not eliminate, the issue.
In particular, we noticed that when this issue occurs, the situation is often very similar to the following:
- all stack traces have frames 1 (_Unwind_RaiseException in unwind.inc) and 2 (__cxa_throw in eh_throw.cc) identical or very similar to the ones I posted above:
0x402d05ed: _Unwind_RaiseException at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/gcc/libgcc/unwind.inc:140
0x40296b63: __cxa_throw at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:90
- the third frame refers to the last/only statement of a C++ function:
void set_response_ble_change([...]) {
    some_other_function(...); // this is hapbridge_callbacks_notify_ble.cpp:438
}
- the aforementioned function is called from frame 4, inside a try {} catch() block that contains a single statement:
void blemesh_set_report_handler([...]) {
    // ...
    try {
        hapbridge::set_response_ble_change( // this is blemesh_event_handler.cpp:102
            [...]
        );
    } catch (hapbridge::exception &e) {
        ESP_LOGE(...);
    }
}
The project is composed of a lot of components written in C++17, and several of them rely on C++ exceptions.
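For readers who want to experiment with the call pattern described above, here is a minimal host-side sketch of it. Everything in the demo namespace is a hypothetical stand-in: the real hapbridge::exception type, the real arguments, and the body of some_other_function are not shown in this report, so the sketch only mirrors the shape of frames 3 and 4 and is not expected to reproduce the crash by itself.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

namespace demo {

// Hypothetical stand-in for hapbridge::exception (the real type is not shown).
struct exception : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for some_other_function(...): throws when no value is given.
inline void some_other_function(std::optional<int> value) {
    if (!value) {
        throw exception("no value supplied");
    }
}

// Mirrors frame 3: the only statement in the function is a call that may throw.
inline void set_response_ble_change(std::optional<int> value) {
    some_other_function(value);
}

// Mirrors frame 4: a try block with a single statement and a catch by reference.
// Returns the message of the caught exception, or an empty string if nothing was thrown.
inline std::string report_handler(std::optional<int> value) {
    try {
        set_response_ble_change(value);
    } catch (exception &e) {
        return e.what();
    }
    return {};
}

} // namespace demo
```

Calling demo::report_handler(std::nullopt) exercises the throw and catch path, while demo::report_handler(42) takes the path that does not throw.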
Expected Behavior
The system does not crash, or the crash can be clearly tracked to an underlying cause in our application code.
Actual Behavior
The system crashes, and the generated backtrace does not help in resolving the issue.
About this issue
- State: closed
- Created 4 years ago
- Reactions: 3
- Comments: 21
@mcilloni Thanks for the detailed report! We’ve been working on this issue and recently developed a workaround which mitigates this error in all our tests. This workaround should come up soon on master.
The problem does in fact lie inside the libgcc unwinding code, and it occurs while an exception is being caught. In the code that restores the context of the catch block to resume execution from there, registers A4-A7 may get mixed up due to a window-underflow CPU exception. At the point of the exception, the code tries to load from the address stored in register A7, which is 0x60023 in your case and is not valid memory. Hence the LoadProhibited CPU exception.
Here’s the place where the CPU exception happens: https://github.com/espressif/gcc/blob/esp-develop/libgcc/unwind.inc#L140 This is actually this macro: https://github.com/espressif/gcc/blob/esp-develop/libgcc/config/xtensa/unwind-dw2-xtensa.c#L486, which calls the function uw_install_context_1, which installs the context on the stack.
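As a quick sanity check of the explanation above, the register dump in the report can be cross-checked with a few lines of arithmetic. The sketch below uses only values copied from that dump. The helper names are made up for illustration, and reading the result as a load at a fixed offset from A7 is an inference from the numbers, not something taken from the toolchain source.

```cpp
#include <cstdint>

// Values copied verbatim from the register dump in the report.
constexpr std::uint32_t kA7       = 0x00060023;
constexpr std::uint32_t kExcVaddr = 0x0006003b;
constexpr std::uint32_t kExcCause = 0x0000001c;

// Xtensa EXCCAUSE code 28 corresponds to LoadProhibited.
constexpr std::uint32_t kLoadProhibitedCause = 28;

// Hypothetical helper: distance between the faulting address and the value in A7.
constexpr std::uint32_t fault_offset_from_a7() {
    return kExcVaddr - kA7;
}

// Hypothetical helper: true when the cause code denotes a prohibited load.
constexpr bool is_load_prohibited(std::uint32_t exccause) {
    return exccause == kLoadProhibitedCause;
}

// Both facts hold for this dump: the cause is a prohibited load, and the
// faulting address lies 0x18 (24) bytes past the bogus value held in A7.
static_assert(is_load_prohibited(kExcCause), "EXCCAUSE should be LoadProhibited");
static_assert(fault_offset_from_a7() == 0x18, "EXCVADDR should be A7 + 0x18");
```

The small, fixed distance between EXCVADDR and A7 is consistent with a load that uses A7 as its base register, which matches the description of a corrupted A7 being dereferenced.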