esp-idf: Add the possibility to register a custom panic handler callback to intercept app crashes (IDFGH-3146)

Feature request:

I wish I could set a custom panic handler callback from my main application in order to get some additional info and intercept crashes from my main component.

What I’d like to obtain

I’m not sure whether there is another, better way to do what I’m trying to achieve. I’m currently working on esp-idf v4.0, and it looks like this feature is still missing even in the releases currently under development.

Applying these changes to each update is time-consuming, and maybe other users could benefit from this. I would love to have it out of the box.

My current solution. Tested and working

I have edited the esp32 and freertos components to get the described feature; however, I wish you would consider adding something like this as part of the framework.


Changes (additions) made to esp-idf v4.0 to get the feature in question:

Note: the links to the files I’ve changed only indicate roughly where each change was applied.

esp-idf/components/esp32/Kconfig:

config ESP32_PANIC_CALLBACK
        bool "Support registration of a user defined callback for the panic handler"
        default y
        help
            Use xt_set_error_handler_callback() to register a custom callback.
            The callback is called by the common error handler, so it catches exceptions,
            panics and abort() calls.

esp-idf/components/esp32/panic.c:

#if CONFIG_ESP32_PANIC_CALLBACK
/*
* Custom error handler callback registration.
*/
xt_error_handler_callback customErrorHandler = NULL;
xt_error_handler_callback xt_set_error_handler_callback(xt_error_handler_callback f)
{
  xt_error_handler_callback old = customErrorHandler;
  customErrorHandler = f;
  return old;
}
#endif //CONFIG_ESP32_PANIC_CALLBACK

esp-idf/components/esp32/panic.c commonErrorHandler():

    #if CONFIG_ESP32_PANIC_CALLBACK
    if (customErrorHandler) {
        disableAllWdts();
        customErrorHandler(frame, core_id, abort_called);
        reconfigureAllWdts();
    }
    #endif

components/freertos/include/freertos/xtensa_api.h:

/* Typedef for C-callable error handler callback function */
typedef void (*xt_error_handler_callback)(XtExcFrame *, int core_id, bool is_abort);

/*
-------------------------------------------------------------------------------
Call this function to set a callback for the standard error handler.
The callback will be called by the commonErrorHandler on all errors.

    f        - Callback function address, NULL to uninstall callback.

The callback will be passed a pointer to the exception frame, which is created
on the stack of the thread that caused the exception, the core id and
a bool to signal if abort() has been called.

The callback is called with watchdogs disabled.
-------------------------------------------------------------------------------
*/
extern xt_error_handler_callback xt_set_error_handler_callback(xt_error_handler_callback f);
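For readers who want to see how an application might use the proposed API, here is a minimal sketch. It assumes the patched `xtensa_api.h` from this issue (the function is not part of stock ESP-IDF). The names `on_panic`, `install_panic_callback`, `g_crash_note` and `host_simulate_panic` are hypothetical, and the `#else` branch holds host-only stand-ins so the logic can be compiled off-target.

```c
#include <stdbool.h>
#include <stddef.h>

#ifdef ESP_PLATFORM
/* On target: the header as patched in this issue (not stock ESP-IDF). */
#include "freertos/FreeRTOS.h"
#include "freertos/xtensa_api.h"
#else
/* Host-only stand-ins so the sketch builds off-target; simplified types. */
typedef struct { long pc; } XtExcFrame;
typedef void (*xt_error_handler_callback)(XtExcFrame *, int core_id, bool is_abort);
static xt_error_handler_callback s_registered = NULL;

xt_error_handler_callback xt_set_error_handler_callback(xt_error_handler_callback f)
{
    xt_error_handler_callback old = s_registered;
    s_registered = f;
    return old;
}

/* Hypothetical helper: mimics what commonErrorHandler() does with the callback. */
void host_simulate_panic(XtExcFrame *frame, int core_id, bool is_abort)
{
    if (s_registered) {
        s_registered(frame, core_id, is_abort);
    }
}
#endif

/* Hypothetical record of the last crash; plain RAM here for simplicity. */
typedef struct {
    bool seen;
    int  core_id;
    bool is_abort;
    long pc;
} crash_note_t;

crash_note_t g_crash_note;
static xt_error_handler_callback s_previous = NULL;

/* Keep the callback tiny: no logging, no heap, no FreeRTOS calls. */
static void on_panic(XtExcFrame *frame, int core_id, bool is_abort)
{
    g_crash_note.seen = true;
    g_crash_note.core_id = core_id;
    g_crash_note.is_abort = is_abort;
    g_crash_note.pc = frame ? frame->pc : 0;
    if (s_previous) {
        s_previous(frame, core_id, is_abort);   /* chain to an earlier callback */
    }
}

/* Call once early in app_main(); calling it again is harmless. */
void install_panic_callback(void)
{
    xt_error_handler_callback old = xt_set_error_handler_callback(on_panic);
    if (old != on_panic) {
        s_previous = old;   /* never chain to ourselves */
    }
}
```

Because the setter returns the previous callback, several components can chain their handlers instead of overwriting each other. The sketch only does this for handlers that were registered earlier.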

Thank you in advance.

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 31 (15 by maintainers)

Most upvoted comments

@nicklasb I am sorry, I just realized I haven’t replied to your previous comment.

If ESP-IDF had a simple callback, the framework would just need code for that functionality and wouldn’t have to require users to create specialized project/linker level settings to get its error handling.

Link-time wrapping can be completely encapsulated in your custom component/library; I don’t think anything would have to be done by the application developer. Basically, you can specify the additional linker arguments using a target_link_options call in your component CMakeLists file. All the user will have to do is add your component to their project.
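A rough sketch of what such a component could contain, under several assumptions. It assumes `esp_panic_handler` is the symbol worth wrapping, which depends on the ESP-IDF version, and it treats that function's argument as an opaque `void *`. `board_heater_off`, `panic_hook_set` and `panic_hook_install_default` are hypothetical names. The `#else` branch is a host-only stand-in so the control flow can be exercised without the linker flag.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * In the component's CMakeLists.txt (shown as a comment to keep one language):
 *   target_link_options(${COMPONENT_LIB} INTERFACE "-Wl,--wrap=esp_panic_handler")
 * Assumption: esp_panic_handler() is the symbol to intercept; check the panic
 * sources of the ESP-IDF version in use before relying on this name.
 */

#ifdef ESP_PLATFORM
/* Resolved by the linker to the original, un-wrapped function. */
void __real_esp_panic_handler(void *info);
/* Hypothetical board function supplied by the application. */
void board_heater_off(void);
#else
/* Host-only stand-ins so the sketch can be compiled and exercised off-target. */
bool g_heater_on = true;
int  g_real_handler_calls = 0;
void board_heater_off(void) { g_heater_on = false; }
void __real_esp_panic_handler(void *info) { (void)info; g_real_handler_calls++; }
#endif

typedef void (*panic_hook_t)(void *info);
static panic_hook_t s_hook = NULL;

/* Register the single hook to run before the stock panic handling. */
void panic_hook_set(panic_hook_t hook)
{
    s_hook = hook;
}

/* Example hook: put the board into a safe state, nothing else. */
static void heater_off_hook(void *info)
{
    (void)info;
    board_heater_off();
}

void panic_hook_install_default(void)
{
    panic_hook_set(heater_off_hook);
}

/* With --wrap, every call to esp_panic_handler() lands here instead. */
void __wrap_esp_panic_handler(void *info)
{
    if (s_hook) {
        s_hook(info);               /* run user code first */
    }
    __real_esp_panic_handler(info); /* then continue with the stock handler */
}
```

The hook runs before the stock handler because the stock handler normally ends in a reset and may never return. Anything called from the hook should be treated as running in a broken system: no FreeRTOS calls, no heap, and ideally only direct register writes.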

Then don’t. Just state that one needs to be very careful and assume that basically no FreeRTOS functionality works. No one will sue you if you have disclaimed properly there.

How useful would such a restrictive callback be for your use case, and would it enable anything that Core Dump currently doesn’t? As I mentioned above to the OP, we would very much rather add the missing data to the Core Dump, and then you can handle the crash after a restart in the sane execution environment of app_main.

(I understand that shutting down GPIOs or peripherals on reset is indeed something we don’t support. We’ll look into adding an option to do that.)

Then don’t. Just state that one needs to be very careful and assume that basically no FreeRTOS functionality works. No one will sue you if you have disclaimed properly there.

I understand your point. Perhaps not to the point of suing, but we did face issues with customers’ use of dangerous features in the past, hence my hesitation.

Yes, we will look into adding a custom callback as an option. I would still like to understand what requirements this satisfies, because if we understand the requirements, we would prefer to implement the missing features in Core dump / GDB stub. Regarding adding custom logic, this is currently possible without modifications to IDF code, using the linker wrapping feature. For instance, the Memfault SDK uses it to override the IDF Core dump implementation with their custom core dumps: https://github.com/memfault/memfault-firmware-sdk/blob/70f2d4ed6884d907d4b39da99434b762209ba568/ports/esp_idf/memfault/CMakeLists.txt#L79.

“simple solution”

Adding an effing callback is a pretty simple solution as well. 😃

It creates a lot of other opportunities, like creating higher-level frameworks that automatically report issues, which is what I am doing.

TBH, I have not seen much in the way of strong arguments against it here, rather alternative solutions to specific problems, and that is another discussion, helpful as it may be.

Assuming you keep the state in RTC memory, this is possible with a reset (not just deep sleep) as well, if you use RTC_NOINIT_ATTR.

Argh! I mixed those up, and used the RTC_DATA_ATTR instead! I thought it was so weird that it didn’t work for me any longer as I had it working before.

One note: the documentation around RTC_DATA_ATTR is a bit confusing; it would be better if it were more specific and less about stubs and so on. It doesn’t say that this data is reset, in the place where you would expect that information. In my world, that would be the most usual use case for the RTC memory. And when googling RTC memory, I ended up there.
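To make the distinction concrete: `RTC_DATA_ATTR` variables are re-initialized on every boot except a wake from deep sleep, while `RTC_NOINIT_ATTR` variables are left untouched by startup code. A minimal sketch of keeping a crash counter across resets follows. The `crash_state_*` names and the magic value are hypothetical, and the macro is stubbed out when not building under ESP-IDF.

```c
#include <stdint.h>

#ifdef ESP_PLATFORM
#include "esp_attr.h"          /* RTC_NOINIT_ATTR */
#else
#define RTC_NOINIT_ATTR        /* host-only stand-in: ordinary static storage */
#endif

/* Arbitrary marker used to tell "valid state" from power-on garbage. */
#define CRASH_STATE_MAGIC 0xC0FFEE42u

typedef struct {
    uint32_t magic;
    uint32_t crash_count;
} crash_state_t;

/* Not initialized at boot, so the content survives a software or panic reset. */
static RTC_NOINIT_ATTR crash_state_t s_crash_state;

/* Call early in app_main(): only reset the state if it was never set up. */
void crash_state_init(void)
{
    if (s_crash_state.magic != CRASH_STATE_MAGIC) {
        s_crash_state.magic = CRASH_STATE_MAGIC;
        s_crash_state.crash_count = 0;
    }
}

/* Call from wherever a crash is detected or about to be handled. */
void crash_state_record(void)
{
    s_crash_state.crash_count++;
}

uint32_t crash_state_count(void)
{
    return s_crash_state.crash_count;
}
```

The magic value matters because no-init memory holds arbitrary content after a cold power-on; without the check, the counter would start from garbage.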

Could you please expand on this a bit? By portable do you mean “portable to another RTOS”? Seems like the panic handler callback wouldn’t be portable, either?

If ESP-IDF had a simple callback, the framework would just need code for that functionality and wouldn’t have to require users to create specialized project/linker level settings to get its error handling. Basically the suggested solution makes 3rd party libraries using the callback much more difficult to use and spread.

Panic handling is quite a delicate task. … Therefore documenting the exact rules for panic handler callbacks (which IDF APIs are allowed to be called and which aren’t) is not an easy task.

Then don’t. Just state that one needs to be very careful and assume that basically no FreeRTOS functionality works. No one will sue you if you have disclaimed properly there. And if you find out that something works reliably you can document that later. Using hardware the way it is not supposed to be used has led to a ton of great stuff, the entire 64 demo scene and almost all its games, for example.

This is the first google result for “esp32 custom panic handler”.

+1

I’d like to turn the heater off if something goes wrong. As simple as GPIO set 0.

This is the first google result for “esp32 custom panic handler”.

After all the other panic handlers run, I want to:

  • set the color of an LED
  • buzz a piezo
  • (ideally) reboot after short delay

It seems this is still not easily possible? The memfault trick might cover the LED and piezo, but it’s pretty involved to have to create a new component with special linking.

edit (March 2024): I just run all my “panic” code after reboot. Simpler. You can still access the panic info.

@crisreimberg The feature to obtain the summary from the core dump is implemented. You need to call esp_core_dump_image_get on startup to check whether a core dump image exists (optionally after checking the reset reason), and then call esp_core_dump_get_summary to obtain the summary information structure.

https://github.com/espressif/esp-idf/blob/c13afea635adec735435961270d0894ff46eef85/components/espcoredump/include/esp_core_dump.h#L140-L147
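The startup sequence described above could look roughly like this. It is a sketch that assumes core dumps to flash are enabled in a configuration where the summary API is available. `last_crash_t` and `last_crash_read` are hypothetical names, and the `#else` branch contains simplified host-only stand-ins, not the real ESP-IDF definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#ifdef ESP_PLATFORM
#include "esp_system.h"        /* esp_reset_reason() */
#include "esp_core_dump.h"     /* esp_core_dump_image_get(), esp_core_dump_get_summary() */
#else
/* Host-only stand-ins (simplified, NOT the real definitions). */
typedef int esp_err_t;
#define ESP_OK 0
typedef enum { ESP_RST_POWERON = 1, ESP_RST_PANIC = 4 } esp_reset_reason_t;
typedef struct { char exc_task[16]; uint32_t exc_pc; } esp_core_dump_summary_t;
static esp_reset_reason_t esp_reset_reason(void) { return ESP_RST_PANIC; }
static esp_err_t esp_core_dump_image_get(size_t *addr, size_t *size)
{
    *addr = 0; *size = 64; return ESP_OK;
}
static esp_err_t esp_core_dump_get_summary(esp_core_dump_summary_t *s)
{
    strcpy(s->exc_task, "main"); s->exc_pc = 0x400d1234u; return ESP_OK;
}
#endif

/* Hypothetical, application-level view of the previous crash. */
typedef struct {
    char     task[16];
    uint32_t pc;
} last_crash_t;

/* Returns true and fills *out if the last boot ended in a panic with a usable dump. */
bool last_crash_read(last_crash_t *out)
{
    size_t addr = 0, size = 0;

    if (esp_reset_reason() != ESP_RST_PANIC) {
        return false;                       /* optional pre-check, as suggested above */
    }
    if (esp_core_dump_image_get(&addr, &size) != ESP_OK) {
        return false;                       /* no valid core dump image stored */
    }

    /* Allocate on the heap rather than the stack; the real struct can be large. */
    esp_core_dump_summary_t *summary = malloc(sizeof(*summary));
    if (summary == NULL) {
        return false;
    }

    bool ok = (esp_core_dump_get_summary(summary) == ESP_OK);
    if (ok) {
        memcpy(out->task, summary->exc_task, sizeof(out->task));
        out->task[sizeof(out->task) - 1] = '\0';
        out->pc = summary->exc_pc;
    }
    free(summary);
    return ok;
}
```

Calling `last_crash_read()` near the top of `app_main()` lets the LED, buzzer, or reporting logic run in a normal execution environment instead of inside the panic handler. Watchdog resets report a different reset reason, so the pre-check may need widening for those cases.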