zephyr: ESP32-S3/C3 Fatal Exception output not sent when using USB Serial (ACM) Mode

Describe the bug If there is a fatal exception (e.g. stack overflow) when using the USB serial module instead of the UART serial output, then the panic message is not displayed so the system appears to just halt. Debugging is made a lot more difficult since a debugger must be attached to debug the system and a lot of the context information must be manually extracted.

For example, failure stackcheckfailure here calls log_panic(); StackCheckHandler(); and with the shell being used over USB serial, the system appears to just freeze and quit responding:

uart:~$ failure stackcheckfailure 
# The system appears to have locked up at this point

However, if the same command is run over UART0, then everything works correctly:

uart:~$ failure stackcheckfailure 
[00:10:07.585,000] <err> os:  ** FATAL EXCEPTION
[00:10:07.585,000] <err> os:  ** CPU 0 EXCCAUSE 63 (zephyr exception)
[00:10:07.588,000] <err> os:  **  PC 0x40379213 VADDR (nil)
[00:10:07.595,000] <err> os:  **  PS 0x60720
[00:10:07.600,000] <err> os:  **    (INTLEVEL:0 EXCM: 0 UM:1 RING:0 WOE:1 OWB:7 CALLINC:2)
[00:10:07.610,000] <wrn> cbprintf_package: (unsigned) char * used for %p argument. It's recommended to cast it to void * because it may cause misbehavior in certain configurations. String:" **  A0 %p  SP %p  A2 %p  A3 %p" argument:1
[00:10:07.632,000] <err> os:  **  A0 0x8037d7c4  SP 0x3fc9aca0  A2 0x2  A3 0x8b524d00
[00:10:07.641,000] <err> os:  **  A4 0x1  A5 0x8b524d00  A6 (nil)  A7 0x8b524d00
[00:10:07.650,000] <err> os:  **  A8 0x8b524d00  A9 0x8b524d00 A10 (nil) A11 0x8b524d00
[00:10:07.659,000] <err> os:  ** A12 0x8b524d00 A13 0x3fc965cc A14 0x3c001eb8 A15 0x60b20
[00:10:07.669,000] <err> os:  ** LBEG (nil) LEND (nil) LCOUNT (nil)
[00:10:07.677,000] <err> os:  ** SAR 0x15
[00:10:07.682,000] <err> os: >>> ZEPHYR FATAL ERROR 2: Stack overflow on CPU 0
[00:10:07.691,000] <err> os: Current thread: 0x3fc93a80 (shell_uart)
[00:10:07.698,000] <err> os: Halting system

To Reproduce Using an ESP32-S3 board with USB connected to the USB Serial/JTAG connector, change the console to usb_serial.

        chosen {
                zephyr,sram = &sram0;
-               zephyr,console = &uart0;
-               zephyr,shell-uart = &uart0;
+               zephyr,console = &usb_serial;
+               zephyr,shell-uart = &usb_serial;
                zephyr,flash = &flash0;
        };

+&usb_serial {
+       status = "okay";
+};

And then trigger an exception. I called log_panic(); StackCheckHandler(); for the examples.

Expected behavior Exception reports should be printed to the USB serial port.

Impact Causes difficulties in identifying the root cause of failures.

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 20 (6 by maintainers)

Commits related to this issue

Most upvoted comments

Thanks for being so helpful!

To avoid all changes, you will need to create a new branch off of main and then cherry-pick your fix on top and push that. At the command line, it will look like this:

# here are the remotes.  "mygithub" is your personal github branch of the Zephyr repo.  "upstream" is the actual Zephyr repo.
git remote -vv show
mygithub	https://github.com/YOUR_GITHUB_ACCOUNT/zephyr.git (fetch)
mygithub	https://github.com/YOUR_GITHUB_ACCOUNT/zephyr.git (push)
upstream	https://github.com/zephyrproject-rtos/zephyr (fetch)
upstream	https://github.com/zephyrproject-rtos/zephyr (push)

# create a new branch based upon the latest zephyr main
git fetch upstream
git checkout main
git rebase upstream/main
git checkout -b fixes-60825-usb-fatal-exception-output

# cherry-pick your development change onto this branch
git cherry-pick HASH_OF_YOU_CHANGE_FROM_YOUR_DEV_BRANCH

# push the change to your github fork
git push mygithub

# if you make edits, then do a forced push
git push mygithub -f

There are different ways you can do this that are a little less verbose, but this one is easy to follow.

You will eventually also want to run checkpatch on more complex changes (see https://docs.zephyrproject.org/latest/contribute/guidelines.html#coding-style). You can probably skip that for this one until you get the hang of the rest of the process. I have mine setup to run it before pushing FWIW.

Great! @havoc-dlt, can you open a PR?

Well, I’ve never done one before. Consider me an absolute newbie in github collaboration… 😕 I’ll read up on it and give it a try.

I tried both versions. @havoc-dlt’s change works. For some reason @sylvioalves’s version is printing garbage – looks like some characters are getting dropped:

uart:~$ debug stack_panic 
[3> [r>V 0L[Id ia3i[4> xe9[>aa
24 )[>0[>C>500m07>0>00c

@havoc-dlt, Can you check whether the below also fix it? I haven’t tested this yet.

while (!usb_serial_jtag_ll_txfifo_writable() &&
		s_usbserial_timeout < (USBSERIAL_TIMEOUT_MAX_US / 100) &&
		!k_is_in_isr()) {

For convenience, here’s my patch for this change:

From 2e894e7d3f9ee4ec923ed6f23e13468f16b0fa6f Mon Sep 17 00:00:00 2001
From: "havoc-dlt" <havoc.dlt@gmail.com>
Date: Wed, 16 Aug 2023 09:49:12 +0200
Subject: [PATCH] esp32: fix fatal exception output in usb serial mode

---
 drivers/serial/serial_esp32_usb.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/serial/serial_esp32_usb.c b/drivers/serial/serial_esp32_usb.c
index 4a2fe24324..714034f1d6 100644
--- a/drivers/serial/serial_esp32_usb.c
+++ b/drivers/serial/serial_esp32_usb.c
@@ -68,10 +68,14 @@ static void serial_esp32_usb_poll_out(const struct device *dev, unsigned char c)
 	ARG_UNUSED(dev);
 
 	/* Wait for space in FIFO */
+	const int64_t timeout =
+	k_uptime_ticks() +
+	(int64_t)k_us_to_ticks_ceil64(USBSERIAL_TIMEOUT_MAX_US);
 	while (!usb_serial_jtag_ll_txfifo_writable() &&
-		s_usbserial_timeout < (USBSERIAL_TIMEOUT_MAX_US / 100)) {
-		k_usleep(100);
-		s_usbserial_timeout++;
+		k_uptime_ticks() < timeout) {
+		if (!k_is_in_isr()) {
+			k_usleep(100);
+		}
 	}
 
 	if (usb_serial_jtag_ll_txfifo_writable()) {
-- 
2.41.0

I might have some insight on this if nobody else already figured it out. In drivers/serial/serial_esp32_usb.c, function “serial_esp32_usb_poll_out”, there is a k_usleep call that gets blocked by an __ASSERT when this function gets called from an interrupt.

/* Wait for space in FIFO */
while (!usb_serial_jtag_ll_txfifo_writable() &&
		s_usbserial_timeout < (USBSERIAL_TIMEOUT_MAX_US / 100)) {
        k_usleep(100);
	s_usbserial_timeout++;
}

My workaround is this for now, which is ugly, because more power hungry obviously, but we’re trying to push out everything after a crash:

/* Wait for space in FIFO */
const int64_t timeout =
        k_uptime_ticks() +
        (int64_t)k_us_to_ticks_ceil64(USBSERIAL_TIMEOUT_MAX_US);
while (!usb_serial_jtag_ll_txfifo_writable() &&
                k_uptime_ticks() < timeout) {
        if (!k_is_in_isr()) {
            k_usleep(100);
        }
}

Seems to work for me at least.

@EricNRS, can you provide your shell sample code so I can reproduce the exactly same behavior?