zephyr: acrn_ehl_crb: test case of arch.interrupt.prevent_interruption failed
Describe the bug Test case arch.interrupt.prevent_interruption fails because it hangs. (It hangs in the call to k_busy_wait() while IRQs are locked.)
To Reproduce Steps to reproduce the behavior:
- Make sure BIOS configuration is correct for ACRN, and include/arch/x86/memory.ld line 44, LOCORE_BASE modified from 0x1000 to 0x8000.
- Run this command from the zephyrproject directory: twister -p acrn_ehl_crb --device-testing --device-serial-pty="/opt/remotehw/acrn-test-pty.exp,ehlsku11" --west-flash="/opt/remotehw/remotehw-x86-acrn.sh,ehlsku11" -vv -T tests/kernel/interrupt
- Check handler.log in the twister-out directory to see the error log.
Expected behavior The test case passes.
Impact ACRN on the ehl_crb platform has a slightly lower pass rate.
Logs and console output
The error log looks like this:
*** Booting Zephyr OS build zephyr-v2.5.0-484-g23f699bd907f ***
Running test suite interrupt_feature
===================================================================
START - test_isr_dynamic
SKIP - test_isr_dynamic
===================================================================
START - test_nested_isr
SKIP - test_nested_isr
===================================================================
START - test_prevent_interruption
locking interrupts
It hangs up after printing out “locking interrupts”.
UPDATE - The test progresses with the use of newer APIC timer driver (APIC TSC Deadline timer driver): #32603.
DEBUG - DEVICE: START - test_prevent_interruption
DEBUG - DEVICE: locking interrupts
DEBUG - DEVICE: unlocking interrupts
DEBUG - DEVICE:
DEBUG - DEVICE: Assertion failed at WEST_TOPDIR/zephyr/tests/kernel/interrupt/src/prevent_irq.c:58: test_prevent_interruption: (handler_result not equal to HANDLER_TOKEN)
DEBUG - DEVICE: timer should have fired
DEBUG - DEVICE: FAIL - test_prevent_interruption in 2.2 seconds
The expected output is:
===================================================================
START - test_prevent_interruption
locking interrupts
unlocking interrupts
timer fired
PASS - test_prevent_interruption
===================================================================
Test suite interrupt_feature succeeded
===================================================================
PROJECT EXECUTION SUCCESSFUL
Environment (please complete the following information):
- OS: Linux
- Toolchain: zephyr-sdk-0.12.2
- Commit SHA : 2.5 Release bump (fe7c2ef), ACRN version: acrn-2020w42.1-180000p (5289a3eb9)
UPDATE -
- Toolchain: zephyr-sdk-0.12.4
- Commit SHA : 2.6 release branch
About this issue
- State: closed
- Created 3 years ago
- Comments: 40 (2 by maintainers)
Are assertions not enabled for this build? Something should have bailed when you tried to sleep with interrupts locked. That’s just not legal. What is happening (illegally; again, I’m all but certain this is an ASSERT somewhere) is that you’re entering sleep and context switching away, which then enables interrupts when the next thread (the idle thread in this case) gets to run. So the timer fires, incorrectly. That’s the reason the test chose to busy wait to verify that the interrupt did not fire: there’s no other option in this context.
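To make the point above concrete, here is a minimal host-side model of the behavior described, not Zephyr code. Everything in it (`struct sim_cpu`, the `sim_*` helpers, `fired_while_locked()`) is hypothetical and only mimics the described semantics: a timer that expires while interrupts are locked stays pending, and a sleep that context-switches away effectively lets the interrupt through.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU's interrupt state (not a Zephyr type). */
struct sim_cpu {
    bool irq_locked;     /* true while interrupts are masked */
    bool timer_pending;  /* timer expired but not yet delivered */
    bool handler_ran;    /* set when the timer handler executes */
};

/* Deliver a pending timer interrupt only if interrupts are enabled. */
static void sim_deliver(struct sim_cpu *cpu)
{
    if (!cpu->irq_locked && cpu->timer_pending) {
        cpu->timer_pending = false;
        cpu->handler_ran = true;
    }
}

static void sim_timer_expire(struct sim_cpu *cpu)
{
    cpu->timer_pending = true;
    sim_deliver(cpu);
}

static void sim_irq_lock(struct sim_cpu *cpu)
{
    cpu->irq_locked = true;
}

static void sim_irq_unlock(struct sim_cpu *cpu)
{
    cpu->irq_locked = false;
    sim_deliver(cpu);
}

/*
 * Assumption taken from the comment above: sleeping switches to another
 * thread, which runs with interrupts enabled, so a pending interrupt fires.
 */
static void sim_sleep(struct sim_cpu *cpu)
{
    bool saved = cpu->irq_locked;

    cpu->irq_locked = false;   /* the next thread runs unmasked */
    sim_deliver(cpu);
    cpu->irq_locked = saved;   /* back in the original thread */
}

/* Returns true if the handler ran before interrupts were unlocked. */
static bool fired_while_locked(bool use_sleep)
{
    struct sim_cpu cpu = { false, false, false };
    bool early;

    sim_irq_lock(&cpu);
    sim_timer_expire(&cpu);    /* expires while locked: stays pending */
    if (use_sleep) {
        sim_sleep(&cpu);       /* waiting by sleeping instead of spinning */
    }
    early = cpu.handler_ran;
    sim_irq_unlock(&cpu);
    assert(cpu.handler_ran);   /* after unlock the handler has always run */
    return early;
}
```

In this model `fired_while_locked(false)` (the busy-wait case) reports that the handler stayed quiet until the unlock, while `fired_while_locked(true)` reports that it ran early, which is why a sleep cannot be used to check that the interrupt was held off.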
If this is hung in k_busy_wait() it should be easy enough to figure out where. A quick build shows that this is using drivers/timer/apic_timer.c as the timer driver, with CONFIG_APIC_TIMER_TSC unset. So it looks like it’s reading the APIC CCR register to get the time. I’m guessing ACRN is failing to advance that register when interrupts are masked, or otherwise making a mistake.
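A rough sketch of why a counter that stops advancing would turn a busy wait into a hang. This is not the Zephyr implementation of k_busy_wait() or of the APIC driver; `busy_wait_ticks()`, `read_counter_fn`, and the two simulated counters are hypothetical, and the counter is treated as a plain 32-bit down-counter, ignoring the reload behavior of a real APIC timer. The poll limit exists only so the stuck case can be observed instead of spinning forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counter source; stands in for a read of the APIC CCR. */
typedef uint32_t (*read_counter_fn)(void);

/*
 * Spin until `ticks` ticks have elapsed on a down-counting counter.
 * Gives up after `max_polls` reads and returns false, which corresponds
 * to the hang seen when the counter never advances.
 */
static bool busy_wait_ticks(read_counter_fn read_counter, uint32_t ticks,
                            uint32_t max_polls)
{
    uint32_t start;

    assert(read_counter != NULL);
    start = read_counter();

    for (uint32_t i = 0; i < max_polls; i++) {
        /* Down-counter: elapsed = start - now (modulo 2^32). */
        uint32_t elapsed = start - read_counter();

        if (elapsed >= ticks) {
            return true;
        }
    }
    return false;
}

/* Simulated counter that decreases by 10 on every read. */
static uint32_t sim_value = 1000000u;

static uint32_t advancing_counter(void)
{
    sim_value -= 10u;
    return sim_value;
}

/* Simulated counter that never moves, as suspected under ACRN. */
static uint32_t frozen_counter(void)
{
    return 12345u;
}
```

With `advancing_counter` the wait completes after a bounded number of reads; with `frozen_counter` the elapsed value stays at zero and an unbounded loop would never return.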
Bisect resulted in the following as the first bad commit: c2c6bee036bb33eabdabd0587d0c4392f4e239f6 drivers/timer: Remove legacy APIC driver. Seems relevant, I’ll keep looking into it. Edit: Testing at this point, before the acrn ehl crb definitions, this APIC driver change also requires a local change to CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC=1900000000. Still no success resolving the issue, though. Let’s try some good old prints to learn more about the current status.
I also tried walking the SDK version back a few releases just in case, but that had no effect. Still investigating.
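For context on why the CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC value matters here: a busy wait of a given duration is typically converted into a cycle count using the configured frequency, so a wrong value skews every such wait. The helper below is a hypothetical illustration of that arithmetic, not the kernel's conversion routine, and the 1000 us duration in the note after it is an arbitrary example rather than the value the test uses.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a duration in microseconds to hardware cycles for a given
 * clock frequency. 64-bit math avoids overflow at GHz-range rates.
 */
static uint64_t us_to_cycles(uint64_t usec, uint64_t cycles_per_sec)
{
    assert(cycles_per_sec > 0);
    return (usec * cycles_per_sec) / 1000000u;
}
```

With a frequency of 1900000000 cycles per second, `us_to_cycles(1000, 1900000000)` gives 1900000 cycles, that is, 1900 cycles per microsecond.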
@enjiamai np, was asking jic. Thanks for your reply! I can work on bisecting this.