zephyr: Intel CAVS: Failure in tests/lib/spsc_pbuf
Describe the bug
On the intel_adsp_cavs25 platform, tests/lib/spsc_pbuf/ failed.
Please also mention any information which could help others to understand the problem you’re facing:
- intel_adsp_cavs25
- tests/lib/spsc_pbuf: No Console Output (Timeout)
To Reproduce
Steps to reproduce the behavior:
- twister -W --hardware-map /home/ztest/cavs/cavs.map --device-testing -x=CONFIG_BOOT_DELAY=500 -T tests/lib/spsc_pbuf/ --no-skipped-report -vv
- See error
Logs and console output
START - test_stress
ASSERTION FAIL [0] @ WEST_TOPDIR/zephyr/kernel/sched.c:1764
aborted _current back from dead
E: ** FATAL EXCEPTION
E: ** CPU 0 EXCCAUSE 63 (zephyr exception)
E: ** PC 0xbe0124ee VADDR (nil)
E: ** PS 0x60b20
E: ** (INTLEVEL:0 EXCM: 0 UM:1 RING:0 WOE:1 OWB:11 CALLINC:2)
E: ** A0 0xbe0120b8 SP 0xbe01cf00 A2 0x4 A3 0x9e022b5c
E: ** A4 0x1 A5 0x60b20 A6 0x1f A7 0x1
E: ** A8 0x1 A9 (nil) A10 0x1 A11 0x9e022a58
E: ** A12 0x9e0224c8 A13 0x21 A14 0xbe01cef0 A15 0x4
E: ** LBEG 0xbe013071 LEND 0xbe013087 LCOUNT (nil)
E: ** SAR 0x4
Backtrace:0xBE0124EB:0xBE01CF00 0xBE0120B5:0xBE01CF10 0xBE016FF8:0xBE01CF30 0xBE011773:0xBE01CF70
E: >>> ZEPHYR FATAL ERROR 4: Kernel panic on CPU 0
E: Current thread: 0x9e0224c8 (ztress_0)
E: Halting system
0% remaining:4000 ms
0% remaining:2999 ms
0% remaining:1999 ms
0% remaining:999 ms
0% remaining:0 ms
0% remaining:0 ms
0% remaining:0 ms
0% remaining:0 ms
0% remaining:0 ms
0% remaining:0 ms
0% remaining:0 ms
0% remaining:0 ms
0% remaining:0 ms
0% remaining:0 ms
0% remaining:0 ms
0% remaining:0 ms
0% remaining:0 ms
0% remaining:0 ms
Environment (please complete the following information):
- OS: Linux
- Toolchain: SDK 14.1
- Commit SHA: fa055f743fd870809ed2a22dc41f6fb34a6eea46
About this issue
- State: closed
- Created 2 years ago
- Comments: 24 (3 by maintainers)
Xtensa tests on intel_adsp have been dealing with fixes like this consistently since KERNEL_COHERENCE landed, actually. It’s, heh, just the staff that changed. 😃
But yes, the rules are: when enabled (which only happens now for SMP Xtensa platforms, but in principle other cache-incoherent architectures might behave similarly), thread stack memory is cached/incoherent and must be treated as local to the currently executing CPU. The kernel will handle flushing appropriately for you when the thread context switches. In general you shouldn’t share it with other threads at all, but it’s possible to do so as long as you pad and align your data to a full cache line (which is variable-sized, check core-isa.h, though right now all affected devices are 64 bytes) and use the Xtensa cache API (don’t use the HAL, use our code, it’s better) to flush changes and invalidate before use. There is automatic support in the kernel that detects obvious mistakes like putting spinlocks or waitq’s (i.e. typical IPC primitives) on the stack.
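To make the "pad and align, flush after writing, invalidate before reading" rule concrete, here is a minimal sketch. It assumes a 64-byte cache line (on real hardware, take the value from core-isa.h), and `cache_flush()` / `cache_invalidate()` are hypothetical stand-ins for the platform cache API, written as no-ops so the example builds on a coherent host. `struct shared_counter` and its helpers are likewise made up for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte data cache lines. Check core-isa.h on a real target. */
#define CACHE_LINE_SIZE 64

/* Hypothetical stand-ins for the platform cache API (no-ops on a coherent host). */
static void cache_flush(void *addr, size_t size) { (void)addr; (void)size; }
static void cache_invalidate(void *addr, size_t size) { (void)addr; (void)size; }

/*
 * Aligning the first member to a cache line aligns the whole struct and
 * makes the compiler round sizeof() up to a multiple of the line size,
 * so no unrelated data can share a cache line with this object.
 */
struct shared_counter {
	alignas(CACHE_LINE_SIZE) uint32_t value;
};

static_assert(sizeof(struct shared_counter) % CACHE_LINE_SIZE == 0,
	      "shared data must occupy whole cache lines");
static_assert(alignof(struct shared_counter) == CACHE_LINE_SIZE,
	      "shared data must start on a cache line boundary");

/* Writer side: update the data, then flush it out of the local cache. */
static void shared_counter_publish(struct shared_counter *c, uint32_t v)
{
	c->value = v;
	cache_flush(c, sizeof(*c));
}

/* Reader side: drop any stale cached copy before reading. */
static uint32_t shared_counter_read(struct shared_counter *c)
{
	cache_invalidate(c, sizeof(*c));
	return c->value;
}
```

The two static assertions are the part worth copying: they turn an accidental layout change that would let another variable land in the same cache line into a build failure.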
Stack is not made coherent - that’s the whole issue. Note that while the config option is called “kernel_coherence”, it’s actually there to deal with incoherent cache: https://docs.zephyrproject.org/latest/kconfig.html#CONFIG_KERNEL_COHERENCE.
It’s illegal to share data on the stack among the CPUs, as stack is in the cache (which is incoherent).
.bss section is put in uncached memory and thus doesn't suffer from the incoherence issue.
Hmm, I think that this is falling into the cache incoherence trap. And based on some fixes for this issue (like 4796037cf721c6f6416d446e99ec02521542b949), it seems that having static data in the ZTRESS_EXECUTE macro is the way to go. Can you send a PR for that?
@smrtos 0cpy means a special mode where data is not copied into the buffer; instead, space within the buffer is allocated first, data is written directly to that space, and then the buffer is committed.
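For readers who want to see the difference between copy mode and the allocate/write/commit sequence, a rough sketch follows. This is a toy, not Zephyr's spsc_pbuf: `struct toy_pbuf` and the `toy_pbuf_*` functions are hypothetical names, and the buffer is linear with no wrap-around and no consumer side, just enough to show the order of operations.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PBUF_SIZE 64

/* Hypothetical toy buffer: linear, no wrap-around, single producer. */
struct toy_pbuf {
	uint8_t data[TOY_PBUF_SIZE];
	size_t committed; /* number of bytes visible to the consumer */
};

/* Copy mode: the caller's data is copied into the buffer and published. */
static int toy_pbuf_write(struct toy_pbuf *pb, const void *src, size_t len)
{
	if (len > TOY_PBUF_SIZE - pb->committed) {
		return -1; /* not enough space */
	}
	memcpy(&pb->data[pb->committed], src, len);
	pb->committed += len;
	return 0;
}

/*
 * 0cpy step 1: hand out space inside the buffer. Nothing is published yet,
 * so the consumer cannot see a half-written packet.
 */
static uint8_t *toy_pbuf_alloc(struct toy_pbuf *pb, size_t len)
{
	if (len > TOY_PBUF_SIZE - pb->committed) {
		return NULL; /* not enough space */
	}
	return &pb->data[pb->committed];
}

/* 0cpy step 2: after writing into the allocated space, publish len bytes. */
static void toy_pbuf_commit(struct toy_pbuf *pb, size_t len)
{
	pb->committed += len;
}
```

In the 0cpy path the producer fills the pointer returned by `toy_pbuf_alloc()` in place and then calls `toy_pbuf_commit()`, which saves the intermediate copy that `toy_pbuf_write()` makes.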