zephyr: Occasional Spinlocks on zephyr 2.4.0 (ASSERTION FAIL [z_spin_lock_valid(l)] @ WEST_TOPDIR/zephyr/include/spinlock.h:92)

Describe the bug

We have an application based on an nRF52840. We see a z_spin_lock_valid assertion failure every few hours: ASSERTION FAIL [z_spin_lock_valid(l)] @ WEST_TOPDIR/zephyr/include/spinlock.h:92.

I haven’t found much information about what causes this kind of failure. The code runs on a measurement device. We run an external ADC at 10 kSamples/s and read its data through SPI DMA transfers. After every 30 samples (about 333 Hz) we trigger an interrupt handler, which copies the data into another buffer and then gives a semaphore. A kernel thread waits on this semaphore so it can grab the data from the buffer.

Code snippets:

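/* Forward declaration so K_THREAD_DEFINE can reference the entry point. */
static void Thread_DataStreaming(void *p1, void *p2, void *p3);

K_SEM_DEFINE(dataReadyToBeSend, 0, 1);
K_THREAD_DEFINE(DataStreaming, 10240, Thread_DataStreaming, NULL, NULL, NULL, 0, 0, 0);

void TIMER2_IRQHandler(nrf_timer_event_t event_type, void *p_context) {
    /* [Copy data from DMA buffer to buffer...] */
    k_sem_give(&dataReadyToBeSend);
}

/* Zephyr thread entry points take three void * arguments. */
static void Thread_DataStreaming(void *p1, void *p2, void *p3) {
    /* [..] */
    while (1) {
        semRet = k_sem_take(&dataReadyToBeSend, K_MSEC(1000));
        if (semRet == 0) {
            /* [..] */
        }
        else {
            /* [..] */
        }
    }
}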
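
One property of the snippet above is worth keeping in mind while debugging: the semaphore is defined with a limit of 1, and the handler fires roughly every 3 ms (30 samples at 10 kSamples/s). If the thread ever needs longer than that for one block, a second k_sem_give() finds the count already at its limit and the thread is never told about the extra block. The following is a minimal, self-contained model of that counting rule in plain C. The names sem_model, sem_model_give and sem_model_take are hypothetical stand-ins and not Zephyr APIs, and the model ignores waking a blocked thread. It is not claimed to be related to the assertion itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a counting semaphore with an upper limit.
 * This is NOT Zephyr's struct k_sem, only a model of its counting rule. */
struct sem_model {
    unsigned int count;
    unsigned int limit;
};

/* Give: increment the count unless it is already at the limit. */
static void sem_model_give(struct sem_model *s)
{
    if (s->count < s->limit) {
        s->count++;
    }
}

/* Non-blocking take: returns true if a token was available. */
static bool sem_model_take(struct sem_model *s)
{
    if (s->count == 0) {
        return false;
    }
    s->count--;
    return true;
}

/* Two "interrupts" fire before the "thread" gets to run.
 * Returns how many blocks the thread is notified about. */
static unsigned int blocks_seen_after_two_gives(void)
{
    /* Same initial count and limit as K_SEM_DEFINE(..., 0, 1). */
    struct sem_model s = { .count = 0, .limit = 1 };
    unsigned int seen = 0;

    sem_model_give(&s); /* block 1 ready */
    sem_model_give(&s); /* block 2 ready, but count is already at the limit */

    while (sem_model_take(&s)) {
        seen++;
    }
    assert(seen <= s.limit);
    return seen;
}
```

In this model the thread is notified about only one of the two blocks. Whether that matters for the real application depends on how the elided buffer handling works.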

Log:

[..]
ASSERTION FAIL [z_spin_lock_valid(l)] @ WEST_TOPDIR/zephyr/include/spinlock.h:92

Recursive spinl*** Booting Zephyr OS build zephyr-v2.4.0  ***
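
To make the assertion in the log easier to reason about, here is a simplified, self-contained model of what a spinlock validity check of this kind does. It assumes that a debug build records the holder in the lock word (zero meaning free, low two bits holding the CPU id) and rejects an acquire when the lock already appears to be held on the same CPU. The names lock_model, lock_model_valid, lock_model_acquire and lock_model_release are hypothetical, and the code is a sketch of the idea rather than a copy of Zephyr's source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a spinlock with holder tracking.
 * holder == 0 means free; otherwise the low two bits hold the CPU id
 * and the remaining bits hold a 4-byte-aligned tag for the owner. */
struct lock_model {
    uintptr_t holder;
};

/* An acquire is invalid if the lock already looks held on this CPU. */
static bool lock_model_valid(const struct lock_model *l, unsigned int cpu_id)
{
    if (l->holder != 0U && (l->holder & 3U) == cpu_id) {
        return false;
    }
    return true;
}

/* Returns false where the real kernel would raise the assertion. */
static bool lock_model_acquire(struct lock_model *l, unsigned int cpu_id,
                               uintptr_t owner_tag)
{
    if (!lock_model_valid(l, cpu_id)) {
        return false;
    }
    l->holder = owner_tag | cpu_id;
    return true;
}

static void lock_model_release(struct lock_model *l)
{
    l->holder = 0U;
}

/* Runs three scenarios on CPU 0 and counts the rejected acquires. */
static int count_failed_acquires(void)
{
    struct lock_model l = { 0 };
    int failures = 0;

    /* 1. Normal use: acquire, then release. */
    if (!lock_model_acquire(&l, 0U, 0x1000U)) {
        failures++;
    }
    lock_model_release(&l);

    /* 2. Nested use: a second acquire on the same CPU while held,
     *    for example from code that interrupted the holder. */
    (void)lock_model_acquire(&l, 0U, 0x1000U);
    if (!lock_model_acquire(&l, 0U, 0x2000U)) {
        failures++;
    }
    lock_model_release(&l);

    /* 3. Corrupted or uninitialized lock word: nobody holds the lock,
     *    but the word is nonzero and its low bits match this CPU. */
    l.holder = 0xdeadbee0U;
    if (!lock_model_acquire(&l, 0U, 0x1000U)) {
        failures++;
    }

    return failures;
}
```

Under these assumptions the check fails in two distinct ways: a genuine nested acquire on the same CPU, and a lock word that holds garbage. The second case means memory corruption or an uninitialized lock can produce the same message as real recursion.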

What is the best way to track this down further?

Environment:

  • OS: Ubuntu 20.04
  • Toolchain: zephyr-sdk-0.11.4
  • Zephyr Project (Tag 2.4.0)

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 21 (20 by maintainers)

Most upvoted comments

I refactored my initialization code. This made the crashes that occurred immediately after startup go away. It is now quite stable; however, I still see occasional spinlock crashes. I will have to investigate this further once I have more time.

My root cause was the interrupt controller feeding garbage to the GIC, which was then mishandled by Zephyr. A separate issue was filed for that one. I do believe there still might be an issue with the logging system not initializing a lock, but I have no resources available to give you a firm answer.

@carlescufi Sorry. Forgot about this. I believe my root cause was an issue with the memory controller that occasionally dropped writes. Fixed in hw.