zephyr: SPI devices fail after H7 SPI FIFO introduction
The SPI devices started to fail intermittently on STM32H730 after #63173 was merged; the discussion of this problem started in that PR.
The SPI driver gets stuck in spi_stm32_complete after data reception due to ll_func_spi_is_busy being true all the time.
Different SPI peripherals fail at different times. To trigger the issue the board has to be restarted multiple times. When it gets stuck it is always within spi_stm32_complete. Sometimes it works, but then devices fail due to corruption of transmitted and/or received data.
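While debugging a hang like this, it can help to bound the wait so that a wedged peripheral produces an error instead of freezing the board. The following is a minimal host-side sketch of that idea, not the driver's code: `wait_not_busy`, `busy_fn_t`, `always_busy` and `busy_countdown` are hypothetical names introduced only for illustration, and the predicate stands in for whatever busy check the driver uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical predicate type: returns true while the peripheral is busy. */
typedef bool (*busy_fn_t)(void *ctx);

/*
 * Poll `is_busy` at most `max_polls` times.
 * Returns 0 once the peripheral reports idle, or -1 if it never does.
 */
static int wait_not_busy(busy_fn_t is_busy, void *ctx, uint32_t max_polls)
{
	assert(is_busy != NULL);

	for (uint32_t i = 0; i < max_polls; i++) {
		if (!is_busy(ctx)) {
			return 0;
		}
	}
	return -1; /* still busy: report it instead of hanging */
}

/* Stand-in for a wedged peripheral: the busy check never clears. */
static bool always_busy(void *ctx)
{
	(void)ctx;
	return true;
}

/* Stand-in for a healthy peripheral: busy for *ctx more polls, then idle. */
static bool busy_countdown(void *ctx)
{
	uint32_t *remaining = ctx;

	if (*remaining == 0) {
		return false;
	}
	(*remaining)--;
	return true;
}
```

With `always_busy` the helper gives up and returns -1, which is the point where register values could be logged; with `busy_countdown` it returns 0 as soon as the stand-in reports idle.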
CR2 reads 0, so SPI_CR2_TSIZE is 0 as well. SR (checked in LL_SPI_IsActiveFlag_TXC) is usually 0, although in some cases it was 2. In both cases the TXC bit is 0, so ll_func_spi_is_busy returns true and the driver loops forever. At that stage nothing happens on the SPI lines. In the context, the rx/tx counts and lengths are 0, and the rx/tx buffer pointers are 0x0.
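To make the two observed SR values concrete, here is a small host-side model of the check described above. It is a sketch under assumptions: the bit positions (TXP at bit 1, TXC at bit 12) are my reading of the STM32H7 SPI_SR layout and should be verified against the reference manual, and `model_spi_is_busy` and `sr_flag_set` are hypothetical names, not the driver's functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions assumed from the STM32H7 SPI_SR layout; verify before use. */
#define SR_TXP (1u << 1)   /* TX FIFO can accept another packet */
#define SR_TXC (1u << 12)  /* transmission complete */

/* Returns true if every bit in `mask` is set in the SR snapshot `sr`. */
static bool sr_flag_set(uint32_t sr, uint32_t mask)
{
	assert(mask != 0u);
	return (sr & mask) == mask;
}

/*
 * Hypothetical model of the busy check as described in this report:
 * the peripheral counts as busy for as long as TXC is clear.
 */
static bool model_spi_is_busy(uint32_t sr)
{
	return !sr_flag_set(sr, SR_TXC);
}
```

Under these assumptions, both observed snapshots (0x0 and 0x2) are reported as busy, and 0x2 would mean that only TXP is set. A snapshot with bit 12 set would be the first one to end the loop.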
After reducing the clock, the issue started to appear every time. With a higher clock, the board had to be restarted to trigger it. I reduced the setup to one device per SPI to rule out problems caused by having multiple devices.
SPI without FIFO, the device works:
SPI with FIFO, the device doesn’t work and it is stuck as described:
It is clearly visible that the CS (chip select) is dramatically off. It is software controlled by the driver. Most probably that is where the problems are coming from.
Impact: #63173 breaks previously working SPI. At least on H730.
Environment:
- Toolchain: arm gnu toolchain 12.3 REL1
- Zephyr version is pretty much up to date: f69641f7d204864aa26f8bdd9fecab259e535da2.
Additional context: It is a custom device based on H730. Multiple SPIs are used, each with multiple targets (slaves). I still have to check whether the existing loopback test would detect the problem.
About this issue
- State: closed
- Created 7 months ago
- Comments: 28 (13 by maintainers)
Yes, TSIZE and EOT.
@erwango Yes, I will as soon as I can.
@erwango I ordered a Nucleo H723ZG. Hopefully the issue is reproducible there (it is the closest MCU to the H730 on a publicly available board).
Running the “test” code above and observing the signals shows that something is definitely off. Or am I using the API incorrectly? Unfortunately, from the test's perspective, the contents of both buffers are going to be equal, so the equality check will pass.
@MaureenHelm Hi!
In response to Maureen’s priority: low, I’d like to add that for me the SPI has been unusable since the FIFO merge. It gets stuck where the issue describes, not sometimes but on every single boot. It freezes. I have to set spi_stm32_can_use_fifo() to false to get it to work properly. In addition, I regularly get SPI overrun errors in TX-only mode.
My point is, the priority doesn’t feel low, with all due respect and all that. Thanks!
Using STM32H743BIT6