zephyr: entropy_stm32_rng_get_entropy sometimes blocks forever in Zephyr 3.6.0

Target platform: STM32H743, Zephyr 3.6.0

Our application uses the Zephyr MQTT client. Since upgrading to 3.6.0, the MQTT thread sometimes hangs when calling mqtt_connect. Using a debugger I can see that the thread is stuck waiting on entropy_stm32_rng_data.sem_sync.

The problem happens quite reliably: the application retries mqtt_connect every 500 ms and almost always gets stuck within the first 10 attempts. It seems to happen the first time sys_rand32_get is called while the entropy pool is empty (usually during zsock_connect, in a call made from find_available_port, tcp_derive_rto or tcp_init_isn).
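To illustrate why an empty pool turns into a permanent hang, here is a small host-side toy model. It is not the driver's code: struct model_pool, model_isr_refill and model_get_entropy are hypothetical names, and the max_waits bound exists only so the model terminates (the stuck thread in the report waits on sem_sync with no such bound). The assumption is that a thread-context read drains a byte pool and, once the pool is empty, can only make progress when the ISR refills it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model. Not the Zephyr driver's actual code. */
struct model_pool {
    uint8_t buf[8];
    size_t count;      /* bytes currently available in the pool */
    bool isr_running;  /* false models the state where the RNG ISR never fires */
};

/* Models the ISR topping up the pool (placeholder bytes, not random data). */
void model_isr_refill(struct model_pool *p)
{
    while (p->count < sizeof(p->buf)) {
        p->buf[p->count] = (uint8_t)(0xA5u ^ p->count);
        p->count++;
    }
}

/*
 * Models a thread-context read. Returns the number of bytes delivered.
 * When the pool is empty the caller "waits" for the ISR; max_waits bounds
 * that wait so the model terminates instead of blocking forever.
 */
size_t model_get_entropy(struct model_pool *p, uint8_t *out, size_t len,
                         unsigned max_waits)
{
    size_t got = 0;
    unsigned waits = 0;

    while (got < len) {
        if (p->count > 0) {
            out[got++] = p->buf[--p->count];
            continue;
        }
        /* Pool empty: progress now depends entirely on the ISR. */
        if (waits++ >= max_waits) {
            break;
        }
        if (p->isr_running) {
            model_isr_refill(p);
        }
    }
    return got;
}
```

With two bytes left in the pool and isr_running set to false, a 4-byte request delivers 2 bytes and then stalls. With isr_running set to true, the same request completes.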

The problem is sensitive to timing: adding debug output can make it go away.

A manual git bisect (of just drivers/entropy/entropy_stm32.c) showed that the problem appeared in commit 78af1c1.

Commenting out the bodies of the entropy_stm32_suspend and entropy_stm32_resume functions makes the problem go away.

In the debugger, stm32_rng_isr does not appear to be called once the problem occurs. I can see that the IRQ is enabled, that the RNG CR register has the IE bit set, and that the SR register has DRDY set.
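For anyone wanting to repeat the register check, here is a minimal host-side sketch that decodes raw CR and SR values read from the debugger. The bit positions are assumptions taken from my reading of the STM32H7 reference manual (RNGEN = bit 2 and IE = bit 3 in CR; DRDY = bit 0, CEIS = bit 5 and SEIS = bit 6 in SR), so please verify them for your part. The MODEL_ macros and rng_irq_condition_asserted are hypothetical names, not Zephyr or HAL symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions (STM32H7 RNG); verify against the reference manual. */
#define MODEL_RNG_CR_RNGEN (1u << 2)  /* RNG enabled */
#define MODEL_RNG_CR_IE    (1u << 3)  /* interrupt enable */
#define MODEL_RNG_SR_DRDY  (1u << 0)  /* data ready */
#define MODEL_RNG_SR_CEIS  (1u << 5)  /* clock error interrupt status */
#define MODEL_RNG_SR_SEIS  (1u << 6)  /* seed error interrupt status */

/*
 * Returns true when the peripheral itself should be requesting an interrupt:
 * interrupts are enabled in CR and at least one interrupt source is set in SR.
 */
bool rng_irq_condition_asserted(uint32_t cr, uint32_t sr)
{
    bool irq_enabled = (cr & MODEL_RNG_CR_IE) != 0;
    bool event_set = (sr & (MODEL_RNG_SR_DRDY |
                            MODEL_RNG_SR_CEIS |
                            MODEL_RNG_SR_SEIS)) != 0;

    return irq_enabled && event_set;
}
```

For example, CR = 0x0C (RNGEN and IE set) with SR = 0x01 (DRDY set) returns true, which matches the state described above. If this returns true while the ISR never runs, the peripheral side looks healthy and the next place to look is probably the NVIC: on target, the CMSIS calls NVIC_GetEnableIRQ and NVIC_GetPendingIRQ show whether the line is enabled and pending there.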

Impact

This breaks networking in our application. Once mqtt_connect is stuck we can’t connect with telnet or send commands with mcumgr (UDP). The only way to recover is to reboot the MCU.

About this issue

  • State: closed
  • Created 3 months ago
  • Comments: 31 (30 by maintainers)

Most upvoted comments

You would need to test it anyway, so I think it makes sense that you send a PR. Is that okay?

No problem and thanks for all the help with this.

Would you be able to check whether the function generate_from_isr is ever called?

I just ran a quick test and I never see that function being called.