zephyr: Crash in ull.c when stressing periodic advertising sync (scanner side)

Describe the bug With 10 periodic advertisers sending periodic advertisements at a fast rate, the scanner frequently hits various assertions:

ASSERTION FAIL [0] @ WEST_TOPDIR/zephyr/subsys/bluetooth/controller/ll_sw/ull.c:1195

Edit: I saw another crash using this app: ASSERTION FAIL [e] @ WEST_TOPDIR/zephyr/subsys/bluetooth/controller/ll_sw/nordic/lll/lll_sync.c:892

To Reproduce Steps to reproduce the behavior:

  1. Cherry-pick commit 06d7e47e58580b17e1d04ed94f0525209752cae8 (pull) unless it has already been merged. Without it the test application hangs every now and then. The crash occurs with or without the commit; it is just easier to reproduce with it because the test app does not hang.
  2. Build the attached code (essentially samples/bluetooth/direction_finding_connectionless_tx and samples/bluetooth/direction_finding_connectionless_rx). Note that for some reason CONFIG_BT_CTLR_DF_PER_SCAN_CTE_NUM_MAX=1 in prj.conf does not stick, so the build fails with a bss overflow. You need to manually change the value to 1 in the generated .config in the build folder and build again. I am not sure whether this is a bug or something odd in the prj.conf; maybe you can check.
  3. Flash one device with the rx code and 10+ with the tx code (fewer may work, but it will take longer).
  4. Wait for the receiver code to crash.

Expected behavior No crashes

Impact Moderate; it is related to periodic advertising only.

Logs and console output

Environment (please complete the following information):

  • OS: Windows
  • Toolchain: Zephyr SDK
  • Commit SHA or Version used: a08e6e1d9ad22920b7a08a366b3d66eaef7440f1 (with this pull cherry-picked)

Additional context Code to reproduce: reproduce.zip

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 48 (43 by maintainers)

Most upvoted comments

@saleh-unikie Thanks for the quick testing and for confirming that the assertion issue is resolved by the work in PR #41091.

I have been running some long-duration tests and discovered a couple more development assertions related to a race when terminate is initiated by the application. I have not pushed the latest changes yet; I will do so next week. I would then like you and @jakkra to each test them and report back if either of you is still facing issues.

I have not started analysis of #41465 yet, but hopefully I should be able to start on it in week 3.

@jakkra

Ah I see, I missed that. Maybe it could be moved from direction_finding_connectionless_rx/boards/nrf52833dk_nrf52833.conf to the prj.conf to avoid confusion like this.

The controller configuration is deliberately stored in boards/*.conf because on nRF53 it is part of the child image flashed to the NET core. Putting it in prj.conf would cause problems with building and flashing the application that goes to the APP core.
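As a rough sketch of what this means for the override from step 2: the option would be changed in the board-specific fragment rather than in prj.conf. The value that the sample ships with in this file is not stated in this thread, and the assumption here is that the board fragment is merged after prj.conf, so its value is the one that ends up in the generated .config.

```conf
# direction_finding_connectionless_rx/boards/nrf52833dk_nrf52833.conf
# Assumption: this board fragment is merged after prj.conf, so a value
# set here takes effect where the same line in prj.conf did not.
CONFIG_BT_CTLR_DF_PER_SCAN_CTE_NUM_MAX=1
```

After a rebuild, checking the generated .config in the build folder shows whether the value took effect.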

I’d be thankful if you could check the lowest number of DKs required. I have four nRF52 DKs and also two nRF53 DKs, so I can try to use those as well.