zephyr: Bluetooth Ext Adv:Sync: While simultaneous advertiser are working, and skip is non-zero, sync terminates repeatedly
Describe the bug Using https://github.com/zephyrproject-rtos/zephyr/pull/41091 (force-pushed on 2 Feb), I tested 10 simultaneous advertisers (AoA tags) with one scanner (AoA locator). The source code is based on the “connection-less direction finding” examples, with simultaneous tag support added to the scanner code. The advertisers use an extended advertising interval of 80 ms and a legacy advertising interval of 150 ms. All devices are nRF52833 DK boards.
Successful Scenario: If the “skip” parameter on the receiver side is non-zero, synchronization is established successfully and the probability of losing the sync is low.
struct bt_le_per_adv_sync_param param;
param.skip = 1;
param.timeout = 2000;
...
bt_le_per_adv_sync_create(&param, synch);
The output log is something like this:
try to sync tag1
tag1 synced
try to sync tag2
tag2 synced
try to sync tag3
tag3 synced
try to sync tag4
tag4 synced
try to sync tag5
tag5 synced
try to sync tag6
tag6 synced
try to sync tag7
tag7 synced
try to sync tag8
tag8 synced
try to sync tag9
tag9 synced
try to sync tag10
tag10 synced
(after 30 minutes, no sync loss was seen)
Failure Scenario: If the “skip” parameter is set to zero, syncs are initially established normally, as in the case above, but after a while the scanner starts to lose sync with the tags, and it usually has to retry more than once to re-establish the sync. In most retries, the sync terminates immediately after the call to bt_le_per_adv_sync_create.
struct bt_le_per_adv_sync_param param;
param.skip = 0;
param.timeout = 2000;
...
bt_le_per_adv_sync_create(&param, synch);
The output log could be:
try to sync tag1
tag1 synced
try to sync tag2
tag2 synced
try to sync tag3
tag3 synced
try to sync tag4
tag4 synced
try to sync tag5
tag5 synced
try to sync tag6
tag6 synced
try to sync tag7
tag7 synced
try to sync tag8
tag8 synced
try to sync tag9
tag9 synced
try to sync tag10
tag10 synced
tag3 sync lost
try to sync tag3
tag3 sync lost
try to sync tag3
tag3 sync lost
try to sync tag3
tag3 sync lost
try to sync tag3
tag3 synced
(the above retries occur for other tags too and it is very common to lose the sync,
which makes the whole system functionality unstable)
In both cases the timeout value is the same. I assumed that the “skip” parameter could not affect sync termination, which is what I asked about in the Nordic DevZone, but it seems that it does. https://devzone.nordicsemi.com/f/nordic-q-a/84232/more-information-about-skip-and-timeout-parameters-of-ble-5-advertising
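For reference, here is a minimal sketch of how the two parameters relate in time. It assumes the usual units: “timeout” counts in steps of 10 ms, and “skip” is the maximum number of periodic advertising events the scanner may skip after a successful receive. The helper names and the periodic advertising interval are hypothetical; the interval is not stated in this report, so it is passed in as a parameter.

```c
#include <stdint.h>

/* Sync timeout is given in units of 10 ms; convert it to milliseconds. */
static uint32_t sync_timeout_ms(uint16_t timeout_units)
{
    return (uint32_t)timeout_units * 10U;
}

/*
 * Largest gap between two reception attempts after a successful receive,
 * when the scanner is allowed to skip up to `skip` periodic events.
 * per_adv_interval_ms is an assumed value, not taken from the report.
 */
static uint32_t listen_gap_ms(uint16_t skip, uint32_t per_adv_interval_ms)
{
    return ((uint32_t)skip + 1U) * per_adv_interval_ms;
}

/*
 * Number of periodic advertising events that fall inside the timeout
 * window. This does not depend on skip. Returns 0 for a zero interval.
 */
static uint32_t events_in_timeout(uint16_t timeout_units,
                                  uint32_t per_adv_interval_ms)
{
    if (per_adv_interval_ms == 0U) {
        return 0U;
    }
    return sync_timeout_ms(timeout_units) / per_adv_interval_ms;
}
```

With timeout = 2000 as used above, sync_timeout_ms(2000) is 20000 ms. For a hypothetical periodic interval of 100 ms that window covers 200 events, whether skip is 0 or 1, which is why I expected the same loss behavior in both scenarios.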
To Reproduce Steps to reproduce the behavior:
- use “connection-less direction finding” examples
- Add simultaneous tag support to the scanner source code
- west build -b nrf52833dk_nrf52833
- west flash
- See error
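The “simultaneous tag support” step can be sketched roughly as a small per-tag state table. This is not the actual scanner code: all type and function names here are hypothetical, a plain 6-byte array stands in for the Bluetooth address type, and the calls into the Bluetooth stack are left out so the bookkeeping can be read on its own.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_TAGS 10 /* ten tags were used in this test */

enum tag_state { TAG_IDLE, TAG_SYNCING, TAG_SYNCED };

struct tag_slot {
    uint8_t addr[6];      /* advertiser address (stand-in type) */
    enum tag_state state;
    uint32_t lost_count;  /* how many times sync to this tag was lost */
    bool used;
};

struct tag_table {
    struct tag_slot slots[MAX_TAGS];
};

static void tag_table_init(struct tag_table *t)
{
    memset(t, 0, sizeof(*t));
}

/* Return the slot for addr, allocating a free one if needed; NULL if full. */
static struct tag_slot *tag_get_or_add(struct tag_table *t,
                                       const uint8_t addr[6])
{
    struct tag_slot *free_slot = NULL;

    for (size_t i = 0; i < MAX_TAGS; i++) {
        struct tag_slot *s = &t->slots[i];

        if (s->used && memcmp(s->addr, addr, 6) == 0) {
            return s;
        }
        if (!s->used && free_slot == NULL) {
            free_slot = s;
        }
    }
    if (free_slot != NULL) {
        memcpy(free_slot->addr, addr, 6);
        free_slot->used = true;
        free_slot->state = TAG_IDLE;
        free_slot->lost_count = 0;
    }
    return free_slot;
}

/* "try to sync tagN": a sync attempt has been started for this tag. */
static void tag_on_sync_started(struct tag_slot *s) { s->state = TAG_SYNCING; }

/* "tagN synced" */
static void tag_on_synced(struct tag_slot *s) { s->state = TAG_SYNCED; }

/* "tagN sync lost": go back to idle so the scanner retries. */
static void tag_on_sync_lost(struct tag_slot *s)
{
    s->state = TAG_IDLE;
    s->lost_count++;
}

static bool tag_needs_sync(const struct tag_slot *s)
{
    return s->used && s->state == TAG_IDLE;
}
```

In the real scanner the three event handlers would be driven by the periodic sync callbacks, and tag_needs_sync would decide when to call bt_le_per_adv_sync_create again for a tag.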
Expected behavior Sync termination should depend only on the “timeout” parameter; changing “skip” should not change how often sync is lost between scanner and advertiser.
Impact The system is not stable when skip=0 and the scanner syncs to multiple advertisers.
Logs and console output As shown in the bug description above.
Environment (please complete the following information):
- OS: Linux
- Toolchain: Zephyr SDK
- Commit SHA or Version used: https://github.com/zephyrproject-rtos/zephyr/pull/41091/commits/9749514d83887d1dedde9966a0dc2a5d401aca97
Additional context In addition to the above problem, this assertion failure sometimes happens too:
ASSERTION FAIL [next] @ WEST_TOPDIR/zephyr/subsys/bluetooth/controller/ll_sw/nordic/lll/lll.c:649
About this issue
- State: closed
- Created 2 years ago
- Comments: 26 (22 by maintainers)
@saleh-unikie we will address this as soon as possible.
@saleh-unikie Thank you and appreciate the quick responses in helping me slowly resolve a few memory leaks and implementation defects on the way to here!
I have run your rx_multiple sample for over 8 hours today without terminates or assertions. I believe the terminate issue is resolved. I will keep an open mind on assertions, and will let my boards continue to maintain the sync over the weekend.
I have noticed that if you enable CONFIG_LOG=y in your rx multiple sample, the stack logs error preparing the CTE samples. We are looking into possible uninitialized values in host around the cte_types filtering. Will post PRs after further investigations.
Please close this PR, you can re-open or create new issue if you discover the original problem reported in this issue being not resolved.
Sure, I will do it tomorrow and will report here.
Good news! I’ve used these two projects for generating the above logs. Let me know if there was any problem or any help needed. Good luck!
Thank you, once again.
We are going in the right direction. (I believe the assertion could still happen, as I have done nothing to fix it; if it is a bug, it will recur.) I have switched my focus to reviewing the implementation when CTE is enabled.
(I wish I had 11 nRF52833 DKs)