zephyr: Bluetooth: Controller: Restarting BLE scanning not always working and sometimes crashes together with periodic. adv.

Describe the bug There are 3 sub issues, all related to the same scenario hence I made one ticket. Write if you want it split.

  1. Restarting scanning with bt_le_scan_stop followed by bt_le_scan_start sometimes fails. bt_le_scan_start returns success, but scanning either not started or just a few events are received before it stops reporting anything from controller (maybe similar to https://github.com/zephyrproject-rtos/zephyr/issues/49915). Doing an additional disable/enable when this happens makes it working again.
  2. There is a crash in Scan disable...ASSERTION FAIL [!aux->parent] @ WEST_TOPDIR/zephyr/subsys/bluetooth/controller/ll_sw/ull_scan.c:647 sometimes when calling bt_le_scan_stop
  3. Sometimes bt_le_scan_stop returns -5 (I/O error) in the sample application, I don’t think it should.
  • What target platform are you using? nRF52833, Zephyr
  • Is this a regression? Not tested this scenario before.

To Reproduce Have 10+ tags sending per. adv. with 7.5ms interval and CTE enabled. Build and flash the application below. Then wait…

Application: aoa_receiver_multiple-gh-50432.zip

Relevant configs (see attached application for more info):

CONFIG_BT_CTLR_ADVANCED_FEATURES=y
CONFIG_BT_CTLR_SCAN_AUX_SET=2

CONFIG_BT_CTLR_SYNC_PERIODIC_SKIP_ON_SCAN_AUX=y

CONFIG_BT_CTLR_SCAN_AUX_SYNC_RESERVE_MIN=n

CONFIG_BT_TICKER_LOW_LAT=n
CONFIG_BT_TICKER_PRIORITY_SET=n
CONFIG_BT_CTLR_SCAN_UNRESERVED=n # Scan only when radio is free.

Expected behavior Scanning enable always works and no crash when scanning is turned off.

Impact An application cannot know if there are no devices in the area or if bt_le_scan_start did not work.

Logs and console output

  1. Log for scan enable not working:
...
Queued: 7 Synced: 4 Num err: 0/62
[00:00:03.413,879] <wrn> main: RESTARTING SCAN AND SYNC...
Scan disable...Success.
Start scanning...success
[00:00:03.617,706] <wrn> main: RESTARTED!
Scan is running...
Running...
TAGS:sync...,26,sync...,sync...,sync...,sync...,sync...,sync...,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,
Queued: 7 Synced: 1 Num err: 0/26
[00:00:04.618,286] <wrn> main: RESTARTING SCAN AND SYNC...
Scan disable...Success.
Scan is running...
Start scanning...success
[00:00:04.821,289] <wrn> main: RESTARTED!
Running...
TAGS:sync...,sync...,sync...,sync...,sync...,sync...,sync...,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,
Queued: 7 Synced: 0 Num err: 0/0
Running...
TAGS:sync...,sync...,sync...,sync...,sync...,sync...,sync...,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,
Queued: 7 Synced: 0 Num err: 0/0
... Nothing happens
  1. Log for the crash:
...
Queued: 11 Synced: 6 Num err: 0/79
Running...
TAGS:18,sync...,sync...,0,20,sync...,12,19,20,sync...,20,sync...,12,14,11,sync...,sync...,sync...,sync...,sync...,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,
Queued: 10 Synced: 10 Num err: 0/146
[00:01:33.819,915] <wrn> main: RESTARTING SCAN AND SYNC...
Scan disable...Success.
Start scanning...success
[00:01:34.024,810] <wrn> main: RESTARTED!
Scan is running...
Running...
TAGS:sync...,sync...,sync...,sync...,28,sync...,sync...,sync...,1,sync...,sync...,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,
Queued: 9 Synced: 2 Num err: 0/29
Scan is running...
Running...
TAGS:9,18,sync...,7,54,sync...,20,sync...,13,sync...,sync...,1,sync...,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,
Queued: 6 Synced: 7 Num err: 0/122
Running...
TAGS:16,16,14,10,17,sync...,15,10,8,sync...,1,16,2,sync...,sync...,4,sync...,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,
Queued: 5 Synced: 12 Num err: 0/131
[00:01:37.027,923] <wrn> main: RESTARTING SCAN AND SYNC...
Scan disable...ASSERTION FAIL [!aux->parent] @ WEST_TOPDIR/zephyr/subsys/bluetooth/controller/ll_sw/ull_scan.c:647
[00:01:37.028,411] <err> os: r0/a1:  0x00000003  r1/a2:  0x00000005  r2/a3:  0x00000004
[00:01:37.028,411] <err> os: r3/a4:  0x00000287 r12/ip:  0x200027ac r14/lr:  0x0000da55
[00:01:37.028,442] <err> os:  xpsr:  0x41000000
[00:01:37.028,442] <err> os: Faulting instruction address (r15/pc): 0x0000da60
[00:01:37.028,472] <err> os: >>> ZEPHYR FATAL ERROR 3: Kernel oops on CPU 0
[00:01:37.028,533] <err> os: Current thread: 0x20001c40 (BT TX)
[00:01:37.107,849] <err> os: Halting system

Environment (please complete the following information):

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 61 (61 by maintainers)

Most upvoted comments

@cvinayak I’m testing and so far looking good. I’m running our main application and so far not seen any crashes redgarding CONFIG_BT_CTLR_SCAN_AUX_SET=2 and no scan stopped. I’ll be testing more in the coming hours and will come back with results when done.

@cvinayak All seems good! I’m mostly testing so there are no regressions, have tests running over night. Thanks for nailing it down 😃

@jakkra hope your testing has not reproduce the scan stop in this issue? and this issue is closed.

@jakkra I am able to reproduce the issue (i remove my debug prints), I will continue my investigations: image

@cvinayak Sorry, yes of course interval_min, interval_max is 0x0020 => 20ms Per. adv. int. is 7.5ms.

@cvinayak Good news, I can now reproduce inside a shielded box with 7 tags. I made some changes. In the attached .zip you have the patch in aoa_receiver_multiple, .hex for your nrf52833-dk tags and a log.

What I changed:

  • Tags advertise an extra legacy adv. at 20ms.
  • Tags advertise ext. adv. at 20ms.
  • Receiver does not restart scanning and does not initialize any syncs.

gh-50432-22-09-28-15-08_no_sync.zip

With this setup I’m able to get scanning to stop often within 1 minute.

@cvinayak I see, I’ll grab a radio shielded box and play around with the transmitters and receiver and see if I can get a way to reproduce faster.

@jakkra I have been running for over an hour now… may be my advertisers the default samples/bluetooth/direction_finding_connectionless_tx are not exercising the problem on the receiver… I will switch to using your advertiser on my 7 boards. Will keep you posted.

20220928_2.log

@cvinayak Here is a log where it happend 3 times in a row, the first two ones I did not wait 10 min after scan stopped, but after the last reset I waited about 5 min. I also changed to print every 100 scan_recv instead of 500 if that may give more info with more prints. gh-50432-22-09-28-09.txt