zephyr: Bluetooth: Controller: Restarting BLE scanning not always working and sometimes crashes together with periodic. adv.
Describe the bug There are 3 sub issues, all related to the same scenario hence I made one ticket. Write if you want it split.
- Restarting scanning with
bt_le_scan_stopfollowed bybt_le_scan_startsometimes fails.bt_le_scan_startreturns success, but scanning either not started or just a few events are received before it stops reporting anything from controller (maybe similar to https://github.com/zephyrproject-rtos/zephyr/issues/49915). Doing an additional disable/enable when this happens makes it working again. - There is a crash in
Scan disable...ASSERTION FAIL [!aux->parent] @ WEST_TOPDIR/zephyr/subsys/bluetooth/controller/ll_sw/ull_scan.c:647sometimes when callingbt_le_scan_stop - Sometimes
bt_le_scan_stopreturns -5 (I/O error) in the sample application, I don’t think it should.
- What target platform are you using? nRF52833, Zephyr
- Is this a regression? Not tested this scenario before.
To Reproduce Have 10+ tags sending per. adv. with 7.5ms interval and CTE enabled. Build and flash the application below. Then wait…
Application: aoa_receiver_multiple-gh-50432.zip
Relevant configs (see attached application for more info):
CONFIG_BT_CTLR_ADVANCED_FEATURES=y
CONFIG_BT_CTLR_SCAN_AUX_SET=2
CONFIG_BT_CTLR_SYNC_PERIODIC_SKIP_ON_SCAN_AUX=y
CONFIG_BT_CTLR_SCAN_AUX_SYNC_RESERVE_MIN=n
CONFIG_BT_TICKER_LOW_LAT=n
CONFIG_BT_TICKER_PRIORITY_SET=n
CONFIG_BT_CTLR_SCAN_UNRESERVED=n # Scan only when radio is free.
Expected behavior Scanning enable always works and no crash when scanning is turned off.
Impact
An application cannot know if there are no devices in the area or if bt_le_scan_start did not work.
Logs and console output
- Log for scan enable not working:
...
Queued: 7 Synced: 4 Num err: 0/62
[00:00:03.413,879] <wrn> main: RESTARTING SCAN AND SYNC...
Scan disable...Success.
Start scanning...success
[00:00:03.617,706] <wrn> main: RESTARTED!
Scan is running...
Running...
TAGS:sync...,26,sync...,sync...,sync...,sync...,sync...,sync...,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,
Queued: 7 Synced: 1 Num err: 0/26
[00:00:04.618,286] <wrn> main: RESTARTING SCAN AND SYNC...
Scan disable...Success.
Scan is running...
Start scanning...success
[00:00:04.821,289] <wrn> main: RESTARTED!
Running...
TAGS:sync...,sync...,sync...,sync...,sync...,sync...,sync...,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,
Queued: 7 Synced: 0 Num err: 0/0
Running...
TAGS:sync...,sync...,sync...,sync...,sync...,sync...,sync...,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,
Queued: 7 Synced: 0 Num err: 0/0
... Nothing happens
- Log for the crash:
...
Queued: 11 Synced: 6 Num err: 0/79
Running...
TAGS:18,sync...,sync...,0,20,sync...,12,19,20,sync...,20,sync...,12,14,11,sync...,sync...,sync...,sync...,sync...,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,
Queued: 10 Synced: 10 Num err: 0/146
[00:01:33.819,915] <wrn> main: RESTARTING SCAN AND SYNC...
Scan disable...Success.
Start scanning...success
[00:01:34.024,810] <wrn> main: RESTARTED!
Scan is running...
Running...
TAGS:sync...,sync...,sync...,sync...,28,sync...,sync...,sync...,1,sync...,sync...,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,
Queued: 9 Synced: 2 Num err: 0/29
Scan is running...
Running...
TAGS:9,18,sync...,7,54,sync...,20,sync...,13,sync...,sync...,1,sync...,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,
Queued: 6 Synced: 7 Num err: 0/122
Running...
TAGS:16,16,14,10,17,sync...,15,10,8,sync...,1,16,2,sync...,sync...,4,sync...,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-,
Queued: 5 Synced: 12 Num err: 0/131
[00:01:37.027,923] <wrn> main: RESTARTING SCAN AND SYNC...
Scan disable...ASSERTION FAIL [!aux->parent] @ WEST_TOPDIR/zephyr/subsys/bluetooth/controller/ll_sw/ull_scan.c:647
[00:01:37.028,411] <err> os: r0/a1: 0x00000003 r1/a2: 0x00000005 r2/a3: 0x00000004
[00:01:37.028,411] <err> os: r3/a4: 0x00000287 r12/ip: 0x200027ac r14/lr: 0x0000da55
[00:01:37.028,442] <err> os: xpsr: 0x41000000
[00:01:37.028,442] <err> os: Faulting instruction address (r15/pc): 0x0000da60
[00:01:37.028,472] <err> os: >>> ZEPHYR FATAL ERROR 3: Kernel oops on CPU 0
[00:01:37.028,533] <err> os: Current thread: 0x20001c40 (BT TX)
[00:01:37.107,849] <err> os: Halting system
Environment (please complete the following information):
- OS: Windows
- Toolchain: Zephyr
- Commit SHA or Version used: (main as of now) e1cb0845b49b6a4100c9e5558d37667b92f0d000 with 894423e9aed34a394b97c995c33549513b0ac4a5
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 61 (61 by maintainers)
@cvinayak I’m testing and so far looking good. I’m running our main application and so far not seen any crashes redgarding
CONFIG_BT_CTLR_SCAN_AUX_SET=2and no scan stopped. I’ll be testing more in the coming hours and will come back with results when done.@cvinayak All seems good! I’m mostly testing so there are no regressions, have tests running over night. Thanks for nailing it down 😃
@jakkra hope your testing has not reproduce the scan stop in this issue? and this issue is closed.
@jakkra I am able to reproduce the issue (i remove my debug prints), I will continue my investigations:
@cvinayak Sorry, yes of course
interval_min, interval_maxis 0x0020 => 20ms Per. adv. int. is 7.5ms.@cvinayak Good news, I can now reproduce inside a shielded box with 7 tags. I made some changes. In the attached .zip you have the patch in aoa_receiver_multiple, .hex for your nrf52833-dk tags and a log.
What I changed:
gh-50432-22-09-28-15-08_no_sync.zip
With this setup I’m able to get scanning to stop often within 1 minute.
@cvinayak I see, I’ll grab a radio shielded box and play around with the transmitters and receiver and see if I can get a way to reproduce faster.
@jakkra I have been running for over an hour now… may be my advertisers the default samples/bluetooth/direction_finding_connectionless_tx are not exercising the problem on the receiver… I will switch to using your advertiser on my 7 boards. Will keep you posted.
20220928_2.log
@cvinayak Here is a log where it happend 3 times in a row, the first two ones I did not wait 10 min after scan stopped, but after the last reset I waited about 5 min. I also changed to print every 100 scan_recv instead of 500 if that may give more info with more prints. gh-50432-22-09-28-09.txt