zephyr: Bluetooth: HCI cmd response timeout
Describe the bug
We have an application based on a nrf52840. We see k_sem_take failed with err -11 assertion when we stress the device with a lot of notifications and read requests at the same time.
The system is used for a measurement device. We run an external ADC at 10 kSamples/s and read its data through SPI DMA transfers. After 30 samples (=> 333 Hz) we trigger an Interrupt Handler which copies the data into another buffer.
Then we we release a semaphore. A Kernel Thread is waiting for this semaphore so it can grab the data from the buffer and call bt_gatt_notify() with it. DLE is enabled and the data length set to 251.
Code snippets:
K_SEM_DEFINE(dataReadyToBeSend, 0, 1);
K_THREAD_DEFINE(DataStreaming, 10240, Thread_DataStreaming, NULL, NULL, NULL, 0, 0, 0);
#define TIMER_IRQ_PRIORITY 0
void init() {
IRQ_DIRECT_CONNECT(TIMER2_IRQn, TIMER_IRQ_PRIORITY, TIMER2_IRQHandler, 0);
}
void TIMER2_IRQHandler(nrf_timer_event_t event_type, void *p_context) {
[Copy data from DMA buffer to buffer...]
k_sem_give(&dataReadyToBeSend);
}
static void Thread_DataStreaming(void) {
[..]
while (1) {
semRet = k_sem_take(&dataReadyToBeSend, K_MSEC(1000));
err = bt_gatt_notify(NULL, &myAttr, data, len);
}
}
Log:
[..]
[00:00:04.257,873] <dbg> bt.connected: Update 2MBit PHY
[00:00:04.257,934] <inf> bt: Setting the Connection Parameters: Interval min: 7ms, Interval max: 4000ms, Latency: 0, timout: 5000ms...
[..]
[00:00:04.906,433] <dbg> bt.le_phy_updated: PHY updated!
[00:00:04.906,463] <dbg> bt.showConnectionInfo: Connection Information:
[00:00:04.906,463] <dbg> bt.showConnectionInfo: - Role: Slave
[00:00:04.906,463] <dbg> bt.showConnectionInfo: - Connection Type: LE
[00:00:04.906,494] <dbg> bt.showConnectionInfo: PHY: TX: LE_PHY_2M, RX: LE_PHY_2M
[00:00:04.906,524] <dbg> bt.showDataLengthInfo: Data Length: TX Max length: 251, TX Max time: 1064us, RX Max length: 251, RX Max time: 1064us
[..]
[00:00:56.955,780] <inf> bt_service: 13212 data notif. sent ( 627 kbit/s), Meas Data Buffer Level: 0.0% ( 0 samples in buffer), TX RSSI: -56 dBm
[00:00:57.172,241] <inf> bt_service: 13284 data notif. sent ( 622 kbit/s), Meas Data Buffer Level: 0.0% ( 0 samples in buffer), TX RSSI: -52 dBm
[00:00:57.389,099] <inf> bt_service: 13356 data notif. sent ( 625 kbit/s), Meas Data Buffer Level: 0.0% ( 0 samples in buffer), TX RSSI: -59 dBm
[00:00:57.605,255] <inf> bt_service: 13428 data notif. sent ( 622 kbit/s), Meas Data Buffer Level: 0.0% ( 0 samples in buffer), TX RSSI: -53 dBm
[00:00:57.821,899] <inf> bt_service: 13500 data notif. sent ( 625 kbit/s), Meas Data Buffer Level: 0.0% ( 0 samples in buffer), TX RSSI: -54 dBm
[00:00:58.039,550] <inf> bt_service: 13573 data notif. sent ( 627 kbit/s), Meas Data Buffer Level: 0.0% ( 0 samples in buffer), TX RSSI: -53 dBm
[00:00:58.255,859] <inf> bt_service: 13645 data notif. sent ( 625 kbit/s), Meas Data Buffer Level: 0.0% ( 0 samples in buffer), TX RSSI: -57 dBm
[00:00:58.473,419] <inf> bt_service: 13718 data notif. sent ( 622 kbit/s), Meas Data Buffer Level: 0.0% ( 0 samples in buffer), TX RSSI: -69 dBm
[00:00:58.690,795] <inf> bt_service: 13790 data notif. sent ( 627 kbit/s), Meas Data Buffer Level: 0.0% ( 0 samples in buffer), TX RSSI: -53 dBm
[00:00:58.907,226] <inf> bt_service: 13862 data notif. sent ( 622 kbit/s), Meas Data Buffer Level: 0.0% ( 0 samples in buffer), TX RSSI: -58 dBm
[00:00:59.124,572] <inf> bt_service: 13935 data notif. sent ( 630 kbit/s), Meas Data Buffer Level: 0.0% ( 0 samples in buffer), TX RSSI: -63 dBm
[00:01:00.192,047] <wrn> ringbuffer: Buffer full, overwriting oldest sample! (w: 1441, r: 1470, in: 433440, out: 424470)
[00:01:00.192,047] <wrn> ringbuffer: Buffer full, overwriting oldest sample! (w: 1442, r: 1471, in: 433441, out: 424470)
ASSERTION FAIL [err == 0] @ WEST_TOPDIR/zephyr/subsys/bluetooth/host/hci_core.c:311
k_sem_take failed with err -11
[00:01:09.351,501] <err> os: r0/a1: 0x00000003 r1/a2: 0x00000001 r2/a3: 0x40002000
*** Booting Zephyr OS build zephyr-v2.5.0-2746-g1cc73074d054 *** // <= WATCHDOG RESET
What is the best way to trace this further down?
prj.conf (sanitized):
### General Kernel Configuration
CONFIG_LOG=y
CONFIG_UART_CONSOLE=y
CONFIG_ASSERT=y
## Logging (used in the defines)
CONFIG_LOG_STRDUP_BUF_COUNT=50
CONFIG_LOG_BUFFER_SIZE=65536
CONFIG_LOG_STRDUP_MAX_STRING=100
CONFIG_NEWLIB_LIBC=y # Needed for logging only
CONFIG_NEWLIB_LIBC_FLOAT_PRINTF=y # Needed for logging only
## Optimization
CONFIG_SPEED_OPTIMIZATIONS=y
CONFIG_BT_CTLR_OPTIMIZE_FOR_SPEED=y
## Bluetooth Low Energy
# Increased stack due to settings API usage
CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=2048
# Bluetooth
CONFIG_BT=y
CONFIG_BT_DEBUG_LOG=y
CONFIG_BT_SMP=y
CONFIG_BT_SIGNING=y
CONFIG_BT_PERIPHERAL=y
CONFIG_BT_DIS=y
CONFIG_BT_ATT_PREPARE_COUNT=5
CONFIG_BT_BAS=y
CONFIG_BT_PRIVACY=y
CONFIG_BT_DEVICE_APPEARANCE=833
CONFIG_BT_DEVICE_NAME_DYNAMIC=y
CONFIG_BT_DEVICE_NAME_MAX=65
CONFIG_BT_KEYS_OVERWRITE_OLDEST=y # Overwrite oldest keys with new ones if key storage is full
# Below is setup to be able to update DIS at runtime
# There is still a warning "bt_hci_core: Unable to store name" of which I could not get rid off.
# See also https://github.com/zephyrproject-rtos/zephyr/issues/22478
CONFIG_BT_SETTINGS=y # Needed for settings_runtime_set()
CONFIG_SETTINGS=y # Needed for settings_runtime_set()
CONFIG_SETTINGS_RUNTIME=y # Needed for settings_runtime_set()
CONFIG_SETTINGS_NONE=y
CONFIG_BT_GATT_CACHING=n # To avoid error "bt_gatt: Failed to save Database Hash"
# BLE Throughput Maximization
# Adapted from https://github.com/nrfconnect/sdk-nrf/blob/master/samples/bluetooth/throughput/prj.conf
# Not all parameters might be needed/usefull!
CONFIG_HEAP_MEM_POOL_SIZE=2048
CONFIG_BT_RX_BUF_LEN=258
CONFIG_BT_GATT_CLIENT=y
CONFIG_BT_ATT_PREPARE_COUNT=2
CONFIG_BT_CONN_TX_MAX=255
CONFIG_BT_L2CAP_TX_BUF_COUNT=10
CONFIG_BT_L2CAP_TX_MTU=247
CONFIG_BT_L2CAP_RX_MTU=247
CONFIG_BT_L2CAP_DYNAMIC_CHANNEL=y
CONFIG_BT_CTLR_PHY_2M=y
CONFIG_BT_CTLR_TX_BUFFERS=10
CONFIG_BT_CTLR_TX_BUFFER_SIZE=251
CONFIG_BT_USER_PHY_UPDATE=y
CONFIG_BT_USER_DATA_LEN_UPDATE=y
CONFIG_BT_CTLR_DATA_LENGTH_MAX=251
CONFIG_BT_CTLR_ADVANCED_FEATURES=y
CONFIG_BT_CTLR_XTAL_THRESHOLD=500000
CONFIG_BT_PHY_UPDATE=y
CONFIG_BT_DATA_LEN_UPDATE=y
CONFIG_BT_CTLR_CONN_RSSI=y
CONFIG_BT_CTLR_TX_PWR_DYNAMIC_CONTROL=y
# This seems to fix the following assertion:
# ASSERTION FAIL [!radio_is_ready()] @ WEST_TOPDIR/zephyr/subsys/bluetooth/controller/ll_sw/nordic/lll/lll_conn.c:317
# See https://github.com/zephyrproject-rtos/zephyr/issues/29994#issuecomment-739623736
CONFIG_ZERO_LATENCY_IRQS=y
CONFIG_BT_CTLR_ZLI=y
We also played with the following parameters but did not see an improvement:
CONFIG_BT_LLL_VENDOR_NORDIC=y
CONFIG_BT_CTLR_LLL_PRIO=0
CONFIG_BT_CTLR_ULL_HIGH_PRIO=0
CONFIG_BT_CTLR_ULL_LOW_PRIO=0
and changed TIMER_IRQ_PRIORITY to 3
Environment :
- OS: Ubuntu 20.04,
- Toolchain: zephyr-sdk-0.12.2
- Zephyr Project (Tag
2.5.0) but also tested with 1cc73074d0540caa8dd7ea9501afefea2e410791
Reference to code: https://github.com/zephyrproject-rtos/zephyr/blob/fe7c2efca800a0cf1bccf23aefe08b3c4beb88bf/subsys/bluetooth/host/hci_core.c#L333
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 44 (38 by maintainers)
Just a little update: I am now back @ work and migrated to
2.6.0. As expected the issue still is the same.I will now try to build a minimal example to reproduce the issue.