esp-idf: NimBLE scan stops generating results, then crashes when resumed. (IDFGH-2001)
Environment
- Development Kit: TinyPICO
- Module or chip used: ESP32-PICO-D4
- IDF version (run
git describe --tagsto find it): v4.0-beta1, v4.1-dev - Build System: MicroPython
- Compiler version (run
xtensa-esp32-elf-gcc --versionto find it): (crosstool-NG crosstool-ng-1.22.0-80-g6c4433a) 5.2.0 - Operating System: Linux
- Power Supply: USB
Problem Description
Using NimBLE, running a ble_gap_disc, the stack eventually stops generating new scan results (i.e. the ble_gap_event_fn passed to ble_gap_disc no longer gets called).
If I stop scanning (i.e. ble_gap_disc_cancel), then start again, it fails with:
ASSERT_PARAM(64 0), in rwble.c at line 235
Guru Meditation Error: Core 0 panic'ed (Interrupt wdt timeout on CPU0)
Core 0 register dump:
PC : 0x40087aad PS : 0x00060034 A0 : 0x8008b964 A1 : 0x3ffbe4d0
A2 : 0x00000001 A3 : 0x00000000 A4 : 0x00000000 A5 : 0x60008054
A6 : 0x3ffbdb60 A7 : 0x60008050 A8 : 0x80087aad A9 : 0x3ffbe4b0
A10 : 0x00000004 A11 : 0x00000000 A12 : 0x6000804c A13 : 0xffffffff
A14 : 0x00000000 A15 : 0xfffffffc SAR : 0x00000004 EXCCAUSE: 0x00000005
EXCVADDR: 0x00000000 LBEG : 0x400879e5 LEND : 0x400879ec LCOUNT : 0x00000000
Core 0 was running in ISR context:
EPC1 : 0x401c2722 EPC2 : 0x00000000 EPC3 : 0x00000000 EPC4 : 0x40087aad
ELF file SHA256: 0000000000000000000000000000000000000000000000000000000000000000
Backtrace: 0x40087aaa:0x3ffbe4d0 0x4008b961:0x3ffbe4f0 0x4008c54f:0x3ffbe510 0x40085ea5:0x3ffbe530 0x401c271f:0x3ffbbcb0 0x40152d2f:0x3ffbbcd0 0x40095495:0x3ffbbcf0 0x40093fe9:0x3ffbbd20
Core 1 register dump:
PC : 0x401c2722 PS : 0x00060f34 A0 : 0x80152d32 A1 : 0x3ffbc430
A2 : 0x3ffc289d A3 : 0x00060f20 A4 : 0x800941df A5 : 0x3ffcfee0
A6 : 0x00000003 A7 : 0x00060023 A8 : 0x8010f168 A9 : 0x3ffbc420
A10 : 0x00000003 A11 : 0x00060f23 A12 : 0x00060f20 A13 : 0x3ffcfec0
A14 : 0x00000003 A15 : 0x00060023 SAR : 0x0000001d EXCCAUSE: 0x00000005
EXCVADDR: 0x00000000 LBEG : 0x00000000 LEND : 0x00000000 LCOUNT : 0x00000000
ELF file SHA256: 0000000000000000000000000000000000000000000000000000000000000000
Backtrace: 0x401c271f:0x3ffbc430 0x40152d2f:0x3ffbc450 0x40095495:0x3ffbc470 0x40093fe9:0x3ffbc4a0
I have been investigating this issue that was reported by MicroPython users. The typical time to failure was several hours, but I was able to repro it much faster with some devices sending advertisements at a high rate (50 per second). Usually takes about 10-15 mins for the issue to occur.
The most important detail about my configuration is that it’s running inside MicroPython, but I can repro this if I turn gap_scan_cb into a no-op, so that should hopefully exclude any issues with what we’re doing in the callback. No other BLE operations were active. It’s a passive scan, with interval=30ms, window=30ms, duration=forever. Another detail is that NimBLE is pinned to core 1 (same core as the MicroPython task), but I saw this problem when it was on the default core (0). The main MicroPython task is blocked on the UART (for the REPL) via ulTaskNotifyTake.
MicroPython doesn’t use IDF’s makefiles or cmake, but it does use sdkconfig. I’ve attached the input and generated sdkconfig.h below.
I wonder if this might be related to https://github.com/espressif/esp-idf/issues/4001 (but that was on Bluedroid, not NimBLE). I can repro with both v4.0-beta1 and v4.1-dev.
Here’s a disassembly of r_rwble_isr which seems to be raising the assert above https://gist.github.com/jimmo/43ba1da440fcbebfab89c70e345f368b (I’ve marked the relevant line with ********) I’ve attached the elf below. Built from https://github.com/jimmo/micropython/tree/ble-fixes
Other items if possible
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 1
- Comments: 34 (26 by maintainers)
Commits related to this issue
- esp32: Update IDF v4.0 supported hash. This is a "re-release" of v4.0. The "v4.0" tag was updated to include some backported fixes. The main one is https://github.com/espressif/esp-idf/issues/4196 — committed to jimmo/micropython by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash. This is a "re-release" of v4.0. The "v4.0" tag was updated to include some backported fixes. The main fix relevant to MicroPython is https://github.com/espress... — committed to jimmo/micropython by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to jimmo/micropython by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to jimmo/micropython by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to pfalcon/pycopy by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to pfalcon/pycopy by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to pfalcon/pycopy by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to pfalcon/pycopy by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to pfalcon/pycopy by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to pfalcon/pycopy by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to pfalcon/pycopy by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to pfalcon/pycopy by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to pfalcon/pycopy by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to pfalcon/pycopy by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to pfalcon/pycopy by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to pfalcon/pycopy by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to pfalcon/pycopy by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to pfalcon/pycopy by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to pfalcon/pycopy by jimmo 4 years ago
- esp32: Update IDF v4.0 supported hash to v4.0.1. The main fix relevant to MicroPython is https://github.com/espressif/esp-idf/issues/4196 Release notes here https://github.com/espressif/esp-idf/rele... — committed to pfalcon/pycopy by jimmo 4 years ago
Hi @jimmo, apologies for late reply. FYI, I was able to reproduce the issue with high advertisement rate. Though I am still looking into the root-cause, can you try disabling
Bluetooth Modem Sleep? From menuconfig just follow this path :Component config > Bluetooth > Bluetooth controller > MODEM SLEEP Options. Alternatively you can comment outCONFIG_BTDM_MODEM_SLEEPandCONFIG_BTDM_MODEM_SLEEP_MODE_ORIGfrom your sdkconfig.Please share your valuable observations.
Hi @prasad-alatkar, we also observe this issue in 4.0-rc1. We urgently need a workaround for this. Can you give some more details on how to avoid this issue or how we could detect and recover from it?
Hi @jimmo,
That’s right, we don’t move the
v4.0release tag. When @prasad-alatkar talks about “backported to v4.0”, he means that the fix was merged into therelease/v4.0branch and will be part of thev4.0.1bugfix release once that’s available.We use these “release branches” to maintain bugfix updates for each major & minor version, so in this case
release/v4.0is the branch that contains the v4.0 tag, and future commits on this branch will be tagged for v4.0.1, v4.0.2, etc.This page in the ESP-IDF docs has more details about our versioning scheme: https://docs.espressif.com/projects/esp-idf/en/stable/versions.html
At this stage, if you plan to stick with stable releases then the only option is to wait for the v4.0.1 release. However if you want to check out the latest commit on the release/v4.0 branch then you can test this fix before it’s released.
Hi @jimmo @0neblock apologies for delayed response. @csushantk is debugging this issue internally. I will update you once the issue is fixed.
@prasad-alatkar Thanks for looking into this!
We set
filter_duplicatesto0.We call cancel from the main MicroPython task.
To elaborate – the MicroPython firmware gives me a REPL prompt on the device, from which I call a Python function which internally calls
ble_gap_disc, with interval and window both set to 30ms.When the disc callback is invoked, it copies the data into a ringbuffer and notifies the MicroPython task, which then schedules a Python callback that prints out the scan data. (When I referred to making the callback a no-op, this is the code I’m talking about)
After some time interval (usually 10-15 mins in my testing at ~50 results/s, but I’ve had reports from users that this can take many hours at a slower rate), I stop seeing the print output from my Python callback.
So that’s where I interactively stop the scan from the REPL, then re-start it, at which point I see the crash described in rwble.c.
The main MicroPython task is created from app_main().
MP_TASK_COREIDis1. We also pin NimBLE to core 1.