zephyr: BLE HID sample fails to reconnect on Windows 10 tablets - Wrong Sequence Number (follow-up)

This is the same issue as https://github.com/zephyrproject-rtos/zephyr/issues/12362. That issue has been closed and is quite long, and I have since managed to narrow down the problem, so I am opening this new one.

Even with @cvinayak's fix (#13983), the issue still exists.

To Reproduce

  1. Build and flash peripheral_hids on Nordic nRF52840 dev kit (PCA10056).
  2. Pair the Windows 10 tablet to the device (all good!)
  3. Reset the Nordic dev kit by pressing ‘BOOT/RESET’

Expected behavior: Windows 10 should automatically reconnect to the dev kit as it advertises on restart.

  • Observed result: BT_HCI_ERR_TERM_DUE_TO_MIC_FAIL errors on the first packets

Environment (please complete the following information):

  • Latest Zephyr: b17d9ee
  • @cvinayak's fix: #13983
  • Test: zephyr/samples/bluetooth/peripheral_hids/ on nrf52840_pca10056 using Windows 10 Surface Pro 4.

The issue seems to be in the Link Layer, and it is a dynamic issue: the same binary sometimes reconnects and sometimes fails. That could explain why I thought @cvinayak's fix had fixed my issue (https://github.com/zephyrproject-rtos/zephyr/pull/13983#issuecomment-468639617). I am not familiar enough with the Link Layer to say what the issue is, but here is what I observed with the same binary in a failing case and in a successful case:

Failing case

See test7-fail-2-with-LL_LENGTH_REQ.pcapng for the complete trace from sniffer-traces.zip.

reconnection-fail

Master > Slave
    Data Header: 0x170f
        .... ..11 = LLID: Control PDU (0x3)
        .... .1.. = Next Expected Sequence Number: 1
        .... 1... = Sequence Number: 1 [OK]
        ...0 .... = More Data: False
        000. .... = RFU: 0
        Length: 23
    Control Opcode: LL_ENC_REQ (0x03)

Slave > Master
    Data Header: 0x091b
        .... ..11 = LLID: Control PDU (0x3)
        .... .0.. = Next Expected Sequence Number: 0
        .... 1... = Sequence Number: 1 [OK]
        ...1 .... = More Data: True
        000. .... = RFU: 0
        Length: 9
    Control Opcode: LL_LENGTH_REQ (0x14)

Master > Slave
    Data Header: 0x020f
        .... ..11 = LLID: Control PDU (0x3)
        .... .1.. = Next Expected Sequence Number: 1
        .... 1... = Sequence Number: 1 [OK]
        ...0 .... = More Data: False
        000. .... = RFU: 0
        Length: 2
    Control Opcode: LL_UNKNOWN_RSP (0x07)
    Unknown Type: LL_LENGTH_REQ (0x14)

Master > Slave
    Data Header: 0x020f
        .... ..11 = LLID: Control PDU (0x3)
        .... .1.. = Next Expected Sequence Number: 1
        .... 1... = Sequence Number: 1 [Wrong]
            [Expert Info (Note/Sequence): Retransmission]
                [Retransmission]
                [Severity level: Note]
                [Group: Sequence]
        ...0 .... = More Data: False
        000. .... = RFU: 0
        Length: 2
    Control Opcode: LL_UNKNOWN_RSP (0x07)
    Unknown Type: LL_LENGTH_REQ (0x14)

Master > Slave
    Data Header: 0x020f
        .... ..11 = LLID: Control PDU (0x3)
        .... .1.. = Next Expected Sequence Number: 1
        .... 1... = Sequence Number: 1 [Wrong]
            [Expert Info (Note/Sequence): Retransmission]
                [Retransmission]
                [Severity level: Note]
                [Group: Sequence]
        ...0 .... = More Data: False
        000. .... = RFU: 0
        Length: 2
    Control Opcode: LL_UNKNOWN_RSP (0x07)
    Unknown Type: LL_LENGTH_REQ (0x14)
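
The retransmission notes in the trace above are derived from the SN and NESN bits of each data header. As a rough illustration, here is a simplified Python model of the one-bit acknowledgement scheme as described in the Bluetooth Core Specification. The class and method names are hypothetical, and this is not the Zephyr controller code.

```python
class LinkState:
    """Toy model of one side of a BLE connection (hypothetical helper)."""

    def __init__(self):
        self.sn = 0    # sequence number of the next packet we transmit
        self.nesn = 0  # sequence number we expect from the peer next

    def on_receive(self, rx_sn, rx_nesn):
        """Process the SN/NESN bits of a received header.

        Returns (is_new, acked):
          is_new: the peer sent new data rather than a retransmission
          acked:  the peer acknowledged our last transmitted packet
        """
        is_new = rx_sn == self.nesn
        if is_new:
            self.nesn ^= 1  # accept the packet, expect the other SN next
        acked = rx_nesn != self.sn
        if acked:
            self.sn ^= 1    # our packet got through, move on to the next one
        return is_new, acked
```

In this model, a packet that arrives twice with the same SN is treated as new the first time and as a retransmission the second time.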

Successful case

See test8-successful-2-with-LL_LENGTH_REQ.pcapng for the complete trace from sniffer-traces.zip.

reconnection-successful

Slave > Master
    Data Header: 0x0917
        .... ..11 = LLID: Control PDU (0x3)
        .... .1.. = Next Expected Sequence Number: 1
        .... 0... = Sequence Number: 0 [OK]
        ...1 .... = More Data: True
        000. .... = RFU: 0
        Length: 9
    Control Opcode: LL_LENGTH_REQ (0x14)

Master > Slave
    Data Header: 0x171f
        .... ..11 = LLID: Control PDU (0x3)
        .... .1.. = Next Expected Sequence Number: 1
        .... 1... = Sequence Number: 1 [OK]
        ...1 .... = More Data: True
        000. .... = RFU: 0
        Length: 23
    Control Opcode: LL_ENC_REQ (0x03)

Master > Slave
    Data Header: 0x0203
        .... ..11 = LLID: Control PDU (0x3)
        .... .0.. = Next Expected Sequence Number: 0
        .... 0... = Sequence Number: 0 [OK]
        ...0 .... = More Data: False
        000. .... = RFU: 0
        Length: 2
    Control Opcode: LL_UNKNOWN_RSP (0x07)
    Unknown Type: LL_LENGTH_REQ (0x14)

Slave > Master
    Data Header: 0x0d1b
        .... ..11 = LLID: Control PDU (0x3)
        .... .0.. = Next Expected Sequence Number: 0
        .... 1... = Sequence Number: 1 [OK]
        ...1 .... = More Data: True
        000. .... = RFU: 0
        Length: 13
    Control Opcode: LL_ENC_RSP (0x04)
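
The Data Header values in the traces above can be checked by hand. The sketch below is a small Python helper (the function name is hypothetical) that assumes Wireshark displays the 16-bit header as a little-endian value, so the low byte carries the LLID, NESN, SN and MD bits and the high byte carries the length. That assumption matches the fields decoded in the traces.

```python
def decode_data_header(value):
    """Split a 16-bit data header, as displayed by Wireshark, into its fields."""
    first = value & 0xFF            # first header byte on air: flag bits
    length = (value >> 8) & 0xFF    # second header byte: payload length
    return {
        "llid": first & 0x03,       # 0x3 means Control PDU
        "nesn": (first >> 2) & 1,   # Next Expected Sequence Number
        "sn": (first >> 3) & 1,     # Sequence Number
        "md": (first >> 4) & 1,     # More Data
        "length": length,
    }
```

For example, decode_data_header(0x170f) gives LLID 3, NESN 1, SN 1, MD 0 and length 23, which is the LL_ENC_REQ header from the failing case.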

More Wireshark sniffer traces can be found in the archive sniffer-traces.zip. I have also learned how to save only a subset of the Wireshark traces, so no more thousand-line traces 😃 Test traces are prefixed with testN-successful- when the reconnection succeeded and testN-fail- when it failed. They all appear to follow the same pattern as the extracts above.

A workaround to this issue seems to be to disable all LL commands before LL_ENC_REQ:

# Disable LL_SLAVE_FEATURE_REQ:
CONFIG_BT_CTLR_SLAVE_FEAT_REQ=n

# Disable LL_PHY_REQ
CONFIG_BT_PHY_UPDATE=n

# Disable LL_LENGTH_REQ
CONFIG_BT_DATA_LEN_UPDATE=n

cc: @carlescufi @pdunaj @Qbicz @jhn-nordic @wiminordic @jhedberg @FrankGorgenyi @joerchan @vcgomes @cvinayak

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 15 (14 by maintainers)

Most upvoted comments

The issue here is dependent on the exact sequence of control packets between the master and slave.

The sequence is:

  M -> S LL_ENC_REQ
  M <- S LL_LENGTH_REQ/LL_SLAVE_FEATURE_REQ (not really important which)
  M -> S LL_UNKNOWN_RSP

The workaround in #13983 did not work because the LL had already enqueued the LL_ENC_RSP when it received LL_UNKNOWN_RSP, and it had already started to receive encrypted packets, so we get a MIC error on the packet that was transmitted unencrypted.

I have discussed this with @cvinayak and I’m currently working on a fix.
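
For anyone sorting through their own sniffer captures, the control packet sequence described above can be checked with a short script. This is only a sketch: the function name and the (direction, opcode) tuple format are assumptions for illustration, and the input is expected to be the list of control PDUs with retransmissions already removed.

```python
# Procedures the slave may start on its own, per the sequence described above.
SLAVE_INITIATED = {"LL_LENGTH_REQ", "LL_SLAVE_FEATURE_REQ"}


def matches_failing_sequence(control_pdus):
    """Return True if the list contains the problematic three-PDU sequence.

    control_pdus is a list of (direction, opcode) tuples, where direction
    is "M>S" or "S>M" (hypothetical format, not a Wireshark export).
    """
    triples = zip(control_pdus, control_pdus[1:], control_pdus[2:])
    for first, second, third in triples:
        if (first == ("M>S", "LL_ENC_REQ")
                and second[0] == "S>M"
                and second[1] in SLAVE_INITIATED
                and third == ("M>S", "LL_UNKNOWN_RSP")):
            return True
    return False
```

Applied to the two extracts in this issue, the failing case matches (LL_ENC_REQ, LL_LENGTH_REQ, LL_UNKNOWN_RSP) and the successful case does not, because there the LL_LENGTH_REQ was sent before the LL_ENC_REQ.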