connectedhomeip: [BUG] Thread commissioning fails during FindOperational - Can't ping the device

Reproduction steps

  • (Re-)Create a new thread network
  • Get the thread dataset
  • Commission a Matter device using qrcode thread with chiptool using the previous dataset

During the commissioning, when you are at the step “Find Operational”, if you see that it takes “anormal time” to finish, try to ping the resolved IPv6 from the computer running otbr-agent.

If you can’t ping the device and the pairing failed, then you are on the problem

It appear 1/10 times, it’s really hard to debug because when the pairing fails (failsafe timer expiration) the device is disconnected from Thread network.

Also note that when the problem happens, the command ot-ctl childip does not show any IP for the thread device, but the thread device correctly appear when you execute the commande ot-ctl child table

Bug prevalence

1/10

GitHub hash of the SDK that was being used

v1.0.0

Platform

android, nrf, raspi

Platform Version(s)

No response

Anything else?

Logs of the commissioner (here it’s android chip-tool modified):

03-07 16:19:25.439 11506 11531 D CTL     : Performing next commissioning step 'FindOperational'
03-07 16:19:25.439 11506 11531 D CSM     : FindOrEstablishSession: PeerId = [1:000000000000002D]
03-07 16:19:25.439 11506 11531 D CSM     : FindOrEstablishSession: No existing OperationalSessionSetup instance found
03-07 16:19:25.440 11506 11531 D CTL     : OperationalSessionSetup[1:000000000000002D]: State change 1 --> 2
03-07 16:19:25.440 11506 11531 D DIS     : Resolving 5BB2BF7451309CA6:000000000000002D ...
03-07 16:19:25.469 11506 11531 D NsdManagerServiceResolver: resolve: Starting service resolution for '5BB2BF7451309CA6-000000000000002D' type '_matter._tcp'
03-07 16:19:25.470 11506 11531 D DMG     : ICR moving to [AwaitingDe]
03-07 16:19:25.640 11506 12147 D DIS     : Checking node lookup status after 200 ms
03-07 16:19:27.938 11506 12147 D DL      : Received SendWriteRequest
03-07 16:19:28.004 11506 11531 D chip.BluetoothManager: MatterLight.onCharacteristicWrite: 18ee2ef5-263d-4559-959f-4f9c429f9d11 -> 0
03-07 16:19:28.448 11506 12145 I NsdManagerServiceResolver: Resolved service '5BB2BF7451309CA6-000000000000002D' to /fdaa:bbcc:ddee:0:819c:31a3:f1df:2ad9
03-07 16:19:28.479 11506 12145 D DIS     :  ----- entry [0] : T 0
03-07 16:19:28.479 11506 12145 D DIS     :  ----- entry [1] : SAI 300
03-07 16:19:28.479 11506 12145 D DIS     :  ----- entry [2] : SII 5000
03-07 16:19:28.479 11506 12145 D DIS     : Node ID resolved for 5BB2BF7451309CA6:000000000000002D
03-07 16:19:28.479 11506 12145 D DIS     : 	Hostname: fdaa:bbcc:ddee:0
03-07 16:19:28.479 11506 12145 D DIS     : 	IP Address #1: fdaa:bbcc:ddee:0:819c:31a3:f1df:2ad9
03-07 16:19:28.479 11506 12145 D DIS     : 	Port: 5540
03-07 16:19:28.479 11506 12145 D DIS     : 	Mrp Interval idle: 5000 ms
03-07 16:19:28.479 11506 12145 D DIS     : 	Mrp Interval active: 300 ms
03-07 16:19:28.479 11506 12145 D DIS     : 	TCP Supported: 0
03-07 16:19:28.480 11506 12145 D DIS     : Lookup clearing interface for non LL address
03-07 16:19:28.480 11506 12145 D DIS     : UDP:[fdaa:bbcc:ddee:0:819c:31a3:f1df:2ad9]:5540: new best score: 4
03-07 16:19:28.480 11506 12145 D DIS     : Checking node lookup status after 3041 ms
03-07 16:19:28.481 11506 12145 D DIS     : OperationalSessionSetup[1:000000000000002D]: Updating device address to UDP:[fdaa:bbcc:ddee:0:819c:31a3:f1df:2ad9]:5540 while in state 2
03-07 16:19:28.481 11506 12145 D CTL     : OperationalSessionSetup[1:000000000000002D]: State change 2 --> 3
03-07 16:19:28.481 11506 12145 D IN      : SecureSession[0x7b383fd368]: Allocated Type:2 LSID:52281
03-07 16:19:28.481 11506 12145 D SC      : Initiating session on local FabricIndex 1 from 0x000000000001B669 -> 0x000000000000002D
03-07 16:19:28.499 11506 12145 D TOO     : KVS: Getting key f/1/g
03-07 16:19:28.499 11506 12145 D TOO     : KVS: Getting key f/1/k/0
03-07 16:19:28.500 11506 12145 D TOO     : KVS: Getting key f/1/s/000000000000002D
03-07 16:19:28.500 11506 12145 D KeyValueStoreManager: Key 'f/1/s/000000000000002D' not found in shared preferences
03-07 16:19:28.500 11506 12145 D EM      : <<< [E:5624i M:33095801] (U) Msg TX to 0:0000000000000000 [0000] --- Type 0000:30 (SecureChannel:CASE_Sigma1)
03-07 16:19:28.500 11506 12145 D IN      : (U) Sending msg 33095801 to IP address 'UDP:[fdaa:bbcc:ddee:0:819c:31a3:f1df:2ad9]:5540'
03-07 16:19:28.501 11506 12145 D SC      : Sent Sigma1 msg
03-07 16:19:28.501 11506 12145 D CTL     : OperationalSessionSetup[1:000000000000002D]: State change 3 --> 4
03-07 16:19:30.478 11506 11531 D chip.BluetoothManager: MatterLight.onCharacteristicChanged: 18ee2ef5-263d-4559-959f-4f9c429f9d12
03-07 16:19:32.981 11506 12147 D DL      : Received SendWriteRequest
03-07 16:19:33.086 11506 11531 D chip.BluetoothManager: MatterLight.onCharacteristicWrite: 18ee2ef5-263d-4559-959f-4f9c429f9d11 -> 0
03-07 16:19:35.021 11506 12147 D EM      : Retransmitting MessageCounter:33095801 on exchange 5624i Send Cnt 1
03-07 16:19:35.022 11506 12147 D IN      : (U) Sending msg 33095801 to IP address 'UDP:[fdaa:bbcc:ddee:0:819c:31a3:f1df:2ad9]:5540'
03-07 16:19:35.564 11506 11531 D chip.BluetoothManager: MatterLight.onCharacteristicChanged: 18ee2ef5-263d-4559-959f-4f9c429f9d12
03-07 16:19:38.066 11506 12147 D DL      : Received SendWriteRequest
03-07 16:19:38.168 11506 11531 D chip.BluetoothManager: MatterLight.onCharacteristicWrite: 18ee2ef5-263d-4559-959f-4f9c429f9d11 -> 0
03-07 16:19:40.644 11506 11531 D chip.BluetoothManager: MatterLight.onCharacteristicChanged: 18ee2ef5-263d-4559-959f-4f9c429f9d12
03-07 16:19:41.895 11506 12147 D EM      : Retransmitting MessageCounter:33095801 on exchange 5624i Send Cnt 2
03-07 16:19:41.896 11506 12147 D IN      : (U) Sending msg 33095801 to IP address 'UDP:[fdaa:bbcc:ddee:0:819c:31a3:f1df:2ad9]:5540'
03-07 16:19:43.146 11506 12147 D DL      : Received SendWriteRequest
03-07 16:19:43.250 11506 11532 D chip.BluetoothManager: MatterLight.onCharacteristicWrite: 18ee2ef5-263d-4559-959f-4f9c429f9d11 -> 0
03-07 16:19:45.730 11506 11532 D chip.BluetoothManager: MatterLight.onCharacteristicChanged: 18ee2ef5-263d-4559-959f-4f9c429f9d12
03-07 16:19:48.232 11506 12147 D DL      : Received SendWriteRequest
03-07 16:19:48.292 11506 11532 D chip.BluetoothManager: MatterLight.onCharacteristicWrite: 18ee2ef5-263d-4559-959f-4f9c429f9d11 -> 0
03-07 16:19:50.768 11506 11532 D chip.BluetoothManager: MatterLight.onCharacteristicChanged: 18ee2ef5-263d-4559-959f-4f9c429f9d12
03-07 16:19:51.068 11506 12147 D EM      : Retransmitting MessageCounter:33095801 on exchange 5624i Send Cnt 3
03-07 16:19:51.068 11506 12147 D IN      : (U) Sending msg 33095801 to IP address 'UDP:[fdaa:bbcc:ddee:0:819c:31a3:f1df:2ad9]:5540'
03-07 16:19:51.714 11506 11532 D BluetoothGatt: onClientConnectionState() - status=19 clientIf=7 device=D4:44:06:2E:2E:A2
03-07 16:19:51.731 11506 11532 I chip.BluetoothManager: MatterLight.onConnectionStateChange status = 19, newState=0
03-07 16:19:51.732 11506 11532 D AndroidBleManager: onConnectionStateChange Disconnected
03-07 16:19:51.734 11506 11532 D IN      : Clearing BLE pending packets.
03-07 16:19:51.735 11506 11532 I chip.BluetoothManager|onConnectionStateChange: STATE_DISCONNECTED
03-07 16:20:05.642 11506 12147 D EM      : Retransmitting MessageCounter:33095801 on exchange 5624i Send Cnt 4
03-07 16:20:05.643 11506 12147 D IN      : (U) Sending msg 33095801 to IP address 'UDP:[fdaa:bbcc:ddee:0:819c:31a3:f1df:2ad9]:5540'
03-07 16:20:29.224 11506 12147 E EM      : Failed to Send CHIP MessageCounter:33095801 on exchange 5624i sendCount: 4 max retries: 4
03-07 16:20:40.991 11506 12147 E SC      : CASESession timed out while waiting for a response from the peer. Current state was 1
03-07 16:20:40.991 11506 12147 D IN      : SecureSession[0x7b383fd368]: Released - Type:2 LSID:52281
03-07 16:20:41.020 11506 12147 D CTL     : Device connection failed. Error ../../src/protocols/secure_channel/CASESession.cpp:295: CHIP Error 0x00000032: Timeout
03-07 16:20:41.088 11506 12147 I MTMethods: sendEvent
03-07 16:20:41.092 11506 12147 I MTMethods: { NativeMap: {"data":"{\"nodeId\":\"45\",\"stage\":\"FindOperational\"}","code":"50","codeDescription":"SUCCESS","from":"onCommissioningStatusUpdate"} }
03-07 16:20:41.095 11506 12147 D CTL     : Error on commissioning step 'FindOperational': '../../src/protocols/secure_channel/CASESession.cpp:295: CHIP Error 0x00000032: Timeout'
03-07 16:20:41.095 11506 12147 E CTL     : Failed to perform commissioning step 18
03-07 16:20:41.095 11506 12147 D CTL     : Going from commissioning step 'FindOperational' with lastErr = '../../src/protocols/secure_channel/CASESession.cpp:295: CHIP Error 0x00000032: Timeout' -> 'Cleanup'
03-07 16:20:41.096 11506 12147 D CTL     : Performing next commissioning step 'Cleanup' with completion status = '../../src/protocols/secure_channel/CASESession.cpp:295: CHIP Error 0x00000032: Timeout'
03-07 16:20:41.097 11506 12147 I MTMethods: sendEvent
03-07 16:20:41.097 11506 12147 I MTMethods: { NativeMap: {"data":"{\"nodeId\":\"45\",\"stage\":\"Cleanup\"}","code":"0","codeDescription":"SUCCESS","from":"onCommissioningStatusUpdate"} }
03-07 16:20:41.098 11506 12147 D CTL     : Successfully finished commissioning step 'Cleanup'
03-07 16:20:41.101 11506 12147 I MTMethods: sendEvent
03-07 16:20:41.101 11506 12147 I MTMethods: { NativeMap: {"data":"","code":"50","codeDescription":"FAILURE","from":"onCommissioningComplete"} }

Logs of the device (nRF52840 running the lighting app example):

[00:00:31.024,108] <inf> chip: [ZCL]OpCreds: Received an AddNOC command
[00:00:31.028,533] <inf> chip: [FP]Validating NOC chain
[00:00:31.087,188] <inf> chip: [FP]NOC chain validation successful
[00:00:31.087,341] <inf> chip: [FP]Added new fabric at index: 0x1
[00:00:31.087,585] <inf> chip: [FP]Assigned compressed fabric ID: 0x5BB2BF7451309CA6, node ID: 0x000000000000002D
[00:00:31.087,768] <inf> chip: [TS]Last Known Good Time: 2023-03-07T14:02:11
[00:00:31.087,951] <inf> chip: [TS]New proposed Last Known Good Time: 2021-06-10T00:00:00
[00:00:31.088,043] <inf> chip: [TS]Retaining current Last Known Good Time
[00:00:31.088,226] <dbg> chip: LogV: [DMG]Endpoint 0, Cluster 0x0000_003E update version to 1a7d840
[00:00:31.088,439] <dbg> chip: LogV: [DMG]Endpoint 0, Cluster 0x0000_003E update version to 1a7d841
[00:00:31.122,528] <dbg> chip: LogV: [EVL]LogEvent event number: 0x0000000000000002 priority: 1, endpoint id:  0x0 cluster id: 0x0000_001F event id: 0x0 Sys timestamp: 0x0000000000007991
[00:00:31.122,711] <inf> chip: [ZCL]OpCreds: ACL entry created for Fabric index 0x1 CASE Admin Subject 0x000000000001B669
[00:00:31.122,802] <dbg> chip: LogV: [DL]Using Thread extended MAC for hostname.
[00:00:31.122,985] <inf> chip: [DIS]Advertise operational node 5BB2BF7451309CA6-000000000000002D
[00:00:31.123,260] <dbg> chip: LogV: [DMG]Endpoint 0, Cluster 0x0000_003E update version to 1a7d842
[00:00:31.123,474] <dbg> chip: LogV: [DMG]Endpoint 0, Cluster 0x0000_003E update version to 1a7d843
[00:00:31.123,596] <dbg> chip: LogV: [DMG]Command handler moving to [ Preparing]
[00:00:31.123,718] <dbg> chip: LogV: [DMG]Command handler moving to [AddingComm]
[00:00:31.123,840] <dbg> chip: LogV: [DMG]Command handler moving to [AddedComma]
[00:00:31.123,931] <inf> chip: [ZCL]OpCreds: successfully created fabric index 0x1 via AddNOC
[00:00:31.124,267] <dbg> chip: LogV: [DMG]Decreasing reference count for CommandHandler, remaining 0
[00:00:31.125,213] <inf> chip: [EM]<<< [E:5621r M:116813147] (S) Msg TX to 1:FFFFFFFB00000000 [9CA6] --- Type 0001:09 (IM:InvokeCommandResponse)
[00:00:31.125,366] <inf> chip: [IN](S) Sending msg 116813147 on secure session with LSID: 41506
[00:00:31.125,549] <dbg> chip: LogV: [DL]Sending indication for CHIPoBLE TX characteristic (ConnId 00, len 71)
[00:00:31.125,915] <dbg> chip: LogV: [DMG]Command handler moving to [CommandSen]
[00:00:31.126,007] <dbg> chip: LogV: [DMG]Command handler moving to [AwaitingDe]
[00:00:31.126,159] <err> chip: [DL]Long dispatch time: 106 ms, for event type 16388
[00:00:31.200,286] <dbg> chip: LogV: [DL]Indication for CHIPoBLE TX characteristic done (ConnId 0x00, result 0x00)
[00:00:31.201,446] <dbg> chip: LogV: [DL]Write request received for CHIPoBLE RX characteristic (ConnId 0x00)
[00:00:31.203,033] <inf> chip: [EM]>>> [E:5622r M:6963398] (S) Msg RX from 1:FFFFFFFB00000000 [9CA6] --- Type 0001:08 (IM:InvokeCommandRequest)
[00:00:31.203,186] <dbg> chip: LogV: [EM]Handling via exchange: 5622r, Delegate: 0x20010c44
[00:00:31.203,857] <dbg> chip: LogV: [DMG]Received command for Endpoint=0 Cluster=0x0000_0031 Command=0x0000_0003
[00:00:31.210,113] <dbg> chip: LogV: [DMG]Command handler moving to [ Preparing]
[00:00:31.210,205] <dbg> chip: LogV: [DMG]Command handler moving to [AddingComm]
[00:00:31.210,357] <dbg> chip: LogV: [DMG]Command handler moving to [AddedComma]
[00:00:31.210,571] <dbg> chip: LogV: [DMG]Endpoint 0, Cluster 0x0000_0030 update version to ac6117b6
[00:00:31.210,876] <dbg> chip: LogV: [DMG]Decreasing reference count for CommandHandler, remaining 0
[00:00:31.211,791] <inf> chip: [EM]<<< [E:5622r M:116813148] (S) Msg TX to 1:FFFFFFFB00000000 [9CA6] --- Type 0001:09 (IM:InvokeCommandResponse)
[00:00:31.211,975] <inf> chip: [IN](S) Sending msg 116813148 on secure session with LSID: 41506
[00:00:31.212,158] <dbg> chip: LogV: [DL]Sending indication for CHIPoBLE TX characteristic (ConnId 00, len 71)
[00:00:31.212,524] <dbg> chip: LogV: [DMG]Command handler moving to [CommandSen]
[00:00:31.212,615] <dbg> chip: LogV: [DMG]Command handler moving to [AwaitingDe]
[00:00:31.291,168] <dbg> chip: LogV: [DL]Indication for CHIPoBLE TX characteristic done (ConnId 0x00, result 0x00)
[00:00:31.291,320] <dbg> chip: LogV: [DL]Write request received for CHIPoBLE RX characteristic (ConnId 0x00)
[00:00:31.292,449] <inf> chip: [EM]>>> [E:5623r M:6963399] (S) Msg RX from 1:FFFFFFFB00000000 [9CA6] --- Type 0001:08 (IM:InvokeCommandRequest)
[00:00:31.292,633] <dbg> chip: LogV: [EM]Handling via exchange: 5623r, Delegate: 0x20010c44
[00:00:31.293,304] <dbg> chip: LogV: [DMG]Received command for Endpoint=0 Cluster=0x0000_0031 Command=0x0000_0006
[00:00:31.309,478] <dbg> chip: LogV: [DMG]Decreasing reference count for CommandHandler, remaining 1
[00:00:31.309,661] <dbg> chip: LogV: [DL]OpenThread State Changed (Flags: 0x101fc100)
[00:00:31.309,722] <dbg> chip: LogV: [DL]   Network Name: OpenThread-c37e
[00:00:31.309,814] <dbg> chip: LogV: [DL]   PAN Id: 0xC37E
[00:00:31.309,997] <dbg> chip: LogV: [DL]   Extended PAN Id: 0x1B1EF3DC76FCCEF8
[00:00:31.310,089] <dbg> chip: LogV: [DL]   Channel: 11
[00:00:31.310,211] <dbg> chip: LogV: [DL]   Mesh Prefix: fd0d:ef30:fec0:af6a::/64
[00:00:31.312,774] <dbg> chip: LogV: [DL]OpenThread State Changed (Flags: 0x1100101d)
[00:00:31.312,835] <dbg> chip: LogV: [DL]   Device Role: DETACHED
[00:00:31.312,896] <dbg> chip: LogV: [DL]   Thread Unicast Addresses:
[00:00:31.313,079] <dbg> chip: LogV: [DL]        fd0d:ef30:fec0:af6a:ad47:e9a9:eefa:e0a7/64 valid
[00:00:31.313,232] <dbg> chip: LogV: [DL]        fe80::30c0:9410:3f56:dc97/64 valid preferred
[00:00:32.542,724] <inf> chip: [DL]SRP Client was started, detected server: fd0d:ef30:fec0:af6a:8469:8034:1d82:5e7c
[00:00:32.544,433] <dbg> chip: LogV: [DL]OpenThread State Changed (Flags: 0x200012a4)
[00:00:32.544,525] <dbg> chip: LogV: [DL]   Device Role: CHILD
[00:00:32.544,616] <dbg> chip: LogV: [DL]   Partition Id: 0x40358B7D
[00:00:32.547,180] <dbg> chip: LogV: [DL]OpenThread State Changed (Flags: 0x00000001)
[00:00:32.547,241] <dbg> chip: LogV: [DL]   Thread Unicast Addresses:
[00:00:32.547,393] <dbg> chip: LogV: [DL]        fdaa:bbcc:ddee:0:819c:31a3:f1df:2ad9/64 valid preferred
[00:00:32.547,576] <dbg> chip: LogV: [DL]        fd0d:ef30:fec0:af6a:0:ff:fe00:e801/64 valid rloc
[00:00:32.547,760] <dbg> chip: LogV: [DL]        fd0d:ef30:fec0:af6a:ad47:e9a9:eefa:e0a7/64 valid
[00:00:32.547,912] <dbg> chip: LogV: [DL]        fe80::30c0:9410:3f56:dc97/64 valid preferred
[00:00:32.550,994] <dbg> chip: LogV: [DMG]Command handler moving to [ Preparing]
[00:00:32.551,116] <dbg> chip: LogV: [DMG]Command handler moving to [AddingComm]
[00:00:32.551,239] <dbg> chip: LogV: [DMG]Command handler moving to [AddedComma]
[00:00:32.551,452] <dbg> chip: LogV: [DMG]Endpoint 0, Cluster 0x0000_0030 update version to ac6117b7
[00:00:32.551,605] <dbg> chip: LogV: [DMG]Decreasing reference count for CommandHandler, remaining 0
[00:00:32.723,266] <inf> chip: [EM]<<< [E:5623r M:116813149] (S) Msg TX to 1:FFFFFFFB00000000 [9CA6] --- Type 0001:09 (IM:InvokeCommandResponse)
[00:00:32.723,449] <inf> chip: [IN](S) Sending msg 116813149 on secure session with LSID: 41506
[00:00:32.723,602] <dbg> chip: LogV: [DL]Sending indication for CHIPoBLE TX characteristic (ConnId 00, len 70)
[00:00:32.723,999] <dbg> chip: LogV: [DMG]Command handler moving to [CommandSen]
[00:00:32.724,090] <dbg> chip: LogV: [DMG]Command handler moving to [AwaitingDe]
[00:00:32.724,182] <err> chip: [DL]Long dispatch time: 174 ms, for event type 3
[00:00:32.724,304] <dbg> chip: LogV: [DL]Using Thread extended MAC for hostname.
[00:00:32.724,517] <inf> chip: [DIS]Advertise operational node 5BB2BF7451309CA6-000000000000002D
[00:00:32.724,639] <inf> chip: [SVR]Operational advertising enabled
[00:00:32.854,553] <dbg> chip: LogV: [DL]Indication for CHIPoBLE TX characteristic done (ConnId 0x00, result 0x00)
[00:00:33.274,200] <dbg> chip: LogV: [DL]SRP update succeeded
[00:00:33.274,291] <inf> chip: [DIS]Setting operational delegate post init
[00:00:33.274,688] <inf> chip: [DIS]Updating services using commissioning mode 0
[00:00:33.274,780] <dbg> chip: LogV: [DL]Using Thread extended MAC for hostname.
[00:00:33.274,993] <inf> chip: [DIS]Advertise operational node 5BB2BF7451309CA6-000000000000002D
[00:00:33.275,482] <inf> chip: [DL]advertising srp service: 5BB2BF7451309CA6-000000000000002D._matter._tcp
[00:00:33.275,756] <dbg> chip: LogV: [DL]Using Thread extended MAC for hostname.
[00:00:33.277,069] <inf> chip: [DIS]Advertise commission parameter vendorID=65521 productID=32773 discriminator=3840/15 cm=0
[00:00:33.277,648] <inf> chip: [DL]advertising srp service: 8CCEED038B98779A._matterc._udp
[00:00:34.519,989] <dbg> chip: LogV: [DL]SRP update succeeded
[00:00:35.250,610] <dbg> chip: LogV: [DL]Write request received for CHIPoBLE RX characteristic (ConnId 0x00)
[00:00:37.751,251] <dbg> chip: LogV: [DL]Sending indication for CHIPoBLE TX characteristic (ConnId 00, len 3)
[00:00:37.815,643] <dbg> chip: LogV: [DL]Indication for CHIPoBLE TX characteristic done (ConnId 0x00, result 0x00)
[00:00:38.833,404] <dbg> chip: LogV: [DL]OpenThread State Changed (Flags: 0x00000064)
[00:00:38.833,465] <dbg> chip: LogV: [DL]   Device Role: ROUTER
[00:00:40.335,784] <dbg> chip: LogV: [DL]Write request received for CHIPoBLE RX characteristic (ConnId 0x00)
[00:00:42.835,449] <dbg> chip: LogV: [DL]Sending indication for CHIPoBLE TX characteristic (ConnId 00, len 3)
[00:00:42.900,817] <dbg> chip: LogV: [DL]Indication for CHIPoBLE TX characteristic done (ConnId 0x00, result 0x00)
[00:00:43.551,757] <dbg> chip: LogV: [DL]OpenThread State Changed (Flags: 0x00000200)
[00:00:45.420,989] <dbg> chip: LogV: [DL]Write request received for CHIPoBLE RX characteristic (ConnId 0x00)
[00:00:47.921,661] <dbg> chip: LogV: [DL]Sending indication for CHIPoBLE TX characteristic (ConnId 00, len 3)
[00:00:47.986,022] <dbg> chip: LogV: [DL]Indication for CHIPoBLE TX characteristic done (ConnId 0x00, result 0x00)
[00:00:50.506,195] <dbg> chip: LogV: [DL]Write request received for CHIPoBLE RX characteristic (ConnId 0x00)
[00:00:53.006,835] <dbg> chip: LogV: [DL]Sending indication for CHIPoBLE TX characteristic (ConnId 00, len 3)
[00:00:53.071,228] <dbg> chip: LogV: [DL]Indication for CHIPoBLE TX characteristic done (ConnId 0x00, result 0x00)
[00:00:55.546,386] <dbg> chip: LogV: [DL]Write request received for CHIPoBLE RX characteristic (ConnId 0x00)
[00:00:58.047,027] <dbg> chip: LogV: [DL]Sending indication for CHIPoBLE TX characteristic (ConnId 00, len 3)
[00:00:58.111,419] <dbg> chip: LogV: [DL]Indication for CHIPoBLE TX characteristic done (ConnId 0x00, result 0x00)
[00:00:58.963,745] <inf> chip: [FS]Fail-safe timer expired
[00:00:58.963,836] <err> chip: [SVR]Failsafe timer expired
[00:00:58.964,019] <dbg> chip: LogV: [IN]SecureSession[0x20002d10]: MarkForEviction Type:1 LSID:41506
[00:00:58.964,172] <inf> chip: [SC]SecureSession[0x20002d10]: Moving from state 'kActive' --> 'kPendingEviction'
[00:00:58.964,324] <dbg> chip: LogV: [IN]SecureSession[0x20002d10]: Released - Type:1 LSID:41506
[00:00:58.964,447] <err> chip: [SVR]Commissioning failed (attempt 1): 32
[00:00:58.964,508] <dbg> chip: LogV: [IN]Clearing BLE pending packets.
[00:00:58.964,599] <inf> chip: [BLE]Releasing end point's BLE connection back to application.
[00:00:58.964,691] <inf> chip: [DL]Closing BLE GATT connection (ConnId 00)
[00:00:58.969,573] <dbg> chip: LogV: [IN]SecureSession[0x20002d10]: Allocated Type:1 LSID:41508
[00:00:58.969,665] <dbg> chip: LogV: [SC]Assigned local session key ID 41508
[00:00:58.969,787] <dbg> chip: LogV: [SC]Waiting for PBKDF param request
[00:00:58.969,879] <dbg> chip: LogV: [DL]CHIPoBLE advertising set to on
[00:00:58.970,001] <inf> chip: [DIS]Updating services using commissioning mode 1
[00:00:58.970,123] <dbg> chip: LogV: [DL]Using Thread extended MAC for hostname.
[00:00:58.970,306] <inf> chip: [DIS]Advertise operational node 5BB2BF7451309CA6-000000000000002D
[00:00:58.970,886] <dbg> chip: LogV: [DL]Using Thread extended MAC for hostname.
[00:00:58.972,198] <inf> chip: [DIS]Advertise commission parameter vendorID=65521 productID=32773 discriminator=3840/15 cm=1
[00:00:58.972,869] <inf> chip: [DL]advertising srp service: 8CCEED038B98779A._matterc._udp
[00:00:58.973,175] <err> chip: [ZCL]OpCreds: Got FailSafeTimerExpired
[00:00:58.973,297] <err> chip: [ZCL]OpCreds: Proceeding to FailSafeCleanup on fail-safe expiry!
[00:00:58.973,358] <dbg> chip: LogV: [IN]Expiring all sessions for fabric 0x1!!
[00:00:58.973,480] <err> chip: [FP]Reverting pending fabric data for fabric 0x1
[00:00:58.973,968] <dbg> chip: LogV: [EVL]LogEvent event number: 0x0000000000000003 priority: 1, endpoint id:  0x0 cluster id: 0x0000_0028 event id: 0x2 Sys timestamp: 0x000000000000E65D
[00:00:58.974,090] <dbg> chip: LogV: [DMG]All ReadHandler-s are clean, clear GlobalDirtySet
[00:00:58.975,311] <err> chip: [FP]Warning: metadata not found during delete of fabric 0x1
[00:00:59.071,594] <inf> chip: [FP]Fabric (0x1) deleted.
[00:00:59.071,685] <inf> chip: [ZCL]UserLabel: Last Fabric index 0x1 was removed
[00:00:59.072,906] <inf> chip: [ZCL]OpCreds: Fabric index 0x1 was removed
[00:00:59.074,096] <dbg> chip: LogV: [DMG]Endpoint 0, Cluster 0x0000_003E update version to 1a7d844
[00:00:59.074,310] <dbg> chip: LogV: [DMG]Endpoint 0, Cluster 0x0000_003E update version to 1a7d845
[00:00:59.214,141] <inf> chip: [TS]Pending Last Known Good Time: 2023-03-07T14:02:11
[00:00:59.215,393] <inf> chip: [TS]Previous Last Known Good Time: 2023-03-07T14:02:11
[00:00:59.215,484] <inf> chip: [TS]Reverted Last Known Good Time to previous value
[00:00:59.215,942] <dbg> chip: LogV: [EVL]LogEvent event number: 0x0000000000000004 priority: 1, endpoint id:  0x0 cluster id: 0x0000_0028 event id: 0x2 Sys timestamp: 0x000000000000E74F
[00:00:59.216,064] <dbg> chip: LogV: [DMG]All ReadHandler-s are clean, clear GlobalDirtySet
[00:00:59.217,254] <err> chip: [FP]Warning: metadata not found during delete of fabric 0x1
[00:00:59.221,740] <err> chip: [ZCL]OpCreds: failed to delete fabric at index 1: d8
[00:00:59.221,954] <dbg> chip: LogV: [DMG]Endpoint 0, Cluster 0x0000_0030 update version to ac6117b8
[00:00:59.222,045] <dbg> chip: LogV: [ZCL]Failsafe timeout, tell platform driver to revert network credentials.
[00:00:59.222,808] <inf> chip: [NP]Reverting Thread operational dataset
[00:00:59.232,421] <err> chip: [DL]Long dispatch time: 269 ms, for event type 32782
[00:00:59.232,543] <inf> chip: [FS]Fail-safe cleanly disarmed
[00:00:59.234,680] <dbg> chip: LogV: [DL]OpenThread State Changed (Flags: 0x11002046)
[00:00:59.234,741] <dbg> chip: LogV: [DL]   Device Role: DISABLED
[00:00:59.234,802] <dbg> chip: LogV: [DL]   Thread Unicast Addresses:
[00:00:59.234,985] <dbg> chip: LogV: [DL]        fdaa:bbcc:ddee:0:819c:31a3:f1df:2ad9/64 valid preferred
[00:00:59.236,297] <inf> chip: [DL]BLE GAP connection terminated (reason 0x16)
[00:00:59.236,419] <inf> chip: [DL]Current number of connections: 0/1
[00:00:59.239,196] <inf> chip: [DL]CHIPoBLE advertising started
[00:00:59.241,516] <inf> chip: [DL]NFC Tag emulation started
[00:01:29.240,020] <inf> chip: [DL]CHIPoBLE advertising mode changed to slow

I am using the stack v1.0.0

related issues: https://github.com/project-chip/connectedhomeip/issues/18766

About this issue

  • Original URL
  • State: open
  • Created a year ago
  • Comments: 18 (9 by maintainers)

Most upvoted comments

@saramonteiro : I’m not familiar with otbr docker container but i think it add more complication than a normal install.

For the backbone issue, the wlan0 interface is on the raspberry right ? But you run otbr agent in a docker and in the docker i think you don’t have wlan0. But you should have an eth0 interface that is bridged to host interfaces.

You can try to use eth0 instead of wlan0 when you start otbr-agent from docker

Try to do it, if it doesn’t work i will try to run otbr on docker to help

But i think, we need to discuss about it on another issue (probably in openthread repo) because this problem is not related to my issue.

Also check that ip forwarding in ipv6 is enabled on the rpi, i think it can help

Hello @saramonteiro :

Note that this case only happens sometimes, but when it happens it’s when I recreate the thread network, get the dataset and commission a device directly after.

Also when the problem happens, now I’m sure that it is not due to a wrong dataset.

Also, note that “ping failure” can be from many reasons, currently we identify multiple problems:

When the ping fails with timeout:

  1. One possible case is when you re-create the thread networks without using the same prefix. The computers/Phones receive a router-advertisement packet with new prefix. New prefix means new route and new IPv6. Sometimes, when we have multiple routes/prefix, the Android phone use a wrong IPv6 source on the ping, and we get no responses from Thread devices. But when the old IPv6 arrive at expiration (due to expiration of the router advertisement), then the ping works because it start using the good IPv6 as source. We had this problem previously and we solved it by using the same prefix every time we initialise the thread network.
  2. Another possible case: I don’t know and this is what we try to understand in this issue

When the ping fails with “network unreachable” errors:

It means that there is no route to access the device.

The Android phone or computer running chip-tool is supposed to create a route in the routing table to access the thread network when he receive a router-advertisement.

The icmp-router-advertisement can be send by otbr-agent from multiple situations:

  • When the thread network is created
  • Periodically (because a RA have an expiration time, so it must be sent periodically)
  • When otbr-agent receive a imcp-router-solicitation packet (usually when the phone connects to a WiFi networks it automatically send a icmp-rs packet)

Sometimes on Android phones, the router advertisement packets are simply not received or processed. We found many issues about it that suggest it’s due to power saving reasons in WiFi chipsets of android phones.


Note: We use the ping because it helps to understand why the commissioning fails…

In your case you can’t ping the device from outside the container (computer/phone that runs chiptool ?) If yes, what kind of errors did you get when you ping ?

  • Timeout ?
  • Network unreachable/No route to host ?

Also, I don’t have these lines in otbr-agent, because we don’t have ipset feature in our system, so we disable OT_FIREWALL. We manage the firewall from outside.