zephyr: zperf ping timeout on beagleconnect_freedom over subg
Describe the bug
SubG performance on the BeagleConnect Freedom is extremely unreliable and slow. I first encountered this while implementing support for DNS-SD PTR queries. None of the messages were received if multiple nodes responded simultaneously. However, it worked fine if one of the nodes was delayed.
I created a simple server and client to test reliability and performance, but I have no idea what to make of the results.
Most of the time, I get the following:
*** Booting Zephyr OS build zephyr-v3.5.0-5399-g8acca664d29f ***
[00:00:00.165,313] <inf> udp_server: UDP server started
[00:00:21.538,726] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.554,901] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.554,931] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.579,193] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.579,254] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.579,284] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.603,454] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.603,515] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.603,546] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.603,576] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.670,135] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.670,196] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.670,257] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.670,288] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.670,349] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.688,598] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.688,659] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.688,690] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.688,751] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.688,781] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
[00:00:21.688,842] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet
As can be seen in the server config, I have set the buffer counts to 64, which I think should be more than sufficient since packets are sent at 50 ms intervals.
Once in a blue moon (1 out of 30 runs), I actually receive all 100 packets, which suggests that the programs themselves can work.
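One way to make sense of the warnings is to group them by timestamp and see whether they arrive in bursts. The following is a minimal host-side sketch, assuming the log format shown above; the helper names `timestamp_us` and `group_bursts` and the 1 ms gap threshold are hypothetical choices for illustration, not part of Zephyr.

```python
import re

# Matches the timestamp format seen in the log above, e.g. "[00:00:21.538,726]".
TIMESTAMP = re.compile(r"\[(\d+):(\d+):(\d+)\.(\d+),(\d+)\]")


def timestamp_us(line):
    """Return the log timestamp in microseconds, or None if the line has none."""
    m = TIMESTAMP.search(line)
    if m is None:
        return None
    hours, minutes, seconds, millis, micros = (int(g) for g in m.groups())
    total_seconds = (hours * 60 + minutes) * 60 + seconds
    return total_seconds * 1_000_000 + millis * 1000 + micros


def group_bursts(lines, gap_us=1000):
    """Group "Cannot allocate packet" warnings spaced less than gap_us apart."""
    bursts = []
    for line in lines:
        if "Cannot allocate packet" not in line:
            continue
        t = timestamp_us(line)
        if t is None:
            continue
        if bursts and t - bursts[-1][-1] < gap_us:
            bursts[-1].append(t)  # close to the previous warning: same burst
        else:
            bursts.append([t])    # otherwise start a new burst
    return bursts


# The first six warnings from the log above.
sample = [
    "[00:00:21.538,726] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet",
    "[00:00:21.554,901] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet",
    "[00:00:21.554,931] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet",
    "[00:00:21.579,193] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet",
    "[00:00:21.579,254] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet",
    "[00:00:21.579,284] <wrn> ieee802154_cc13xx_cc26xx_subg: Cannot allocate packet",
]
sizes = [len(b) for b in group_bursts(sample)]
print(sizes)  # [1, 2, 3]
```

Applied to the full log above, the same grouping gives bursts of 1, 2, 3, 4, 5 and 6 warnings, each burst spanning well under a millisecond. The script says nothing about the cause, but the pattern may be worth comparing against the client's send schedule.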
- Target: beagleconnect_freedom
To Reproduce
Steps to reproduce the behavior:
- Build and flash server to one beagleconnect freedom
- Build and flash client to the other beagleconnect freedom
- Open serial connection.
Expected behavior
All packets to be received by the server.
Environment (please complete the following information):
- OS: Linux
- Toolchain: Zephyr SDK 0.16.5
- Zephyr Commit SHA: 8acca664d29fe5c22fe0bbcf704ea9b242982964
Additional context
I will check whether I missed something when I added network support in https://github.com/zephyrproject-rtos/zephyr/pull/65048 .
About this issue
- State: open
- Created 5 months ago
- Comments: 20 (19 by maintainers)
@fgrandel Thanks for the explanation. Switching to ALOHA seems to get me around 56 kbps in the zperf test. Not completely sure how that will play with greybus though, so let’s see.
It has been fixed by https://github.com/zephyrproject-rtos/zephyr/pull/69098
Hi @Ayush1325 !
Sorry for chiming in only now; I was ill and couldn’t read my messages earlier.
The commit you point to fixes a bug in the CC13/26xx SubG driver which announced a “hard MAC” CSMA/CA capability which it didn’t implement. The change enables the L2 stack’s “soft MAC” CSMA/CA implementation instead.
What you’re observing might just be standard-conforming random CSMA/CA backoffs that intentionally reduce throughput to improve co-existence. This can be “fixed” by switching CSMA/CA off. The fact that some packets are delivered without additional latency may simply be because CSMA/CA randomly chooses not to back off for some transmissions.
I did extensive regression testing on that change, see https://github.com/zephyrproject-rtos/zephyr/pull/58439 and above all https://github.com/zephyrproject-rtos/zephyr/pull/58439#issuecomment-1585746656.
There you’ll also find an in-depth explanation and demo of the increased latencies due to CSMA/CA as well as sample configurations with CSMA/CA turned off that prove no regression in throughput.
Not sure why you see packet drops, though? Maybe some timeout that does not take the potential maximum backoff into account?
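To get a feel for how large the maximum backoff can become, here is a minimal sketch of the unslotted CSMA/CA backoff windows. It assumes the IEEE 802.15.4 default parameters (macMinBE = 3, macMaxBE = 5, macMaxCSMABackoffs = 4); the values actually configured in a given build may differ. The function names are hypothetical, and the duration of one unit backoff period depends on the PHY, so the caller has to supply it.

```python
def backoff_window_sizes(min_be=3, max_be=5, max_backoffs=4):
    """Largest possible backoff, in unit backoff periods, for each CCA attempt.

    Each attempt waits a random number of periods in [0, 2**BE - 1].
    BE starts at min_be and grows by one after every busy channel
    assessment, capped at max_be.
    """
    windows = []
    be = min_be
    for _ in range(max_backoffs + 1):  # first attempt plus max_backoffs retries
        windows.append(2 ** be - 1)
        be = min(be + 1, max_be)
    return windows


def worst_case_backoff_periods(min_be=3, max_be=5, max_backoffs=4):
    """Upper bound on the total backoff if every attempt draws the maximum."""
    return sum(backoff_window_sizes(min_be, max_be, max_backoffs))


def worst_case_backoff_us(unit_backoff_period_us, **params):
    """Convert the bound to microseconds for a caller-supplied period length."""
    return worst_case_backoff_periods(**params) * unit_backoff_period_us


print(backoff_window_sizes())        # [7, 15, 31, 31, 31]
print(worst_case_backoff_periods())  # 115
```

This bound covers only the random backoff delays of a single transmission attempt. It does not include the time spent on channel assessment, the frame's airtime, or retransmissions after a missing acknowledgement, so any timeout derived from it would need extra margin.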
Please let me know if I’m misinterpreting something here. I didn’t have time to look deeply into all prior comments.
Hope this helps?