rmw_fastrtps: insufficient performance in the QoS demo using default parameters

The scope of this ticket is the performance of the QoS demo with default values, between a publisher and a subscriber using the same RMW implementation. The cross-vendor results are only mentioned for completeness / context.

To reproduce, run the simplest example from the QoS demo:

  • ros2 run image_tools cam2image -b
  • ros2 run image_tools showimage

Which means:

  • reliable (default)
  • queue depth: 10 (default)
  • publish frequency in Hz: 30 (default)
  • history QoS setting: keep all the samples (default)

The following results show significant differences in the performance depending on which RMW implementation is chosen on the publisher and subscriber side (collected with the default branches on Ubuntu 16.04 on a Lenovo P50). Only the diagonal highlighted with “quality” colors is of interest for now:

| Sub \ Pub | FastRTPS | Connext | OpenSplice |
| --- | --- | --- | --- |
| FastRTPS | #ffa000 Little stuttering, not smooth | Flawless | Severe increasing lag, multi-seconds within seconds |
| Connext | One second burst, short hang, repeat | #00ff00 Flawless | One second burst, short hang, repeat |
| OpenSplice | Flawless | Stuttering | #00ff00 Flawless |

When increasing the image size to -x 640 -y 480 the problems become even more apparent:

| Sub \ Pub | FastRTPS | Connext | OpenSplice |
| --- | --- | --- | --- |
| FastRTPS | #f03c15 Much more stuttering | Hangs for several seconds between bursts | Severe increasing lag, multi-seconds within seconds |
| Connext | One second burst, short hang, repeat | #00ff00 Flawless | One second burst, short hang, repeat |
| OpenSplice | Flawless | Smooth but significantly reduced framerate (even on publisher side) | #00ff00 Flawless |
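For anyone re-running the matrix above, here is a minimal sketch that only prints the command lines for each subscriber/publisher combination. The report does not say how the RMW implementation was selected for each process. This sketch assumes it is done through the `RMW_IMPLEMENTATION` environment variable with the package names `rmw_fastrtps_cpp`, `rmw_connext_cpp` and `rmw_opensplice_cpp`. The helper names are made up for illustration, and only the `-b`, `-x` and `-y` flags from this report are used.

```python
import itertools
import shlex
from typing import List, Optional, Tuple

# Assumption: vendor label -> RMW package name, selected via RMW_IMPLEMENTATION.
# The report itself does not state how the vendors were switched.
RMW_PACKAGES = {
    "FastRTPS": "rmw_fastrtps_cpp",
    "Connext": "rmw_connext_cpp",
    "OpenSplice": "rmw_opensplice_cpp",
}


def publisher_argv(width: Optional[int] = None, height: Optional[int] = None) -> List[str]:
    """Publisher command from the report; size flags are added only when given."""
    argv = ["ros2", "run", "image_tools", "cam2image", "-b"]
    if width is not None:
        argv += ["-x", str(width)]
    if height is not None:
        argv += ["-y", str(height)]
    return argv


def subscriber_argv() -> List[str]:
    """Subscriber command from the report, all defaults."""
    return ["ros2", "run", "image_tools", "showimage"]


def shell_line(vendor: str, argv: List[str]) -> str:
    """Prefix a command with the (assumed) RMW selection variable."""
    return f"RMW_IMPLEMENTATION={RMW_PACKAGES[vendor]} {shlex.join(argv)}"


def command_matrix(
    width: Optional[int] = None, height: Optional[int] = None
) -> List[Tuple[str, str, str, str]]:
    """Return (sub vendor, pub vendor, subscriber line, publisher line) per table cell."""
    return [
        (sub, pub, shell_line(sub, subscriber_argv()), shell_line(pub, publisher_argv(width, height)))
        for sub, pub in itertools.product(RMW_PACKAGES, repeat=2)
    ]


if __name__ == "__main__":
    for sub, pub, sub_line, pub_line in command_matrix(640, 480):
        print(f"# Sub: {sub} / Pub: {pub}")
        print(sub_line)
        print(pub_line)
```

Each pair of printed lines is meant to be run in two separate terminals. Calling `command_matrix()` without arguments gives the default-size case from the first table.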

The acceptance criteria to resolve this ticket are:

  • without changing the demo itself or the QoS settings
  • the FastRTPS publisher and subscriber should demonstrate “good” performance, comparable to that of the other vendors.

PS: please don’t suggest variations of the QoS parameters in this thread but keep this ticket focused on this very specific case. Other cases can be considered separately.

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 26 (22 by maintainers)

Most upvoted comments

@dirk-thomas, @miguelcompany and I have been working on a new release. We are integrating all the new features, and also this patch, in our internal develop branch. I want to release it to master tomorrow (Wednesday). Sorry for the inconvenience.

Thanks @richiware for looking into it. The behavior on the hotfix branch is much better indeed!

Could you briefly describe what tradeoffs are being made by using the new timer values?

Results on my XPS15 using the new branch:

| | Min | Max | Mean | Median | Stddev | fps |
| --- | --- | --- | --- | --- | --- | --- |
| connext | 0.0300700665 | 0.0410900116 | 0.0318081648 | 0.0316700935 | 0.0010073107 | 30.2290620882 |
| opensplice | 0.0100297928 | 0.0511300564 | 0.0221883707 | 0.019359827 | 0.0093266318 | 30.261684248 |
| fastrtps | 0.002010107 | 0.1774599552 | 0.0352770948 | 0.0078701973 | 0.0464236037 | 30.6302342667 |
| fastrtps_hotfix | 0.001853466 | 0.0287704468 | 0.0074550388 | 0.0068049431 | 0.0458932319 | 30.2791350882 |

[Image: image_demo_latency_hotfix_comparison]

With the hotfix, and for this test, Fast-RTPS has lower latency than the other rmw implementations using default values: [Image: image_demo_latency_hotfix]

Note: I haven’t tested other frequencies or message sizes.
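For reference, a rough sketch of how summary columns like the ones in these result tables could be computed from per-frame timestamps. The thread does not say how the numbers were actually collected. The sketch assumes one send and one receive timestamp per frame, both in seconds on the same clock, takes latency as their difference, and takes fps as the number of received frame intervals divided by the elapsed receive time. `latency_summary` is a hypothetical helper, not part of image_tools.

```python
import statistics
from typing import Dict, Sequence


def latency_summary(send_times: Sequence[float], recv_times: Sequence[float]) -> Dict[str, float]:
    """Summarize per-frame latency (recv - send) and the received frame rate.

    Assumes both sequences are in seconds, on the same clock, in frame order.
    """
    if len(send_times) != len(recv_times):
        raise ValueError("need exactly one receive timestamp per sent frame")
    if len(recv_times) < 2:
        raise ValueError("need at least two frames to compute stddev and fps")

    latencies = [recv - send for send, recv in zip(send_times, recv_times)]
    elapsed = recv_times[-1] - recv_times[0]
    if elapsed <= 0:
        raise ValueError("receive timestamps must be increasing")

    return {
        "min": min(latencies),
        "max": max(latencies),
        "mean": statistics.fmean(latencies),
        "median": statistics.median(latencies),
        "stddev": statistics.stdev(latencies),  # sample standard deviation
        "fps": (len(recv_times) - 1) / elapsed,  # frame intervals per second
    }


if __name__ == "__main__":
    # Made-up timestamps, for illustration only.
    summary = latency_summary([0.0, 1.0, 2.0], [0.25, 1.5, 2.25])
    for name, value in summary.items():
        print(f"{name}: {value:.4f}")
```

A large gap between mean and median, as in the fastrtps row, is the kind of thing this summary makes visible: a few very late frames pull the mean and stddev up while the median stays low.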

Result of the 640x480 test on the 3 rmw implementations with the following setup: Dell XPS15, Ubuntu 16.04, send/receive 400 images, image rendering enabled

ros2 run image_tools cam2image -b -x 640 -y 480

640x480:

| | Min | Max | Mean | Median | Stddev | fps | sender CPU | receiver CPU |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| connext | 0.0300738811 | 0.041087389 | 0.0318081552 | 0.0316724777 | 0.0010073057 | 30.2290740409 | 50.00% | 50.00% |
| opensplice | 0.010027647 | 0.051132679 | 0.022188526 | 0.019367218 | 0.0093266182 | 30.2616755364 | 36.00% | 31.00% |
| fastrtps | 0.0020124912 | 0.1774580479 | 0.0352770768 | 0.0078625679 | 0.0464236557 | 30.6302253416 | 22.00% | 33.00% |

Serious stuttering with FastRTPS, very smooth for the other ones. Below is a graph of the latency for each rmw_implementation.

[Image: latency graph for each rmw_implementation]