ros2cli: ros2 topic info -v prints incorrect QoS info for macOS + CycloneDDS
Bug report
Required Info:
- Operating System:
  - macOS 10.14 Mojave
- Installation type:
  - Fat binary from https://ci.ros2.org/view/packaging/
- Version or commit hash:
  - Foxy prerelease
- DDS implementation:
  - CycloneDDS
- Client library (if applicable):
  - N/A
Steps to reproduce issue
ros2 topic info -v does not print correct QoS information for a topic.
➜ ~ ros2 topic pub /talker --qos-durability volatile std_msgs/String "data: Hello World volatile"
➜ ~ ros2 topic info -v /talker
Type: std_msgs/msg/String
Publisher count: 1
Node name: _CREATED_BY_BARE_DDS_APP_
Node namespace: _CREATED_BY_BARE_DDS_APP_
Topic type: std_msgs/msg/String
Endpoint type: PUBLISHER
GID: a3.6b.10.01.32.8a.d9.a4.bb.0c.3b.46.00.00.08.03.00.00.00.00.00.00.00.00
QoS profile:
Reliability: RMW_QOS_POLICY_RELIABILITY_RELIABLE
Durability: RMW_QOS_POLICY_DURABILITY_TRANSIENT_LOCAL
Lifespan: 2147483651294967295 nanoseconds
Deadline: 2147483651294967295 nanoseconds
Liveliness: RMW_QOS_POLICY_LIVELINESS_AUTOMATIC
Liveliness lease duration: 2147483651294967295 nanoseconds
Subscription count: 0
Expected behavior
Durability value printed should be: Durability: RMW_QOS_POLICY_DURABILITY_VOLATILE
Actual behavior
Durability value printed is: Durability: RMW_QOS_POLICY_DURABILITY_TRANSIENT_LOCAL
Additional information
These commands appear to work as expected on Linux and Windows for CycloneDDS.
About this issue
- State: closed
- Created 4 years ago
- Comments: 38 (16 by maintainers)
@clalancette
Anyway, I think it’s time to pull eProsima in. I have created an issue on Fast-DDS.
@iuhilnehc-ynos perhaps you need not do what I asked just now: after I published the comment, I realised that I could try this particular experiment of mixing a Cyclone DDS-based publisher with a Fast-RTPS-based ros2 cli. I get exactly the same result. That’s good news.
I’ve checked with Wireshark:
Cyclone only publishes QoS settings that are different from the default. In section 9.6.2.2.1 the DDSI-RTPS specification states that for parameters (e.g., durability QoS, a.k.a. PID_DURABILITY) missing from a discovery message, the default should be applied, and then references the DDS specification for the default, which is volatile per section 2.1.3 of the DCPS spec. So Cyclone DDS is entirely correct, and Fast-RTPS or its RMW layer is misinterpreting the discovery data.
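The rule that a receiver applies defaults for omitted parameters can be sketched roughly as follows. This is only an illustration, not how Cyclone DDS or Fast-RTPS implement it: the dictionary keys and the `effective_qos` helper are hypothetical, and only the durability default (volatile) is taken from the discussion above.

```python
# Sketch only: a simplified stand-in for discovery data, not a real RTPS parser.
# The only default listed is the one cited in the thread (durability = volatile).
DEFAULT_QOS = {"durability": "VOLATILE"}


def effective_qos(received: dict) -> dict:
    """Return the QoS a receiver should assume for a remote endpoint.

    Parameters missing from the discovery message fall back to the
    spec default; parameters that are present override it.
    """
    qos = dict(DEFAULT_QOS)  # start from the defaults
    qos.update(received)     # apply whatever was actually sent
    return qos


# A writer with default durability sends no durability parameter at all.
print(effective_qos({})["durability"])
# A writer with non-default durability sends it explicitly.
print(effective_qos({"durability": "TRANSIENT_LOCAL"})["durability"])
```

A receiver that skips the "start from the defaults" step and reports some other fallback value would show the kind of mismatch seen in this issue.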
I suspect that when @fujitatomoya reproduced it, the ros2 daemon was running using Fast-RTPS. That one then stores the incorrect discovery data and regurgitates it to “ros2 topic info” even if that one is running Cyclone. Restarting the machine/container could easily have resulted in using Cyclone DDS for the daemon.
And if instead you use Cyclone DDS for the daemon, even Fast-RTPS gets it right:
@fujitatomoya
I have confirmed all these issues will be fixed in macOS after https://github.com/eProsima/Fast-DDS/pull/1384 is merged. (https://github.com/eProsima/Fast-DDS/pull/1382 is already merged.)
@iuhilnehc-ynos
This is an interesting one. Quoting from section 9.6.2.2 in the DDSI-RTPS spec:
The participant GUID is entirely redundant because it is always the same as the reader/writer GUID with the entity id component replaced by 0x1c1. So in my view the normative text says the participant GUID should be omitted and the referenced table is therefore simply an incomplete list of “forbidden” items.
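To illustrate the redundancy, here is a minimal sketch that derives the participant GUID from a writer GUID by replacing the entity id with 0x1c1. It assumes a 16-byte GUID laid out as a 12-byte prefix followed by a 4-byte entity id; the `participant_guid` helper is hypothetical, and the sample writer GUID is the first 16 bytes of the GID printed in the report.

```python
# Sketch only: assumes GUID = 12-byte prefix + 4-byte entity id.
PARTICIPANT_ENTITY_ID = bytes.fromhex("000001c1")  # entity id 0x1c1


def participant_guid(endpoint_guid: bytes) -> bytes:
    """Derive the participant GUID from a reader/writer GUID."""
    if len(endpoint_guid) != 16:
        raise ValueError("expected a 16-byte GUID")
    # Keep the prefix, swap the entity id for the participant's.
    return endpoint_guid[:12] + PARTICIPANT_ENTITY_ID


# First 16 bytes of the GID shown by `ros2 topic info -v` in the report.
writer_guid = bytes.fromhex("a36b1001328ad9a4bb0c3b4600000803")
print(participant_guid(writer_guid).hex("."))
```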
I guess one could hold a different opinion and argue that the intent behind that paragraph is that only those listed in table 9.10 should be left. But the fact of the matter is that neither OpenSplice nor Cyclone DDS has ever published it in the reader/writer information for a decade without it causing any interoperability issues, and I have furthermore checked some packet captures related to investigating interoperability issues in that decade that show Connext and CoreDX also leave it out. (That also means I would expect the same all-zero GUID to show when using Connext instead of Cyclone in this experiment.)
Finally, if eProsima is of the opinion that the participant GUID must be present in the reader/writer information, they should reject the discovery data as invalid (maybe I should try fuzzing it?) and not use a nonsensical default value instead.
When the _CREATED_BY_BARE_DDS_APP_ problem is observed, RTPSParticipantKey() in process_discovery_info is all zero (confirmed). It should be something like rmw_gid_t 1 64 16 1 -46 123 40 76 -73 127 -106 -118 0 0 1 -63 0 0 0 0 0 0 0 0, which is the same rmw_gid_t as node_listener msg.gid.data. The question is why. @MiguelCompany could you kindly share your thoughts? This is really getting into DDS implementation details.
No worries, @iuhilnehc-ynos, there’re just too many details to be on top of them all.
Yes, that totally explains it
It is this paragraph:
(Page 169 of DDSI-RTPS 2.3, emphasis mine)
Table 9.11 then says “see DDS specification”, which is the big table in the DDS 1.4 spec starting on page 2-104.
It is because it isn’t actually set to the default value: the reliability QoS is a pair of kind (best-effort or reliable) and a “max blocking time” that says for how long the writer should block when resource limits prevent it from completing the write operation. The DDS default for “max blocking time” is 100ms, but the Cyclone RMW layer sets it to ∞ and that’s the reason it is included.
I now see that Wireshark doesn’t print the “max blocking time” in the table. It doesn’t affect reader/writer matching, so in that respect it is not very important, but it is a required part of the wire format so it is kinda odd that Wireshark omits it. (Required because the DDSI-RTPS spec says the type of the “reliability” parameter in the discovery is ReliabilityQosPolicy, in turn defined in the DDS spec, although with a footnote that “The encoding of DDS::ReliabilityQoSPolicy::kind is defined by RTPS::ReliabilityKind_t (9.3.2)”.) It’s the “ff ff ff 7f ff ff ff ff” bit in the highlighted bytes.
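For anyone wanting to check those eight bytes by hand, here is a small sketch of decoding them. It assumes a little-endian encoding and a duration made of a signed 32-bit seconds field followed by an unsigned 32-bit field. Reading the second field as nanoseconds is also an assumption, made because it reproduces the 2147483651294967295 nanoseconds printed for the infinite durations in the report.

```python
import struct

# The highlighted bytes from the packet capture: the "infinite" duration.
raw = bytes.fromhex("ff ff ff 7f ff ff ff ff")

# Assumed layout: little-endian int32 seconds, then uint32 second field.
seconds, second_field = struct.unpack("<iI", raw)

# Assumption: treat the second field as nanoseconds when flattening.
total_ns = seconds * 10**9 + second_field

print(seconds)       # 2147483647 (0x7fffffff)
print(second_field)  # 4294967295 (0xffffffff)
print(total_ns)      # 2147483651294967295
```

So the odd-looking number in the Lifespan, Deadline and Liveliness lease duration lines is just "infinity" flattened into a nanosecond count.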
@fujitatomoya
Yes, I can reproduce this issue with your steps. (I agree we can use a different RMW to get information from others.) I can also use the following steps to reproduce it in a container.
Foolish me … trying to be helpful giving copy-paste-ready lines and then forgetting the export CYCLONEDDS_URI part …
Hopefully you can find one without much effort 🤞