rmw_fastrtps: FastRTPS 1.8.0 causes hangs in Navigation2
Bug report
Required Info:
- Operating System:
  - Ubuntu 18.04
- Installation type:
  - Source
- Version or commit hash:
  - 0.7.2
- DDS implementation:
  - Fast-RTPS master branch (1.8.0)
- Client library (if applicable):
  - rclcpp
Steps to reproduce issue
Currently I have to run our Navigation2 system test to reproduce this; I’m trying to find a simpler example. What I see is that when I run our system test with the latest versions (master branches) of rmw_fastrtps (0.7.2) and Fast-RTPS (1.8.0), the test hangs and times out. When I run with the previous versions (0.7.1 and 1.7.2, respectively), things work fine. Things also work fine if I run with RMW_IMPLEMENTATION=rmw_opensplice_cpp.
I haven’t been able to isolate the problem. I can provide instructions for how to reproduce using the Nav2 system test if desired.
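For comparing RMW implementations side by side, here is a minimal Python sketch that runs one command with `RMW_IMPLEMENTATION` set and reports whether it passed, failed, or hung. The helper name `run_with_rmw` and the timeout handling are illustrative assumptions, not part of any ROS 2 tooling.

```python
import os
import subprocess


def run_with_rmw(cmd, rmw, timeout_s):
    """Run cmd with RMW_IMPLEMENTATION=rmw and classify the outcome.

    Hypothetical helper: returns "passed" (exit code 0), "failed"
    (non-zero exit code), or "hung" (no exit within timeout_s seconds).
    """
    # Copy the current environment and override only the RMW selection.
    env = dict(os.environ, RMW_IMPLEMENTATION=rmw)
    try:
        result = subprocess.run(
            cmd,
            env=env,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child on timeout; processes that
        # the child spawned itself may need separate cleanup.
        return "hung"
    return "passed" if result.returncode == 0 else "failed"
```

Calling it once with `"rmw_fastrtps_cpp"` and once with `"rmw_opensplice_cpp"` on the same system test command, with a timeout comfortably above the normal run time, would show whether the hang follows the RMW implementation.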
I see AMCL is stuck waiting for data on the /scan topic, but when I run ros2 topic hz /scan I can see that the scan topic is being published correctly by Gazebo. So it looks like AMCL’s subscription callback is not being executed. I’m not sure how to debug that, but I’m pretty sure the problem is in the rmw layer.
If anyone can offer help or suggestions on what to look at to debug this, beyond confirming that the topics are being published, I’d appreciate it.
This is high priority as it is blocking our CI. We won’t be able to release for Dashing in this state.
About this issue
- State: closed
- Created 5 years ago
- Comments: 29 (18 by maintainers)
@MiguelCompany - I tested the change you suggested above and it definitely helps. I submitted a PR for it. Thanks for helping with that. I think the changes you made above also helped (eProsima/Fast-RTPS#541).
Let me see whether, between those changes and this PR, we get our CI to pass again, and then I’ll close this ticket.
We found the issue. It was related to a change necessary for the implementation of the lifespan QoS. A fix is on the way in eProsima/Fast-RTPS#541, a new blackbox test is being added in eProsima/Fast-RTPS#542, and a new unit test is under development.
I just tried the following and can confirm the hang with FastRTPS:
- install the Dashing binary packages (ros-dashing-*)
- source /opt/ros/dashing/setup.bash
- clone ros-planning/navigation2, ros-simulation/gazebo_ros_pkgs@ros2 (will be released soon) and BehaviorTree/BehaviorTree.CPP@ros2 (since its release failed to build) into a workspace ws
- colcon build in that workspace
- source ws/install/setup.bash
- ctest -V -R test_localization hangs…

@richiprosima Your insight might be helpful on this ticket. This is using the latest commit from the master branch of FastRTPS.