rmw_fastrtps: Exception sending message over network
Bug report
Required Info:
- Operating System:
- Ubuntu 18.04
- Installation type:
- source
- Version or commit hash:
- DDS implementation:
- Fast RTPS, I think (I did not explicitly choose one)
- Client library (if applicable):
- rclcpp
Steps to reproduce issue
I’m publishing odometry data from a differential-drive robot, using mostly turtlebot3 as an example. Many messages are published without problems; then, at a random point (anywhere from about 5 seconds to more than 200 seconds in), publishing throws an exception.
Expected behavior
No exceptions!
Actual behavior
Publishing throws an exception and the process terminates:
terminate called after throwing an instance of 'rclcpp::exceptions::RCLError'
what(): failed to publish message: cannot publish data, at /home/rosbert/ros2_ws/src/ros2/rmw_fastrtps/rmw_fastrtps_shared_cpp/src/rmw_publish.cpp:52, at /home/rosbert/ros2_ws/src/ros2/rcl/rcl/src/rcl/publisher.c:257
Additional information
48:   rmw_fastrtps_shared_cpp::SerializedData data;
49:   data.is_cdr_buffer = false;
50:   data.data = const_cast<void *>(ros_message);
51:   if (!info->publisher_->write(&data)) {
52:     RMW_SET_ERROR_MSG("cannot publish data");
53:     return RMW_RET_ERROR;
54:   }
....
rcl_ret_t
rcl_publish(
  const rcl_publisher_t * publisher,
  const void * ros_message,
  rmw_publisher_allocation_t * allocation)
{
  if (!rcl_publisher_is_valid(publisher)) {
    return RCL_RET_PUBLISHER_INVALID;  // error already set
  }
  RCL_CHECK_ARGUMENT_FOR_NULL(ros_message, RCL_RET_INVALID_ARGUMENT);
  if (rmw_publish(publisher->impl->rmw_handle, ros_message, allocation) != RMW_RET_OK) {
257:     RCL_SET_ERROR_MSG(rmw_get_error_string().str);
    return RCL_RET_ERROR;
  }
  return RCL_RET_OK;
}
About this issue
- State: closed
- Created 5 years ago
- Comments: 24 (14 by maintainers)
I was able to reproduce the issue using your commands:
The publication fails because it exceeds the max_blocking_time QoS (100 milliseconds by default). We will investigate why it stays blocked for so long. Thanks for your help.
Until now the fix was available only in 1.9.x. I’ve created eProsima/Fast-RTPS#730 with the backport to 1.8.x. By default STRICT_REALTIME will be OFF. I hope this helps you.
@ivanpauno Currently there is no way to know whether a write failed because of a timeout; this is still the old API. We are working on providing the standard DDS PSM C++ API (working branch), which covers that case.
@quhezheng Which version of Fast RTPS are you using? v1.8.x? We can backport eProsima/Fast-RTPS#718 to v1.8.x.
@richiware Many thanks. I went back to 1.7.0, where there was no locking operation. I don’t understand what the lock is protecting. Can we remove the lock operation from our local code base as a short-term measure? What would the consequences of removing it be?
We are running ROS 2 on QNX on an NVIDIA Xavier. Too many processes are eating up the CPU, and the exception is killing us. Could the CPU load be what triggers the timeout? We need a short-term solution to meet our product deadline.
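For anyone needing an application-side stopgap while the library fix lands, one option is to catch the exception instead of letting it reach terminate. Below is a minimal sketch, not an official recommendation. `publish_with_retry` is a hypothetical helper (it is not part of rclcpp), and it assumes that rclcpp::exceptions::RCLError derives from std::runtime_error, which is how I read rclcpp's exceptions header. The publish call itself is passed in as a callable so the sketch stays self-contained.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <stdexcept>
#include <thread>

// Hypothetical helper, NOT part of rclcpp.
// Calls `publish_once` and retries when it throws std::runtime_error
// (assumed here to be a base class of rclcpp::exceptions::RCLError).
// Returns true on the first successful call, false once `max_attempts`
// attempts have all thrown.
inline bool publish_with_retry(
  const std::function<void()> & publish_once,
  int max_attempts,
  std::chrono::milliseconds backoff)
{
  assert(max_attempts > 0);
  for (int attempt = 1; attempt <= max_attempts; ++attempt) {
    try {
      publish_once();
      return true;
    } catch (const std::runtime_error &) {
      // The write failed (for example, it timed out). Wait briefly before
      // the next attempt, but do not sleep after the final one.
      if (attempt < max_attempts) {
        std::this_thread::sleep_for(backoff);
      }
    }
  }
  return false;
}
```

In a node this would wrap the real call, e.g. a lambda that invokes publish(msg) on the odometry publisher. For odometry, dropping a sample when the function returns false is probably acceptable, but that is a judgment call for your application. This only hides the symptom; it does not explain why the writer blocks.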
Yes, the current version 1.9.0 uses the reliability QoS max_blocking_time to maintain strict real-time behaviour, for both reliable and best-effort publishers. Certainly not all users want this feature, so we met and decided to make it configurable. This will be released in v1.9.1 via eProsima/Fast-RTPS#718. If strict real-time is not configured, the behaviour of max_blocking_time will be the one defined by the DDS standard.
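To make the strict real-time behaviour more concrete, here is a minimal C++ sketch of a write path that gives up when it cannot take an internal lock within max_blocking_time. `SketchWriter`, `busy_mutex` and `write_from_other_thread` are hypothetical names made up for illustration. This is an analogy built on std::timed_mutex, not Fast-RTPS source code, and the real library may differ in every detail.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a DDS writer; NOT Fast-RTPS code.
class SketchWriter
{
public:
  SketchWriter(bool strict_realtime, std::chrono::milliseconds max_blocking_time)
  : strict_realtime_(strict_realtime), max_blocking_time_(max_blocking_time)
  {
    assert(max_blocking_time.count() >= 0);
  }

  // Stores `sample` under the lock. In strict mode, returns false if the
  // lock cannot be taken within max_blocking_time; in relaxed mode, waits
  // for the lock as long as necessary.
  bool write(int sample)
  {
    if (strict_realtime_) {
      std::unique_lock<std::timed_mutex> lock(mutex_, std::defer_lock);
      if (!lock.try_lock_for(max_blocking_time_)) {
        return false;  // the caller would report "cannot publish data"
      }
      last_sample_ = sample;
      return true;
    }
    std::lock_guard<std::timed_mutex> lock(mutex_);
    last_sample_ = sample;
    return true;
  }

  // Exposed only so a demo can simulate another thread holding the lock.
  std::timed_mutex & busy_mutex() {return mutex_;}

  int last_sample() const {return last_sample_;}

private:
  bool strict_realtime_;
  std::chrono::milliseconds max_blocking_time_;
  std::timed_mutex mutex_;
  int last_sample_ = 0;
};

// Demo helper: performs one write from a separate thread, because a thread
// that already owns a std::timed_mutex must not try to lock it again.
inline bool write_from_other_thread(SketchWriter & writer, int sample)
{
  bool ok = false;
  std::thread worker([&] {ok = writer.write(sample);});
  worker.join();
  return ok;
}
```

If the main thread locks busy_mutex() on a strict writer and then calls write_from_other_thread, the write returns false once the timeout elapses and the stored sample is unchanged. A heavily loaded CPU could plausibly produce the same effect when whichever thread holds the lock is not scheduled in time.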