rclpy: `MultiThreadedExecutor:spin_until_future_complete` can block when the future is ready
Bug report
Required Info:
- Operating System: Ubuntu 20.04
- Installation type: From source
- Version or commit hash: Foxy
- DDS implementation: Fast-RTPS
- Client library (if applicable): rclpy
Steps to reproduce the issue
- Publish a message from one node (with latching_qos)
- In another node, subscribe to the topic with a callback that sets the result of the future
- Start both nodes in a MultiThreadedExecutor
- Spin until future complete
import rclpy
from rclpy.executors import MultiThreadedExecutor, SingleThreadedExecutor
from rclpy.node import Node
from std_msgs.msg import String
from rclpy.qos import QoSProfile, QoSDurabilityPolicy
from rclpy.task import Future
import os

latching_qos = QoSProfile(
    depth=1,
    durability=QoSDurabilityPolicy.RMW_QOS_POLICY_DURABILITY_TRANSIENT_LOCAL)


def main():
    rclpy.init()

    # Set up publisher
    pubnode = Node('pubnode_' + str(os.getpid()))
    pub1 = pubnode.create_publisher(String, 'topic1', latching_qos)
    msg1 = String()
    msg1.data = "hello1"
    pubnode.get_logger().info("Publishing hello1")
    pub1.publish(msg1)

    # Set up listener
    future_msgs = Future()
    subnode = Node('subnode_' + str(os.getpid()))
    subnode.create_subscription(String, 'topic1', lambda msg: ([
        subnode.get_logger().info("Received message on topic1"),
        future_msgs.set_result(msg)
    ]), latching_qos)

    # Start nodes
    exe = MultiThreadedExecutor()
    exe.add_node(pubnode)
    exe.add_node(subnode)
    future_msgs.add_done_callback(lambda fut: print("Future is done"))
    exe.spin_until_future_complete(future_msgs)


if __name__ == '__main__':
    main()
Expected behavior
The subnode should receive the message, set the future as complete, and then the program should exit.
Actual behavior
The subnode receives the message, sets the future as complete, but the exe.spin_until_future_complete(future_msgs) never returns.
Additional information
This only happens with the MultiThreadedExecutor. If I swap this out for the SingleThreadedExecutor then it works as expected.
If the pub node is running in a different process, it also works as expected.
I have also asked a question on ROS Answers here.
Here is a workspace that you can clone and run to immediately test this issue. Here is the code for the node defined in that workspace. It is very similar to the code posted in this bug report, but with a few argparse options.
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 17 (6 by maintainers)
As @fujitatomoya explained here, there’s no deadlock. I’ve updated the PR title accordingly.
https://github.com/ros2/rclpy/pull/605 is a proposed fix. @craigh92 @fujitatomoya it would be great if you can confirm that the proposed fix solves the issue in the posted example, thanks!
sure @fujitatomoya I will do so.
IMO, a custom executor should define `spin_impl()` rather than `spin_once()`. `spin_once()` can be a method of Executor that spins just once with a single thread. `spin()` of Executor calls the custom executor's `spin_impl()`, which depends on the implementation.
I may be missing something, I'd like to hear from the others.
This is NOT a deadlock issue; the main thread is just waiting in `rcl_wait`. The `future` does get `set_result` called on it by the callback run through `executor.submit(callback)`, but before the `future` gets `set_result`, the main thread calls `spin_once()` again. It then waits on `rcl_wait` for the next ready event. With this sample program that `rcl_wait` never wakes up, because there is nothing left to do and no timeout is set either.
I am not sure how to fix it, any thoughts?
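The ordering described above can be modeled with the standard library alone. This is a toy sketch, not rclpy code: `toy_spin_until_future_complete`, the `ready` event (standing in for `rcl_wait`) and the `rechecked` gate (which forces the worker to finish only after the main thread has re-checked the future) are all hypothetical names introduced for illustration.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


def toy_spin_until_future_complete(future, timeout_sec):
    """Toy model of the spin loop; returns how many waits timed out."""
    ready = threading.Event()      # stand-in for rcl_wait reporting ready work
    ready.set()                    # the single latched message
    rechecked = threading.Event()  # gate that forces the ordering from the comment
    timed_out_waits = 0
    dispatched = False

    def callback():
        # Finish only after the main thread has re-checked future.done().
        rechecked.wait()
        future.set_result("hello1")

    with ThreadPoolExecutor(max_workers=1) as pool:
        while not future.done():
            if dispatched:
                # Main thread just saw "not done"; the worker may now finish.
                rechecked.set()
            # Stand-in for rcl_wait: with timeout_sec=None this would block
            # forever once the only message has been consumed.
            if ready.wait(timeout_sec):
                ready.clear()
                pool.submit(callback)
                dispatched = True
            else:
                timed_out_waits += 1
    return timed_out_waits
```

Calling `toy_spin_until_future_complete(Future(), timeout_sec=0.05)` returns after at least one timed-out wait, because the timeout is the only thing that lets the loop re-check `future.done()`. Passing `timeout_sec=None` would leave the second wait blocked indefinitely, which mirrors the behavior reported in this issue.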
`MultiThreadedExecutor::spin_once`? `spin_some`?
btw, setting the timeout for `spin_until_future_complete` will avoid this problem.
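A minimal sketch of that workaround, assuming the executor's `spin_until_future_complete` accepts the `timeout_sec` keyword. The helper name `spin_until_done` and its `poll_sec` and `deadline_sec` parameters are hypothetical, introduced only for this example. Spinning in short slices means each wait is bounded, so the loop gets a chance to re-check the future.

```python
import time


def spin_until_done(executor, future, poll_sec=0.1, deadline_sec=None):
    """Spin in short slices until `future` is done; return future.done()."""
    start = time.monotonic()
    while not future.done():
        # Optional overall deadline so the caller is never stuck forever.
        if deadline_sec is not None and time.monotonic() - start >= deadline_sec:
            break
        # timeout_sec bounds each wait so the loop can re-check the future.
        executor.spin_until_future_complete(future, timeout_sec=poll_sec)
    return future.done()
```

In the reproduction script this would replace the final spin call, for example `spin_until_done(exe, future_msgs)`. A smaller `poll_sec` lowers the delay between the callback setting the result and the call returning, at the cost of more wake-ups.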