rclpy: Synchronous service client call broken in timer callback

Bug report

Required Info:

  • Operating System:
    • Ubuntu 20.04
  • Installation type:
    • binaries
  • Version or commit hash:
    • foxy (the 2022-09-28 release that was just pushed through the apt repos)
  • DDS implementation:
    • Fast-RTPS
  • Client library (if applicable):
    • rclpy (of course)

Steps to reproduce issue

Use the minimal service server from the examples

from example_interfaces.srv import AddTwoInts

import rclpy
from rclpy.node import Node


class MinimalService(Node):

    def __init__(self):
        super().__init__('minimal_service')
        self.srv = self.create_service(AddTwoInts, 'add_two_ints', self.add_two_ints_callback)

    def add_two_ints_callback(self, request, response):
        response.sum = request.a + request.b
        self.get_logger().info('Incoming request\na: %d b: %d' % (request.a, request.b))

        return response


def main(args=None):
    rclpy.init(args=args)

    minimal_service = MinimalService()

    rclpy.spin(minimal_service)

    rclpy.shutdown()


if __name__ == '__main__':
    main()

and a slightly modified service client:

from example_interfaces.srv import AddTwoInts
import rclpy
from rclpy.node import Node

from rclpy.executors import MultiThreadedExecutor
from rclpy.callback_groups import MutuallyExclusiveCallbackGroup


class MinimalClient(Node):
    def __init__(self):
        super().__init__("minimal_client_async")
        self.cb = None
        self.timer_cb = MutuallyExclusiveCallbackGroup()
        self.cli = self.create_client(
            AddTwoInts, "add_two_ints", callback_group=self.cb
        )
        while not self.cli.wait_for_service(timeout_sec=1.0):
            self.get_logger().info("service not available, waiting again...")
        self.req = AddTwoInts.Request()

        # Create a timer for the main loop execution
        self.timer = self.create_timer(
            1.0, self.main_loop, callback_group=self.timer_cb
        )

    def send_request(self, a, b):
        self.get_logger().info("send_request: enter")
        self.req.a = a
        self.req.b = b

        # Only works once, then never executes the time again
        # self.future = self.cli.call_async(self.req)
        # rclpy.spin_until_future_complete(self, self.future)
        # self.get_logger().info("send_request: exit")
        # return self.future.result()

        # Used to work, but blocks now
        return self.cli.call(self.req)

    def main_loop(self) -> None:
        response = self.send_request(4, 2)
        self.get_logger().info(
            "Result of add_two_ints: for %d + %d = %d" % (4, 2, response.sum)
        )


def main(args=None):
    rclpy.init(args=args)

    minimal_client = MinimalClient()

    executor = MultiThreadedExecutor()
    executor.add_node(minimal_client)
    executor.spin()

    rclpy.shutdown()


if __name__ == "__main__":
    main()

Run the service server in one terminal and the client in another.

Expected behavior

You repeatedly see the client call the server every second and get a response.

Actual behavior

The client hangs at the first call

Additional information

Previous to the 2022-09-28 patch release of foxy, the above minimal client was functional. However, the call_async(...) version that is currently commented out didn’t work before in that it would run once and then never fire the timer again. This is despite the fact that in rclcpp this is exactly one of the recommended methods to implement this kind of behavior, and, from the documentation, would seem like the recommended way to do it inrclpy as well. It didn’t work, so we found an alternative method: calling the synchronous call(...) with a multi-threaded executor. As stated, this then broke with the push of the 2022-09-28 release. This seriously hampers our ability to use ROS2 as there is essentially no way to perform the above logic, and is a breaking change.

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Reactions: 4
  • Comments: 18 (2 by maintainers)

Commits related to this issue

Most upvoted comments

Other than just reverting in #1017, I would really appreciate adding a regression test to make sure this issue never happens again. For example, making sure nested callbacks with different callback groups work.

Yes, I agree. We wanted to get this fixed and out quickly, so we didn’t stop to do that, but a test would be ideal.

Would you like to submit a pull request that adds one?

@JakubNejedly You can get the patched rclpy version using this:

apt install ros-foxy-rclpy>=1.0.11-*

folks, really appreciate the all of the information and iterating with us.

https://github.com/ros2/rclpy/pull/1017 has been reverted including all distro foxy/galactic/humble, i will close this issue.

I can confirm replicating error on rolling mainline.

  • Operating System: Ubuntu 22.04
  • Installation type: binaries
  • Version or commit hash: rolling (the 2022-09-28 release that was just pushed through the apt repos) DDS implementation: Cyclone DDS Client library (if applicable): rclpy

So, while this should be fixed, I want to point out that relying on the MultiThreadedExecutor in this way is fragile. I think a better way which also works with single threaded executors would look like this (tested it on rolling):

diff --git a/test_client_orig.py b/test_client.py
index 323d585..c25bd96 100644
--- a/test_client_orig.py
+++ b/test_client.py
@@ -2,7 +2,6 @@ from example_interfaces.srv import AddTwoInts
 import rclpy
 from rclpy.node import Node
 
-from rclpy.executors import MultiThreadedExecutor
 from rclpy.callback_groups import MutuallyExclusiveCallbackGroup
 
 
@@ -23,7 +22,7 @@ class MinimalClient(Node):
             1.0, self.main_loop, callback_group=self.timer_cb
         )
 
-    def send_request(self, a, b):
+    async def send_request(self, a, b):
         self.get_logger().info("send_request: enter")
         self.req.a = a
         self.req.b = b
@@ -34,11 +33,11 @@ class MinimalClient(Node):
         # self.get_logger().info("send_request: exit")
         # return self.future.result()
 
-        # Used to work, but blocks now
-        return self.cli.call(self.req)
+        # Better way? using asyncio
+        return await self.cli.call_async(self.req)
 
-    def main_loop(self) -> None:
-        response = self.send_request(4, 2)
+    async def main_loop(self) -> None:
+        response = await self.send_request(4, 2)
         self.get_logger().info(
             "Result of add_two_ints: for %d + %d = %d" % (4, 2, response.sum)
         )
@@ -49,9 +48,7 @@ def main(args=None):
 
     minimal_client = MinimalClient()
 
-    executor = MultiThreadedExecutor()
-    executor.add_node(minimal_client)
-    executor.spin()
+    rclpy.spin(minimal_client)
 
     rclpy.shutdown()
 

That works fine, but it is still important in this situation that the timer and service client are in separate callback groups. Because while control is returned to the executor on the await lines, the callback for the timer is still not done yet and therefore it prevents other callbacks from the same callback group from being run, even if they are ready and the executor is idle.

I just found this as well. This breaks all usages of the MultiThreadedExecutor at the moment that rely on actual multithreading. I presume handle_wrapper(handler) could be replaced with partial(handle_wrapper, handler) to restore functionality.

This commit affects. It will work if you revert the following.

@@ -757,7 +759,11 @@ class MultiThreadedExecutor(Executor):
         except ConditionReachedException:
             pass
         else:
-            self._executor.submit(handler)
+            def handler_wrapper(handler):
+                handler()
+                if handler.exception() is not None:
+                    raise handler.exception()
+            self._executor.submit(handler_wrapper(handler))

about “handler_wrapper()”

  • if there is “handler()”, it will work single thread. (may be block occurs)
  • missing return value(“return handler”)