rcl: rcl_action: result_timeout should be started on goal completion

Bug report

Required Info:

  • Operating System:
    • Ubuntu 22.04 (Any)
  • Installation type:
    • Any but Iron and Rolling
  • Version or commit hash:
    • N.A
  • DDS implementation:
    • Any
  • Client library (if applicable):
    • Any (rclcpp, rclpy)

Description

If the user application uses the default rcl_action_server_options_t, the goal handle is considered expired 10 seconds after the goal is accepted.

As described in https://github.com/ros-planning/navigation2/issues/3765, a goal can easily take more than 10 seconds from acceptance until its result is set, so it expires before the result is available. https://github.com/ros-planning/navigation2/issues/3765#issuecomment-1689951628 analyzes the issue and works around it by setting rcl_action_server_options_t.result_timeout to 30 seconds.
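
To make the timing concrete, here is a minimal Python sketch of the two expiry policies: the one this report describes and the one the title asks for. It is a toy model, not rcl_action code. `GoalRecord`, `expired_from_acceptance`, and `expired_from_completion` are hypothetical names, and times are plain seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GoalRecord:
    """Toy stand-in for a server-side goal handle (hypothetical, not an rcl type)."""
    accepted_at: float                    # seconds, when the goal was accepted
    completed_at: Optional[float] = None  # seconds, when the goal reached a terminal state


def expired_from_acceptance(goal: GoalRecord, now: float, result_timeout: float) -> bool:
    # Behaviour as this report describes it: the clock starts at acceptance,
    # so a long-running goal can expire before its result is set.
    return now - goal.accepted_at >= result_timeout


def expired_from_completion(goal: GoalRecord, now: float, result_timeout: float) -> bool:
    # Behaviour the title asks for: the clock starts only once the goal is done.
    if goal.completed_at is None:
        return False
    return now - goal.completed_at >= result_timeout


# A goal that is still running 12 s after acceptance, with a 10 s timeout.
goal = GoalRecord(accepted_at=0.0)
print(expired_from_acceptance(goal, 12.0, 10.0))  # True
print(expired_from_completion(goal, 12.0, 10.0))  # False
```

Under the first policy the goal is dropped while it is still executing; under the second it can only expire after it has finished.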

Consideration / Proposal

  • rcl_action: increase the default timeout from 10 seconds to 1 minute (15 minutes is too long, though). https://github.com/ros2/rcl/pull/1012 reduced the timeout to 10 seconds, but for use cases such as Nav2, which relies on ROS 2 actions, 10 seconds is too short as a default. (Backport to Iron required.)
  • rclpy_action: the result_timeout default should match rcl’s default. (It is currently set to 900 seconds.)

Related Issue

About this issue

  • State: open
  • Created 9 months ago
  • Comments: 20 (7 by maintainers)

Most upvoted comments

When goal processing time is very short (the server just sends the goal acceptance to the client), the client may not yet have sent the result request to the server. In the current implementation, the client must receive the goal-acceptance response before it can send the result request. That is, we cannot guarantee that the client sends the result request before the server completes the goal.

I agree with you that a goal result could be missed if the goal executed very fast and the result timeout was zero, but I think that’s a case of a misconfigured goal timeout in the application rather than a bug that needs to be fixed here.
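
The zero-timeout case can be illustrated with a small Python sketch. This is a toy model under a stated assumption, not the rcl_action implementation: `result_delivered` is a hypothetical helper that treats a result as available only while the server still keeps it, that is, within `result_timeout` seconds of completion.

```python
def result_delivered(request_at: float, completed_at: float, result_timeout: float) -> bool:
    """Toy model: can the server still answer a result request?

    Assumed rule (hypothetical): a request that arrives before completion
    waits for the result; one that arrives afterwards succeeds only while
    the result is kept, i.e. within result_timeout seconds of completion.
    """
    if request_at <= completed_at:
        return True
    return request_at - completed_at <= result_timeout


# The goal finishes 1 ms after acceptance, but the client's result request
# only arrives at 5 ms, after the goal-acceptance round trip.
print(result_delivered(request_at=0.005, completed_at=0.001, result_timeout=0.0))   # False
print(result_delivered(request_at=0.005, completed_at=0.001, result_timeout=10.0))  # True
```

With any non-zero timeout longer than the round trip, the late request is still served, which is why the comment treats this as a configuration question.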

The server can keep the result after completion until at least one client requests it. (What if no client requests the result? Caching unnecessary, stale results would be a problem.) How about sending the goal request with a result-requested flag? (The client might be gone after sending the goal request, so the result would never be delivered.)

I’ve also considered the above solution, and as you described, there are some unavoidable issues. This is because it’s impossible to determine when the server will receive the client’s result request. How long to retain the goal result is a problem.

I tend to agree with sloretz’s point of view. This should be resolved by asking users to set an appropriate timeout.

Understood. There will always be a race condition between the result request and goal completion in this case. I think that is why we have a timeout option that the server manages by itself, because clients might already be gone.

Consider the alternatives, for example:

  • The server can keep the result after completion until at least one client requests it. (What if no client requests the result? Caching unnecessary, stale results would be a problem.)
  • How about sending the goal request with a result-requested flag? (The client might be gone after sending the goal request, so the result would never be delivered.)

So I think the current design, with a timeout on the server after goal completion, makes sense. But I am open to more options and ideas 👍 Thanks.

One minute is better than 10 seconds, but it doesn’t really address the underlying problem. We need a different mechanism to expire goals that isn’t based on request time (e.g. last-updated time? last-result-requested time?).
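
One way to picture the suggested alternative is the following Python sketch. It is purely illustrative and does not correspond to any existing rcl_action API: `expires_at` is a hypothetical helper that assumes the clock restarts at the most recent of goal completion and the last result request.

```python
from typing import Optional


def expires_at(result_timeout: float,
               completed_at: Optional[float],
               last_result_requested_at: Optional[float] = None) -> Optional[float]:
    """Return the time at which a goal would expire, or None while it is still active.

    Hypothetical policy: the deadline is measured from the most recent of
    goal completion and the last result request, never from acceptance.
    """
    if completed_at is None:
        return None  # an active goal never expires under this policy
    reference = completed_at
    if last_result_requested_at is not None:
        reference = max(reference, last_result_requested_at)
    return reference + result_timeout


print(expires_at(10.0, completed_at=None))                                # None
print(expires_at(10.0, completed_at=25.0))                                # 35.0
print(expires_at(10.0, completed_at=25.0, last_result_requested_at=30.0)) # 40.0
```

Whether a result request should extend the deadline at all is a design choice; the sketch only shows that the reference time need not be the acceptance time.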

But I’ll take incremental improvements where I can get them. Nav2’s workaround of increasing the timeout solves my immediate problem, but it still leaves every other user who isn’t extremely well plugged into the goings-on of Nav2 / rclcpp development in the lurch. So for their sake, to help mask more of the problem in the meantime, I would be very supportive of a move up to 1 minute.

+1 this is a good suggestion