rclcpp: Publishing is slow in Docker with MultiThreadedExecutor
I am using ROS 2 Foxy from the official ROS 2 Docker images to build ros2_control and test its new implementations. I have noticed that when I start a joint state publisher that is set to publish at 200 Hz, monitoring the topic with ros2 topic hz /joint_states reports at best 30 Hz.
I know this has nothing to do specifically with ros2_control because I had a similar issue when trying my own lifecycle publisher nodes. Basically, attaching the node to a MultiThreadedExecutor produces similar behavior: the topic is published at a much slower rate than expected, sometimes by a factor of 10. Changing to a SingleThreadedExecutor solves the issue. The problem is that ros2_control relies on this MultiThreadedExecutor.
I do believe this comes from the combination of Docker and MultiThreadedExecutor. I tested it on multiple computers and got similar behavior. I haven’t been able to test on a non-Docker installation, as it requires Ubuntu 20.04, which I don’t have, but I will try it just in case.
Steps to reproduce:
- Use a Docker foxy image
- Create a LifecycleNode with a publisher and attach it to a MultiThreadedExecutor
- Monitor the topic with ros2 topic hz
Alternatively, follow the ros2_control_demo installed on a Docker foxy image and monitor /joint_states.
About this issue
- State: closed
- Created 4 years ago
- Comments: 22 (5 by maintainers)
It seems https://github.com/ros2/rclcpp/pull/1168 already did the same thing, so I will not create a new PR for it, <del>maybe we could push https://github.com/ros2/rclcpp/pull/1168 to be merged ASAP.</del>
Updated: It seems scheduled_timers_ and scheduled_timers_mutex_ have a long history. 👀 https://github.com/ros2/rclcpp/issues/1374#issue-714841851
I have no idea how to fix this, so I will unassign myself (cc @clalancette).
Well, I have ideas of things to try, but the scope of this is much bigger than what I would expect from a randomly assigned issue. Maybe there’s an obvious fix I’m not seeing, but the whole executor code in rclcpp really needs some months of love (not particularly to fix this issue, but there are a lot of performance and thread safety issues too). I think that @iuhilnehc-ynos's comment is a good summary of what has happened here.
I will try to explain how I understand things to work and why that’s an issue:
My idea would be to limit how “executables” can be scheduled, so that once one worker has scheduled an “executable” for execution, no other worker can take it as “ready”. That completely forbids executing callbacks of the same executable in parallel (e.g. two messages of the same subscription), but I guess forbidding that is fine, as it can potentially lead to “out of order” execution. Anyway, that idea is also not simple to implement, and I’m not completely sure it would solve the issue.
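To make the idea above concrete, here is a minimal sketch using only the C++ standard library. It is not rclcpp code and not how the executor is implemented: `Executable`, `try_claim`, `release`, `worker`, and `run_workers` are all hypothetical names invented for this illustration. Each worker must claim an executable before running its callback, so a second worker that sees the same executable skips it instead of treating it as ready.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a timer or subscription; NOT an rclcpp type.
struct Executable {
  std::atomic<bool> claimed{false};   // true while one worker owns it
  std::atomic<int> pending{0};        // callbacks still waiting to run
  std::atomic<int> in_flight{0};      // callbacks running right now
  std::atomic<int> max_in_flight{0};  // highest concurrency observed
  std::atomic<int> runs{0};           // callbacks completed
};

// Returns true only for the single worker that flips claimed false -> true.
bool try_claim(Executable & e) {
  bool expected = false;
  return e.claimed.compare_exchange_strong(expected, true);
}

void release(Executable & e) { e.claimed.store(false); }

// Worker loop: scan all executables, run one pending callback per claim.
void worker(std::vector<Executable> & execs, std::atomic<int> & remaining) {
  while (remaining.load() > 0) {
    for (auto & e : execs) {
      if (!try_claim(e)) {
        continue;  // another worker owns it, so it is not "ready" for us
      }
      if (e.pending.load() > 0) {
        // Simulated callback body: record how many run at the same time.
        int now = ++e.in_flight;
        int prev = e.max_in_flight.load();
        while (now > prev && !e.max_in_flight.compare_exchange_weak(prev, now)) {
        }
        ++e.runs;
        --e.pending;
        --e.in_flight;
        --remaining;
      }
      release(e);
    }
    std::this_thread::yield();
  }
}

struct Result {
  int total_runs;
  int max_parallel_per_executable;
};

// Runs num_workers threads over num_executables, each with the same amount of work.
Result run_workers(std::size_t num_workers, std::size_t num_executables, int work_each) {
  std::vector<Executable> execs(num_executables);
  for (auto & e : execs) {
    e.pending.store(work_each);
  }
  std::atomic<int> remaining{static_cast<int>(num_executables) * work_each};

  std::vector<std::thread> threads;
  for (std::size_t i = 0; i < num_workers; ++i) {
    threads.emplace_back([&execs, &remaining] { worker(execs, remaining); });
  }
  for (auto & t : threads) {
    t.join();
  }

  Result r{0, 0};
  for (auto & e : execs) {
    r.total_runs += e.runs.load();
    r.max_parallel_per_executable =
      std::max(r.max_parallel_per_executable, e.max_in_flight.load());
  }
  return r;
}
```

Calling `run_workers(4, 3, 1000)` runs every pending callback exactly once while never running two callbacks of the same executable at the same time. The trade-off is the one mentioned above: two messages of the same subscription can no longer be processed in parallel.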
The single threaded executor also has its own problems:
@fujitatomoya
It definitely helped. Thanks. I am still a bit confused.
get_next_executable is protected by the mutex wait_mutex_, which makes sure that the any_exec obtained by two threads are different.
Anyway, I’ll see related issues later.