ecal: Service client `Call` and `CallAsync` APIs do not have a timeout argument

Describe the bug

The APIs in ecal_client.h for CServiceClient::Call and CServiceClient::CallAsync do not have a timeout option. The user does not appear to have any way to bound how long the operation takes. This makes it harder for application code to be robust against missing or unresponsive services.

To Reproduce

  1. Initiate a Call to a very long-running service handler

Expected behavior

The client call fails after a user-specified timeout and signals the timeout to the application.

Desktop:

  • OS: Ubuntu
  • Version 20.04

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 15 (14 by maintainers)

Most upvoted comments

I’m using this feature successfully, and I hope it will be incorporated in the next official release.

Glad to hear that. This fix will be part of eCAL 5.10.

I did attempt to wrap eCAL’s service client with timeout functionality. It hasn’t been rigorously tested: https://github.com/agtonomy/trellis/blob/master/trellis/core/service_client.hpp

The timer that’s created on line 45 is asio-based. The eCAL service call response handler uses asio::post to invoke the user callback on the event loop.

This code demonstrates the high-level strategy: the timeout timer callback retains a handle to the client, and the client response callback retains a handle to the timer. Whichever one fires first aborts the other and calls back to the user. See the comment on line 48, which is there because there wasn’t an API to abort a pending async call.
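The "whichever fires first wins" strategy can be sketched in plain C++ without eCAL or asio. This is only an illustration under stated assumptions, not the code from the linked file: `EventLoop` is a hypothetical stand-in for a single-threaded event loop, and `PendingCall`, `Outcome`, and `RunScenario` are names invented for this example. Because a pending call cannot be aborted, the sketch uses a shared flag so that a late response (or a late timer) is simply ignored.

```cpp
#include <functional>
#include <memory>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-in for a single-threaded event loop: handlers run one
// at a time, in the order they were posted.
class EventLoop {
 public:
  void Post(std::function<void()> handler) { handlers_.push(std::move(handler)); }

  void Run() {
    while (!handlers_.empty()) {
      auto handler = std::move(handlers_.front());
      handlers_.pop();
      handler();
    }
  }

 private:
  std::queue<std::function<void()>> handlers_;
};

enum class Outcome { kResponse, kTimeout };

// State shared by the response handler and the timeout handler.
struct PendingCall {
  bool completed = false;
  std::function<void(Outcome)> user_callback;

  // Deliver the first outcome to the user and ignore every later one.
  void Complete(Outcome outcome) {
    if (completed) return;
    completed = true;
    user_callback(outcome);
  }
};

// Posts a response handler and a timeout handler in the given order and
// returns every outcome that actually reached the user callback.
std::vector<Outcome> RunScenario(bool response_first) {
  EventLoop loop;
  std::vector<Outcome> delivered;

  auto call = std::make_shared<PendingCall>();
  call->user_callback = [&delivered](Outcome o) { delivered.push_back(o); };

  // Both handlers keep the shared state alive by holding a shared_ptr copy.
  auto on_response = [call] { call->Complete(Outcome::kResponse); };
  auto on_timeout = [call] { call->Complete(Outcome::kTimeout); };

  if (response_first) {
    loop.Post(on_response);
    loop.Post(on_timeout);
  } else {
    loop.Post(on_timeout);
    loop.Post(on_response);
  }
  loop.Run();
  return delivered;
}
```

In both orderings the user callback runs exactly once, with whichever outcome was handled first. A real implementation would additionally cancel the timer when the response wins, which this sketch models only through the `completed` flag.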

Hi @rex-schilasky, I appreciate the update! I’m happy to look at any code you point me toward and/or discuss implementation details. Cheers.

Oh, and I should mention that the timer and the socket operation should run on the same event loop (same io_context), so that the timer expiration and the IO operation can never complete truly simultaneously. One always comes first. This avoids potential race conditions.