prefect: Add retry for wait_for_flow_run in case of ECONNRESET

Current behavior

Currently, the wait_for_flow_run task intermittently fails with a connection error when the GraphQL query for the flow run state hits ECONNRESET:

Task "Wait for flow: Start flow: 'TL - Speed Analysis: Alarm'": Exception encountered during task execution!
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/site-packages/prefect/engine/task_runner.py", line 880, in get_task_run_state
    value = prefect.utilities.executors.run_task_with_timeout(
  File "/root/miniconda3/lib/python3.8/site-packages/prefect/utilities/executors.py", line 468, in run_task_with_timeout
    return task.run(*args, **kwargs)  # type: ignore
  File "/root/miniconda3/lib/python3.8/site-packages/prefect/tasks/prefect/flow_run.py", line 266, in wait_for_flow_run
    for log in watch_flow_run(
  File "/root/miniconda3/lib/python3.8/site-packages/prefect/backend/flow_run.py", line 90, in watch_flow_run
    flow_run = flow_run.get_latest()
  File "/root/miniconda3/lib/python3.8/site-packages/prefect/backend/flow_run.py", line 414, in get_latest
    return self.from_flow_run_id(
  File "/root/miniconda3/lib/python3.8/site-packages/prefect/backend/flow_run.py", line 571, in from_flow_run_id
    flow_run_data = cls._query_for_flow_run(where={"id": {"_eq": flow_run_id}})
  File "/root/miniconda3/lib/python3.8/site-packages/prefect/backend/flow_run.py", line 613, in _query_for_flow_run
    result = client.graphql(flow_run_query)
  File "/root/miniconda3/lib/python3.8/site-packages/prefect/client/client.py", line 473, in graphql
    raise ClientError(result["errors"])
prefect.exceptions.ClientError: [{'path': ['flow_run'], 'message': 'request to http://hasura:3000/v1alpha1/graphql failed, reason: read ECONNRESET', 'extensions': {'code': 'INTERNAL_SERVER_ERROR', 'exception': {'message': 'request to http://hasura:3000/v1alpha1/graphql failed, reason: read ECONNRESET', 'type': 'system', 'errno': 'ECONNRESET', 'code': 'ECONNRESET'}}}]

Proposed behavior

Retry the request when it fails with ECONNRESET instead of failing the task immediately.

Example

Retrying would prevent flow runs from failing because of a transient ECONNRESET.
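As a rough illustration of the requested behavior, here is a minimal client-side sketch. It is not Prefect's implementation or the fix that was eventually merged. The helper names (is_connection_reset, call_with_retry) and the parameters (max_attempts, base_delay, sleep) are hypothetical, and it assumes the error text contains "ECONNRESET", as the ClientError in the traceback above does.

```python
import time


def is_connection_reset(exc: Exception) -> bool:
    """Return True if the exception text mentions ECONNRESET.

    Assumption: the error message embeds the errno name, as in the
    ClientError shown in the traceback.
    """
    return "ECONNRESET" in str(exc)


def call_with_retry(func, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying with exponential backoff on ECONNRESET errors.

    Hypothetical helper: any other exception, or an ECONNRESET on the
    final attempt, is re-raised unchanged.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            if not is_connection_reset(exc) or attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

Under these assumptions, the failing call from the traceback could be wrapped as call_with_retry(lambda: client.graphql(flow_run_query)). Matching on the message text is fragile, so a real fix would more likely inspect the structured error payload or handle the reset on the server side.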

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 20 (11 by maintainers)

Most upvoted comments

I’ve drafted a possible server-side solution in https://github.com/PrefectHQ/server/pull/373, though we’ll need to investigate further.

@madkinsz I haven’t encountered it after upgrading. But to be fair it only happened a few times earlier.

Hi there. Is there any confirmation that #5825 solved this? I’m currently getting this error intermittently on flows that call other flows. I’m running Prefect Core 1.2.2 and still getting the error.