prefect: Add retry for wait_for_flow_run in case of ECONNRESET
Current behavior
Currently, the wait_for_flow_run task can sometimes fail with a connection error:
Task "Wait for flow: Start flow: 'TL - Speed Analysis: Alarm'": Exception encountered during task execution!
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/site-packages/prefect/engine/task_runner.py", line 880, in get_task_run_state
    value = prefect.utilities.executors.run_task_with_timeout(
  File "/root/miniconda3/lib/python3.8/site-packages/prefect/utilities/executors.py", line 468, in run_task_with_timeout
    return task.run(*args, **kwargs)  # type: ignore
  File "/root/miniconda3/lib/python3.8/site-packages/prefect/tasks/prefect/flow_run.py", line 266, in wait_for_flow_run
    for log in watch_flow_run(
  File "/root/miniconda3/lib/python3.8/site-packages/prefect/backend/flow_run.py", line 90, in watch_flow_run
    flow_run = flow_run.get_latest()
  File "/root/miniconda3/lib/python3.8/site-packages/prefect/backend/flow_run.py", line 414, in get_latest
    return self.from_flow_run_id(
  File "/root/miniconda3/lib/python3.8/site-packages/prefect/backend/flow_run.py", line 571, in from_flow_run_id
    flow_run_data = cls._query_for_flow_run(where={"id": {"_eq": flow_run_id}})
  File "/root/miniconda3/lib/python3.8/site-packages/prefect/backend/flow_run.py", line 613, in _query_for_flow_run
    result = client.graphql(flow_run_query)
  File "/root/miniconda3/lib/python3.8/site-packages/prefect/client/client.py", line 473, in graphql
    raise ClientError(result["errors"])
prefect.exceptions.ClientError: [{'path': ['flow_run'], 'message': 'request to http://hasura:3000/v1alpha1/graphql failed, reason: read ECONNRESET', 'extensions': {'code': 'INTERNAL_SERVER_ERROR', 'exception': {'message': 'request to http://hasura:3000/v1alpha1/graphql failed, reason: read ECONNRESET', 'type': 'system', 'errno': 'ECONNRESET', 'code': 'ECONNRESET'}}}]
Proposed behavior
wait_for_flow_run should retry the request when it fails with ECONNRESET instead of failing the task.
Example
Retrying would prevent a flow from failing because of a transient ECONNRESET error.
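As a rough illustration of the requested behavior, here is a minimal sketch of a retry wrapper. It is not Prefect's implementation: the helper name retry_on_econnreset and its parameters are hypothetical, and it assumes the reset can be recognized by the string "ECONNRESET" appearing in the exception text, as it does in the ClientError shown in the traceback.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_econnreset(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying only when the error text mentions ECONNRESET.

    Hypothetical helper for illustration; not part of Prefect.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Assumption: a connection reset is identifiable from the message,
            # as in the ClientError payload quoted in this issue.
            is_reset = "ECONNRESET" in str(exc)
            if not is_reset or attempt == max_attempts:
                raise  # unrelated error, or retries exhausted
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The call that fails in the traceback could in principle be wrapped this way, for example retry_on_econnreset(lambda: flow_run.get_latest()). Any other error is re-raised immediately, so genuine failures are not hidden by the retry.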
About this issue
- State: closed
- Created 2 years ago
- Comments: 20 (11 by maintainers)
I’ve drafted a possible server-side solution in https://github.com/PrefectHQ/server/pull/373, but we’ll need to investigate further.
@madkinsz I haven’t encountered it after upgrading. But to be fair it only happened a few times earlier.
Hi there. Is there any confirmation that #5825 solved this? I’m currently getting this error intermittently on flows that call other flows. They run under Prefect Core 1.2.2 and still hit the error.