prefect: RuntimeError: Cannot orchestrate task run. Failed to connect to API at http://127.0.0.1:4200/api/

First check

  • I added a descriptive title to this issue.
  • I used the GitHub search to find a similar issue and didn’t find it.
  • I searched the Prefect documentation for this issue.
  • I checked that this issue is related to Prefect and not one of its dependencies.

Bug summary

When a flow runs tasks that retrieve information with JSON.load or other block loads, the Orion server stops responding after some time and the exception below is raised.

Reproduction

from prefect import flow, task, tags
import asyncio
import random
import os
from prefect.blocks.system import JSON
from prefect_dask import DaskTaskRunner


@task
async def do_work(idx: int, block_name: str):
    config = await JSON.load(block_name)
    sleep_time = random.uniform(.5, 3.0)
    await asyncio.sleep(sleep_time) # simulating load with random times
    print(f"done with {idx}")
    return idx


@flow(task_runner=DaskTaskRunner())
def concurrent(block_name: str):
    with tags("workers"):
        work_futures = [ do_work.submit(index, block_name) for index in list(range(1,1024)) ]
        work_results = [ item.result() for item in work_futures ]
    print("the end")


if __name__ == "__main__":
    os.system("prefect concurrency-limit create workers 4")
    json_block = JSON(value={"the_answer": 42})
    json_block.save("test-block", overwrite=True)
    concurrent("test-block")

Error

18:26:49.101 | INFO    | Flow run 'rough-dove' - Created task run 'do_work-fc1fb0b6-146' for task 'do_work'
18:26:49.102 | INFO    | Flow run 'rough-dove' - Submitted task run 'do_work-fc1fb0b6-146' for execution.
18:26:49.186 | INFO    | Flow run 'rough-dove' - Created task run 'do_work-fc1fb0b6-392' for task 'do_work'
18:26:49.187 | INFO    | Flow run 'rough-dove' - Submitted task run 'do_work-fc1fb0b6-392' for execution.
18:26:49.339 | ERROR   | Flow run 'rough-dove' - Encountered exception during execution:
Traceback (most recent call last):
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 227, in handle_async_request
    connection = await status.wait_for_connection(timeout=timeout)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 34, in wait_for_connection
    await self._connection_acquired.wait(timeout=timeout)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_synchronization.py", line 38, in wait
    await self._event.wait()
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 1842, in wait
    if await self._event.wait():
  File "/usr/lib/python3.9/asyncio/locks.py", line 226, in wait
    await fut
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/client.py", line 377, in api_healthcheck
    await self._client.get("/health")
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1751, in get
    return await self.request(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1527, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/client.py", line 257, in send
    await super().send(*args, **kwargs)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1614, in send
    response = await self._send_handling_auth(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1642, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1679, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1716, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 232, in handle_async_request
    async with self._pool_lock:
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/client.py", line 378, in api_healthcheck
    return None
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/anyio/_core/_tasks.py", line 118, in __exit__
    raise TimeoutError
TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/engine.py", line 595, in orchestrate_flow_run
    result = await run_sync(flow_call)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/utilities/asyncutils.py", line 57, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(call, cancellable=True)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/alburati/Proyectos/Aluxoft/orion/dask_concurent.py", line 22, in concurrent
    work_results = [ item.result() for item in work_futures ]
  File "/home/alburati/Proyectos/Aluxoft/orion/dask_concurent.py", line 22, in <listcomp>
    work_results = [ item.result() for item in work_futures ]
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/futures.py", line 225, in result
    return sync(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/utilities/asyncutils.py", line 240, in sync
    return run_async_from_worker_thread(__async_fn, *args, **kwargs)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/utilities/asyncutils.py", line 137, in run_async_from_worker_thread
    return anyio.from_thread.run(call)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/anyio/from_thread.py", line 49, in run
    return asynclib.run_async_from_thread(func, *args)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 970, in run_async_from_thread
    return f.result()
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 445, in result
    return self.__get_result()
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 390, in __get_result
    raise self._exception
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/futures.py", line 236, in _result
    return final_state.result(raise_on_failure=raise_on_failure)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/orion/schemas/states.py", line 143, in result
    raise data
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/task_runners.py", line 286, in _run_and_store_result
    self._results[key] = await call()
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/engine.py", line 1078, in begin_task_run
    raise RuntimeError(
RuntimeError: Cannot orchestrate task run '1ead6d09-96d5-489b-afc9-cdf02479cc5d'. Failed to connect to API at http://10.0.1.6:4200/api/.
18:26:49.427 | INFO    | Flow run 'rough-dove' - Created task run 'do_work-fc1fb0b6-52' for task 'do_work'
18:26:49.428 | INFO    | Flow run 'rough-dove' - Submitted task run 'do_work-fc1fb0b6-52' for execution.
18:26:49.565 | INFO    | Flow run 'rough-dove' - Created task run 'do_work-fc1fb0b6-23' for task 'do_work'
18:26:49.565 | INFO    | Flow run 'rough-dove' - Submitted task run 'do_work-fc1fb0b6-23' for execution.
...
...
18:27:08.952 | INFO    | Flow run 'rough-dove' - Created task run 'do_work-fc1fb0b6-153' for task 'do_work'
18:27:08.953 | INFO    | Flow run 'rough-dove' - Submitted task run 'do_work-fc1fb0b6-153' for execution.
18:27:09.111 | INFO    | Flow run 'rough-dove' - Created task run 'do_work-fc1fb0b6-453' for task 'do_work'
18:27:09.111 | INFO    | Flow run 'rough-dove' - Submitted task run 'do_work-fc1fb0b6-453' for execution.
18:27:09.200 | ERROR   | Flow run 'rough-dove' - Crash detected! Request to http://10.0.1.6:4200/api/task_runs/ failed.
Traceback (most recent call last):
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/backends/asyncio.py", line 33, in read
    return await self._stream.receive(max_bytes=max_bytes)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 1265, in receive
    await self._protocol.read_event.wait()
  File "/usr/lib/python3.9/asyncio/locks.py", line 226, in wait
    await fut
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_exceptions.py", line 8, in map_exceptions
    yield
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/backends/asyncio.py", line 35, in read
    return b""
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/anyio/_core/_tasks.py", line 118, in __exit__
    raise TimeoutError
TimeoutError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
    yield
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 253, in handle_async_request
    raise exc
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 237, in handle_async_request
    response = await connection.handle_async_request(request)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_async/connection.py", line 90, in handle_async_request
    return await self._connection.handle_async_request(request)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_async/http11.py", line 105, in handle_async_request
    raise exc
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_async/http11.py", line 84, in handle_async_request
    ) = await self._receive_response_headers(**kwargs)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_async/http11.py", line 148, in _receive_response_headers
    event = await self._receive_event(timeout=timeout)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_async/http11.py", line 177, in _receive_event
    data = await self._network_stream.read(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/backends/asyncio.py", line 35, in read
    return b""
  File "/usr/lib/python3.9/contextlib.py", line 135, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_exceptions.py", line 12, in map_exceptions
    raise to_exc(exc)
httpcore.ReadTimeout

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/engine.py", line 1321, in report_flow_run_crashes
    yield
  File "/usr/lib/python3.9/contextlib.py", line 634, in __aexit__
    cb_suppress = await cb(*exc_details)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 662, in __aexit__
    raise exceptions[0]
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/engine.py", line 926, in create_task_run_then_submit
    task_run = await create_task_run(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/engine.py", line 965, in create_task_run
    task_run = await flow_run_context.client.create_task_run(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/client.py", line 1692, in create_task_run
    response = await self._client.post(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1842, in post
    return await self.request(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1527, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/client.py", line 257, in send
    await super().send(*args, **kwargs)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1614, in send
    response = await self._send_handling_auth(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1642, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1679, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1716, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/usr/lib/python3.9/contextlib.py", line 135, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ReadTimeout

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_synchronization.py", line 38, in wait
    await self._event.wait()
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 1842, in wait
    if await self._event.wait():
  File "/usr/lib/python3.9/asyncio/locks.py", line 226, in wait
    await fut
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_exceptions.py", line 8, in map_exceptions
    yield
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_synchronization.py", line 38, in wait
    await self._event.wait()
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/anyio/_core/_tasks.py", line 118, in __exit__
    raise TimeoutError
TimeoutError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
    yield
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 234, in handle_async_request
    raise exc
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 227, in handle_async_request
    connection = await status.wait_for_connection(timeout=timeout)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 34, in wait_for_connection
    await self._connection_acquired.wait(timeout=timeout)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_synchronization.py", line 38, in wait
    await self._event.wait()
  File "/usr/lib/python3.9/contextlib.py", line 135, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_exceptions.py", line 12, in map_exceptions
    raise to_exc(exc)
httpcore.PoolTimeout

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/client.py", line 103, in with_injected_client
    return await fn(*args, **kwargs)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/engine.py", line 231, in create_then_begin_flow_run
    state = await begin_flow_run(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/engine.py", line 367, in begin_flow_run
    terminal_state = await orchestrate_flow_run(
  File "/usr/lib/python3.9/contextlib.py", line 651, in __aexit__
    raise exc_details[1]
  File "/usr/lib/python3.9/contextlib.py", line 634, in __aexit__
    cb_suppress = await cb(*exc_details)
  File "/usr/lib/python3.9/contextlib.py", line 193, in __aexit__
    await self.gen.athrow(typ, value, traceback)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/engine.py", line 1328, in report_flow_run_crashes
    await client.set_flow_run_state(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/client.py", line 1612, in set_flow_run_state
    response = await self._client.post(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1842, in post
    return await self.request(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1527, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/client.py", line 257, in send
    await super().send(*args, **kwargs)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1614, in send
    response = await self._send_handling_auth(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1642, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1679, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1716, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/usr/lib/python3.9/contextlib.py", line 135, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.PoolTimeout

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/alburati/Proyectos/Aluxoft/orion/dask_concurent.py", line 30, in <module>
    concurrent("test-block")
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/flows.py", line 384, in __call__
    return enter_flow_run_engine_from_flow_call(
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/engine.py", line 158, in enter_flow_run_engine_from_flow_call
    return anyio.run(begin_run)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/anyio/_core/_eventloop.py", line 70, in run
    return asynclib.run(func, *args, **backend_options)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 292, in run
    return native_run(wrapper(), debug=debug)
  File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 287, in wrapper
    return await func(*args)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/client.py", line 103, in with_injected_client
    return await fn(*args, **kwargs)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/prefect/client.py", line 1929, in __aexit__
    return await self._exit_stack.__aexit__(*exc_info)
  File "/usr/lib/python3.9/contextlib.py", line 651, in __aexit__
    raise exc_details[1]
  File "/usr/lib/python3.9/contextlib.py", line 634, in __aexit__
    cb_suppress = await cb(*exc_details)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_client.py", line 1997, in __aexit__
    await self._transport.__aexit__(exc_type, exc_value, traceback)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpx/_transports/default.py", line 332, in __aexit__
    await self._pool.__aexit__(exc_type, exc_value, traceback)
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 326, in __aexit__
    await self.aclose()
  File "/home/alburati/Proyectos/Aluxoft/repos/scicarta-pipeline/p39/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 312, in aclose
    raise RuntimeError(
RuntimeError: The connection pool was closed while 1020 HTTP requests/responses were still in-flight.

Versions

Version:             2.3.2
API version:         0.8.0
Python version:      3.9.5
Git commit:          6e931ee9
Built:               Tue, Sep 6, 2022 12:36 PM
OS/Arch:             linux/x86_64
Profile:             default
Server type:         hosted

Additional context

The first time the example code is run, it behaves differently from subsequent runs. On the first run, the example code finishes: tasks print their output before the exceptions are raised. On subsequent runs, nothing is printed before or after the exception.

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 20 (8 by maintainers)

Most upvoted comments

@john-jam thank you so much for the thorough information!

Do you have any minimum/recommended system requirements for a self-hosted Orion API?

I do not at this time. We are still working to optimize the server and client and haven’t benchmarked performance on different hardware. We’re hoping to get things to a point where the failure mode is a slowdown of your flows rather than a crash, though. As @matthewbrookes noted, we have merged some improvements to the client in #7090; perhaps those will help.

@zangell44 does #7090 host our server with http2 support or will that require changes to our uvicorn invocation?

I suspect this is the same problem I’m having. If I launch a lot of tasks in parallel, Prefect crashes. In my case, increasing the healthcheck timeout from 10 to 60 seconds fixed the problem. From what I understood:

  1. The healthcheck failed to connect due to some resource starvation (orion.py / api_healthcheck())
  2. The api_healthcheck function returns the exception
  3. The agent that is running the flow throws a RuntimeError
  4. Flow execution aborts (RuntimeError) and since there were some pending connections it shows this “The connection pool was closed while 1020 HTTP requests/responses were still in-flight.” message and the “Crash detected! Execution was interrupted by an unexpected exception”.
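
The chain described in the steps above can be imitated without Prefect. The following is only a minimal sketch under stated assumptions: an asyncio.Semaphore stands in for the HTTP connection pool, asyncio.wait_for stands in for anyio.fail_after, and POOL_SIZE, HEALTHCHECK_TIMEOUT, slow_request, and demo are hypothetical names chosen for illustration. The functions api_healthcheck and begin_task_run only borrow their names from the traceback; their bodies are not Prefect's code.

```python
import asyncio
from typing import Optional

POOL_SIZE = 4               # hypothetical stand-in for the connection pool size
HEALTHCHECK_TIMEOUT = 0.05  # seconds; shortened from the 10 s discussed above


async def slow_request(pool: asyncio.Semaphore, duration: float) -> None:
    """Hold one pool slot for `duration` seconds, like an in-flight API call."""
    async with pool:
        await asyncio.sleep(duration)


async def api_healthcheck(pool: asyncio.Semaphore, timeout: float) -> Optional[Exception]:
    """Return None when healthy; return (not raise) the exception otherwise."""

    async def ping() -> None:
        # The health check needs a pool slot just like any other request.
        async with pool:
            return None

    try:
        await asyncio.wait_for(ping(), timeout)
        return None
    except Exception as exc:  # step 2: the exception is handed back to the caller
        return exc


async def begin_task_run(pool: asyncio.Semaphore, timeout: float) -> str:
    """Step 3: turn a failed health check into a RuntimeError."""
    error = await api_healthcheck(pool, timeout)
    if error is not None:
        raise RuntimeError("Cannot orchestrate task run. Failed to connect to API.") from error
    return "ok"


async def demo(n_requests: int) -> str:
    """Start `n_requests` slow requests, then try to begin one task run."""
    pool = asyncio.Semaphore(POOL_SIZE)
    background = [asyncio.create_task(slow_request(pool, 0.2)) for _ in range(n_requests)]
    await asyncio.sleep(0)  # let the background requests start and take the slots
    try:
        return await begin_task_run(pool, HEALTHCHECK_TIMEOUT)
    except RuntimeError as exc:
        return f"RuntimeError caused by {type(exc.__cause__).__name__}"
    finally:
        # Step 4: requests that are still pending are torn down with the run.
        for task in background:
            task.cancel()
        await asyncio.gather(*background, return_exceptions=True)
```

With no competing requests, asyncio.run(demo(0)) returns "ok". With asyncio.run(demo(100)), the health check waits behind the queued requests, times out, and the result is "RuntimeError caused by TimeoutError". Raising HEALTHCHECK_TIMEOUT above the time the pool needs to drain makes the same call succeed, which is the effect the longer timeout has in this toy model.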

My patch was in prefect/client/orion.py, changing the fail_after timeout from 10 to 60:

...
try:
    with anyio.fail_after(60):  # was 10
        await self._client.get("/health")
        return None
...

(Or maybe better use PREFECT_API_REQUEST_TIMEOUT here also?)
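
As a rough illustration of that idea, the sketch below reads the deadline from the PREFECT_API_REQUEST_TIMEOUT environment variable instead of hard-coding it. It is not Prefect's implementation: healthcheck_timeout, healthcheck, and the ping argument are hypothetical names, asyncio.wait_for stands in for anyio.fail_after, and reading the variable directly from os.environ is an assumption made to keep the example self-contained. Whether Prefect itself applies this setting to the health check is not confirmed here.

```python
import asyncio
import os
from typing import Awaitable, Callable, Optional

DEFAULT_TIMEOUT = 10.0  # the hard-coded value mentioned in the comment above


def healthcheck_timeout() -> float:
    """Read the timeout in seconds from the environment, else use the default."""
    raw = os.environ.get("PREFECT_API_REQUEST_TIMEOUT")
    try:
        return float(raw) if raw else DEFAULT_TIMEOUT
    except ValueError:
        # An unparsable value falls back to the default instead of crashing.
        return DEFAULT_TIMEOUT


async def healthcheck(
    ping: Callable[[], Awaitable[None]],
    timeout: Optional[float] = None,
) -> Optional[Exception]:
    """Await `ping()` under a deadline; return the exception instead of raising."""
    limit = healthcheck_timeout() if timeout is None else timeout
    try:
        await asyncio.wait_for(ping(), limit)
        return None
    except Exception as exc:
        return exc
```

Here `ping` would be whatever coroutine function performs the GET on /health. Passing an explicit `timeout` overrides the environment, which keeps the helper easy to exercise with short deadlines.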

It’s possible that something on the host could be configured to handle more connections, but…

@discdiver the bug report