prefect: Orion: Running flows in Docker results in an error

Description

I ran into an error when I tried to run the example flow in Docker. The flow:

from prefect import flow
from prefect.deployments import DeploymentSpec
from prefect.flow_runners import DockerFlowRunner

@flow
def my_flow():
    print("Hello from Docker!")


DeploymentSpec(
    name="example",
    flow=my_flow,
    flow_runner=DockerFlowRunner()
)

The error I received after creating a deployment, a work queue, and an agent:

prefect agent start '5d76cc30-165e-4086-af9e-5fec6b4d7614'
/home/ubuntu/.local/lib/python3.8/site-packages/prefect/context.py:360: UserWarning: Temporary environment is overriding key(s): PREFECT_API_URL
  with temporary_environ(
Starting agent connected to http://127.0.0.1:4200/api...

  ___ ___ ___ ___ ___ ___ _____     _   ___ ___ _  _ _____
 | _ \ _ \ __| __| __/ __|_   _|   /_\ / __| __| \| |_   _|
 |  _/   / _|| _|| _| (__  | |    / _ \ (_ | _|| .` | | |
 |_| |_|_\___|_| |___\___| |_|   /_/ \_\___|___|_|\_| |_|


Agent started!
12:28:38.236 | INFO    | prefect.agent - Submitting flow run '1be26278-ed55-4ed5-9dab-5be873047c5e'
12:28:38.287 | INFO    | prefect.flow_runner.docker - Flow run 'crystal-waxbill' has container settings = {'image': 'prefecthq/prefect:2.0a13-python3.8', 'network': None, 'command': ['python', '-m', 'prefect.engine', '1be26278-ed55-4ed5-9dab-5be873047c5e'], 'environment': {'PREFECT_API_URL': 'http://host.docker.internal:4200/api'}, 'auto_remove': False, 'labels': {'io.prefect.flow-run-id': '1be26278-ed55-4ed5-9dab-5be873047c5e'}, 'extra_hosts': {'host.docker.internal': 'host-gateway'}, 'name': 'crystal-waxbill', 'volumes': []}
12:28:38.588 | INFO    | prefect.agent - Completed submission of flow run '1be26278-ed55-4ed5-9dab-5be873047c5e'
12:28:38.599 | INFO    | prefect.flow_runner.docker - Flow run container 'crystal-waxbill' has status 'running'
12:28:39.913 | ERROR   | prefect.engine - Engine execution of flow run '1be26278-ed55-4ed5-9dab-5be873047c5e' exited with unexpected exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/anyio/_core/_sockets.py", line 127, in try_connect
    stream = await asynclib.connect_tcp(remote_host, remote_port, local_address)
  File "/usr/local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 1518, in connect_tcp
    await get_running_loop().create_connection(StreamProtocol, host, port,
  File "/usr/local/lib/python3.8/asyncio/base_events.py", line 1025, in create_connection
    raise exceptions[0]
  File "/usr/local/lib/python3.8/asyncio/base_events.py", line 1010, in create_connection
    sock = await self._connect_sock(
  File "/usr/local/lib/python3.8/asyncio/base_events.py", line 924, in _connect_sock
    await self.sock_connect(sock, address)
  File "/usr/local/lib/python3.8/asyncio/selector_events.py", line 496, in sock_connect
    return await fut
  File "/usr/local/lib/python3.8/asyncio/selector_events.py", line 528, in _sock_connect_cb
    raise OSError(err, f'Connect call failed {address}')
ConnectionRefusedError: [Errno 111] Connect call failed ('172.17.0.1', 4200)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/httpcore/_exceptions.py", line 8, in map_exceptions
    yield
  File "/usr/local/lib/python3.8/site-packages/httpcore/backends/asyncio.py", line 101, in connect_tcp
    stream: anyio.abc.ByteStream = await anyio.connect_tcp(
  File "/usr/local/lib/python3.8/site-packages/anyio/_core/_sockets.py", line 184, in connect_tcp
    raise OSError('All connection attempts failed') from cause
OSError: All connection attempts failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
    yield
  File "/usr/local/lib/python3.8/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/usr/local/lib/python3.8/site-packages/httpcore/_async/connection_pool.py", line 253, in handle_async_request
    raise exc
  File "/usr/local/lib/python3.8/site-packages/httpcore/_async/connection_pool.py", line 237, in handle_async_request
    response = await connection.handle_async_request(request)
  File "/usr/local/lib/python3.8/site-packages/httpcore/_async/connection.py", line 86, in handle_async_request
    raise exc
  File "/usr/local/lib/python3.8/site-packages/httpcore/_async/connection.py", line 63, in handle_async_request
    stream = await self._connect(request)
  File "/usr/local/lib/python3.8/site-packages/httpcore/_async/connection.py", line 111, in _connect
    stream = await self._network_backend.connect_tcp(**kwargs)
  File "/usr/local/lib/python3.8/site-packages/httpcore/backends/auto.py", line 23, in connect_tcp
    return await self._backend.connect_tcp(
  File "/usr/local/lib/python3.8/site-packages/httpcore/backends/asyncio.py", line 101, in connect_tcp
    stream: anyio.abc.ByteStream = await anyio.connect_tcp(
  File "/usr/local/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.8/site-packages/httpcore/_exceptions.py", line 12, in map_exceptions
    raise to_exc(exc)
httpcore.ConnectError: All connection attempts failed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/prefect/engine.py", line 951, in <module>
    enter_flow_run_engine_from_subprocess(flow_run_id)
  File "/usr/local/lib/python3.8/site-packages/prefect/engine.py", line 134, in enter_flow_run_engine_from_subprocess
    return anyio.run(retrieve_flow_then_begin_flow_run, flow_run_id)
  File "/usr/local/lib/python3.8/site-packages/anyio/_core/_eventloop.py", line 56, in run
    return asynclib.run(func, *args, **backend_options)
  File "/usr/local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 233, in run
    return native_run(wrapper(), debug=debug)
  File "/usr/local/lib/python3.8/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 228, in wrapper
    return await func(*args)
  File "/usr/local/lib/python3.8/site-packages/prefect/client.py", line 81, in with_injected_client
    return await fn(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/prefect/engine.py", line 194, in retrieve_flow_then_begin_flow_run
    flow_run = await client.read_flow_run(flow_run_id)
  File "/usr/local/lib/python3.8/site-packages/prefect/client.py", line 1132, in read_flow_run
    response = await self.get(f"/flow_runs/{flow_run_id}")
  File "/usr/local/lib/python3.8/site-packages/prefect/client.py", line 333, in get
    response = await self._client.get(route, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 1729, in get
    return await self.request(
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 1506, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 1593, in send
    response = await self._send_handling_auth(
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 1621, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 1658, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 1695, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/usr/local/lib/python3.8/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/usr/local/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.8/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: All connection attempts failed
12:28:40.307 | INFO    | prefect.flow_runner.docker - Flow run container 'crystal-waxbill' has status 'exited'

After the error, the status of the flow run changes to Pending.

Reproduction / Example

prefect orion start
prefect deployment create ./example-deployment.py
prefect deployment run my-flow/example
prefect deployment inspect my-flow/example
prefect work-queue create -d <DEPLOYMENT-ID> test
prefect agent start <WORK-QUEUE-ID>

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 3
  • Comments: 16 (10 by maintainers)

Most upvoted comments

By default, we use the “host” network mode on Linux when we detect that you’re running an API on localhost because it makes it easier for users to get started. In this mode, your container can access anything on the network outside the container. Because of this, we do not change your API URL, it should just work. It looks like something’s gone wrong with that, but we haven’t gotten other reports of it and it’s working fine in our tests so it’ll take some investigation to determine what.

If you use the “bridge” network mode, the container is isolated and cannot talk to localhost. When we detect use of “bridge” with a local API, we need to convert your API URL to use “host.docker.internal” which lets it communicate outside of the container. This is a little more complicated, so we don’t do it by default when the “host” mode is available.
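The URL conversion described above can be sketched roughly as follows. This is only an illustration, not Prefect's actual implementation: the helper name `api_url_for_bridge_container` and the `LOCAL_HOSTS` set are hypothetical, and the only behaviour taken from this thread is that a loopback API URL ends up pointing at `host.docker.internal` (as seen in the `PREFECT_API_URL` value in the agent log).

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these are the hostnames treated as "a local API".
LOCAL_HOSTS = {"127.0.0.1", "localhost"}


def api_url_for_bridge_container(api_url: str) -> str:
    """Hypothetical helper: point a loopback API URL at host.docker.internal.

    Inside a bridge-networked container, 127.0.0.1 refers to the container
    itself, so the host name has to be swapped for one that resolves to
    the Docker host. Non-local URLs are returned unchanged.
    """
    parts = urlsplit(api_url)
    if parts.hostname not in LOCAL_HOSTS:
        return api_url

    # Rebuild the network location, keeping the original port if one was given.
    netloc = "host.docker.internal"
    if parts.port is not None:
        netloc += f":{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


print(api_url_for_bridge_container("http://127.0.0.1:4200/api"))
```

Note that rewriting the URL only changes where the container sends its requests. The API still has to be listening on an address that the container can reach.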

Hi! Thanks for the well-written issue.

We’re tracking an update to the tutorial internally at https://github.com/PrefectHQ/orion/issues/1123. This is similar to https://github.com/PrefectHQ/prefect/issues/4963 and https://github.com/PrefectHQ/prefect/pull/5182.

The issue here is that requests from the container are being refused because the API is bound to 127.0.0.1 instead of 0.0.0.0. If you run prefect orion start --host 0.0.0.0, the container will be able to reach the API. This comes from a difference in container networking on Linux (it is not an issue on macOS).

We solved this in V1 of Prefect by including the container in a shared network with the server, but the server is not running in a container right now, so we cannot apply the same fix. I’m not sure what the long-term solution is. It’d be nice to be able to use it without exposing your API via 0.0.0.0.

I believe you can also bind to the Docker host IP, 172.17.0.1, which will allow connections from containers but not from other sources.
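To check which of the bind addresses discussed above is actually reachable, a small TCP probe along these lines may help. It is a minimal sketch using only the standard library. The function names `can_connect` and `report` are made up for this example, and port 4200 is taken from the log in the issue.

```python
import socket
from typing import Dict, Iterable


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds.

    A refused connection (Errno 111 in the traceback above), a timeout,
    or a DNS failure all raise OSError subclasses, reported here as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(hosts: Iterable[str], port: int) -> Dict[str, bool]:
    """Probe each host on the same port and collect the results."""
    return {host: can_connect(host, port) for host in hosts}
```

Running `report(["host.docker.internal", "172.17.0.1"], 4200)` from inside the flow run container should return `False` for both while the API is bound only to 127.0.0.1. After restarting the API with a `--host` value that the container can reach, it should return `True`.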