prefect: Orion: Running flows in Docker results in an error
Description
I ran into an error when I tried to run the example flow in Docker. The flow:
from prefect import flow
from prefect.deployments import DeploymentSpec
from prefect.flow_runners import DockerFlowRunner

@flow
def my_flow():
    print("Hello from Docker!")

DeploymentSpec(
    name="example",
    flow=my_flow,
    flow_runner=DockerFlowRunner(),
)
The error I received after creating the deployment, a work queue, and an agent:
prefect agent start '5d76cc30-165e-4086-af9e-5fec6b4d7614'
/home/ubuntu/.local/lib/python3.8/site-packages/prefect/context.py:360: UserWarning: Temporary environment is overriding key(s): PREFECT_API_URL
with temporary_environ(
Starting agent connected to http://127.0.0.1:4200/api...
___ ___ ___ ___ ___ ___ _____ _ ___ ___ _ _ _____
| _ \ _ \ __| __| __/ __|_ _| /_\ / __| __| \| |_ _|
| _/ / _|| _|| _| (__ | | / _ \ (_ | _|| .` | | |
|_| |_|_\___|_| |___\___| |_| /_/ \_\___|___|_|\_| |_|
Agent started!
12:28:38.236 | INFO | prefect.agent - Submitting flow run '1be26278-ed55-4ed5-9dab-5be873047c5e'
12:28:38.287 | INFO | prefect.flow_runner.docker - Flow run 'crystal-waxbill' has container settings = {'image': 'prefecthq/prefect:2.0a13-python3.8', 'network': None, 'command': ['python', '-m', 'prefect.engine', '1be26278-ed55-4ed5-9dab-5be873047c5e'], 'environment': {'PREFECT_API_URL': 'http://host.docker.internal:4200/api'}, 'auto_remove': False, 'labels': {'io.prefect.flow-run-id': '1be26278-ed55-4ed5-9dab-5be873047c5e'}, 'extra_hosts': {'host.docker.internal': 'host-gateway'}, 'name': 'crystal-waxbill', 'volumes': []}
12:28:38.588 | INFO | prefect.agent - Completed submission of flow run '1be26278-ed55-4ed5-9dab-5be873047c5e'
12:28:38.599 | INFO | prefect.flow_runner.docker - Flow run container 'crystal-waxbill' has status 'running'
12:28:39.913 | ERROR | prefect.engine - Engine execution of flow run '1be26278-ed55-4ed5-9dab-5be873047c5e' exited with unexpected exception
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/anyio/_core/_sockets.py", line 127, in try_connect
stream = await asynclib.connect_tcp(remote_host, remote_port, local_address)
File "/usr/local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 1518, in connect_tcp
await get_running_loop().create_connection(StreamProtocol, host, port,
File "/usr/local/lib/python3.8/asyncio/base_events.py", line 1025, in create_connection
raise exceptions[0]
File "/usr/local/lib/python3.8/asyncio/base_events.py", line 1010, in create_connection
sock = await self._connect_sock(
File "/usr/local/lib/python3.8/asyncio/base_events.py", line 924, in _connect_sock
await self.sock_connect(sock, address)
File "/usr/local/lib/python3.8/asyncio/selector_events.py", line 496, in sock_connect
return await fut
File "/usr/local/lib/python3.8/asyncio/selector_events.py", line 528, in _sock_connect_cb
raise OSError(err, f'Connect call failed {address}')
ConnectionRefusedError: [Errno 111] Connect call failed ('172.17.0.1', 4200)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/httpcore/_exceptions.py", line 8, in map_exceptions
yield
File "/usr/local/lib/python3.8/site-packages/httpcore/backends/asyncio.py", line 101, in connect_tcp
stream: anyio.abc.ByteStream = await anyio.connect_tcp(
File "/usr/local/lib/python3.8/site-packages/anyio/_core/_sockets.py", line 184, in connect_tcp
raise OSError('All connection attempts failed') from cause
OSError: All connection attempts failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
yield
File "/usr/local/lib/python3.8/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
resp = await self._pool.handle_async_request(req)
File "/usr/local/lib/python3.8/site-packages/httpcore/_async/connection_pool.py", line 253, in handle_async_request
raise exc
File "/usr/local/lib/python3.8/site-packages/httpcore/_async/connection_pool.py", line 237, in handle_async_request
response = await connection.handle_async_request(request)
File "/usr/local/lib/python3.8/site-packages/httpcore/_async/connection.py", line 86, in handle_async_request
raise exc
File "/usr/local/lib/python3.8/site-packages/httpcore/_async/connection.py", line 63, in handle_async_request
stream = await self._connect(request)
File "/usr/local/lib/python3.8/site-packages/httpcore/_async/connection.py", line 111, in _connect
stream = await self._network_backend.connect_tcp(**kwargs)
File "/usr/local/lib/python3.8/site-packages/httpcore/backends/auto.py", line 23, in connect_tcp
return await self._backend.connect_tcp(
File "/usr/local/lib/python3.8/site-packages/httpcore/backends/asyncio.py", line 101, in connect_tcp
stream: anyio.abc.ByteStream = await anyio.connect_tcp(
File "/usr/local/lib/python3.8/contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "/usr/local/lib/python3.8/site-packages/httpcore/_exceptions.py", line 12, in map_exceptions
raise to_exc(exc)
httpcore.ConnectError: All connection attempts failed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/prefect/engine.py", line 951, in <module>
enter_flow_run_engine_from_subprocess(flow_run_id)
File "/usr/local/lib/python3.8/site-packages/prefect/engine.py", line 134, in enter_flow_run_engine_from_subprocess
return anyio.run(retrieve_flow_then_begin_flow_run, flow_run_id)
File "/usr/local/lib/python3.8/site-packages/anyio/_core/_eventloop.py", line 56, in run
return asynclib.run(func, *args, **backend_options)
File "/usr/local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 233, in run
return native_run(wrapper(), debug=debug)
File "/usr/local/lib/python3.8/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/usr/local/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/usr/local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 228, in wrapper
return await func(*args)
File "/usr/local/lib/python3.8/site-packages/prefect/client.py", line 81, in with_injected_client
return await fn(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/prefect/engine.py", line 194, in retrieve_flow_then_begin_flow_run
flow_run = await client.read_flow_run(flow_run_id)
File "/usr/local/lib/python3.8/site-packages/prefect/client.py", line 1132, in read_flow_run
response = await self.get(f"/flow_runs/{flow_run_id}")
File "/usr/local/lib/python3.8/site-packages/prefect/client.py", line 333, in get
response = await self._client.get(route, **kwargs)
File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 1729, in get
return await self.request(
File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 1506, in request
return await self.send(request, auth=auth, follow_redirects=follow_redirects)
File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 1593, in send
response = await self._send_handling_auth(
File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 1621, in _send_handling_auth
response = await self._send_handling_redirects(
File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 1658, in _send_handling_redirects
response = await self._send_single_request(request)
File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 1695, in _send_single_request
response = await transport.handle_async_request(request)
File "/usr/local/lib/python3.8/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
resp = await self._pool.handle_async_request(req)
File "/usr/local/lib/python3.8/contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "/usr/local/lib/python3.8/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.ConnectError: All connection attempts failed
12:28:40.307 | INFO | prefect.flow_runner.docker - Flow run container 'crystal-waxbill' has status 'exited'
After the error, the status of the flow run changes to Pending.
Reproduction / Example
prefect orion start
prefect deployment create ./example-deployment.py
prefect deployment run my-flow/example
prefect deployment inspect my-flow/example
prefect work-queue create -d <DEPLOYMENT-ID> test
prefect agent start <WORK-QUEUE-ID>
About this issue
- State: closed
- Created 2 years ago
- Reactions: 3
- Comments: 16 (10 by maintainers)
By default, we use the “host” network mode on Linux when we detect that you’re running an API on localhost, because it makes it easier for users to get started. In this mode, your container can access anything on the network outside the container. Because of this, we do not change your API URL; it should just work. It looks like something has gone wrong with that, but we haven’t received other reports of it and it works fine in our tests, so it will take some investigation to determine what.
If you use the “bridge” network mode, the container is isolated and cannot reach the host’s localhost. When we detect use of “bridge” with a local API, we need to convert your API URL to use “host.docker.internal”, which lets the container communicate with the host. This is a little more complicated, so we don’t do it by default when the “host” mode is available.
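To make the behavior described above concrete, here is a minimal sketch of that decision in Python. It is not Prefect's actual implementation: the function name `container_api_settings`, its parameters, and the `LOCAL_HOSTS` set are hypothetical and only illustrate the rule (local API plus “host” mode keeps the URL, local API plus any other mode rewrites it to “host.docker.internal”).

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit, urlunsplit

# Hypothetical: hostnames treated as "an API on localhost".
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def container_api_settings(
    api_url: str,
    network_mode: Optional[str] = None,
    platform: str = "linux",
) -> Tuple[Optional[str], str]:
    """Return the (network_mode, api_url) pair a flow-run container would use."""
    parts = urlsplit(api_url)
    if parts.hostname not in LOCAL_HOSTS:
        # Remote API: nothing to adjust.
        return network_mode, api_url

    if network_mode is None and platform == "linux":
        # Assumed default on Linux for a local API: share the host network.
        network_mode = "host"

    if network_mode == "host":
        # The container shares the host's network stack, so the URL is unchanged.
        return network_mode, api_url

    # Isolated network (e.g. "bridge"): point the URL at the Docker host instead.
    netloc = "host.docker.internal"
    if parts.port is not None:
        netloc += f":{parts.port}"
    return network_mode, urlunsplit(parts._replace(netloc=netloc))
```

With these assumptions, `container_api_settings("http://127.0.0.1:4200/api", network_mode="bridge")` yields a URL on `host.docker.internal:4200`, which matches the `PREFECT_API_URL` shown in the container settings logged in this issue.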
Hi! Thanks for the well-written issue.
We’re tracking an update to the tutorial internally at https://github.com/PrefectHQ/orion/issues/1123. This is similar to https://github.com/PrefectHQ/prefect/issues/4963 and https://github.com/PrefectHQ/prefect/pull/5182.
The issue here is that requests from the container are being denied because the API is bound to 127.0.0.1 instead of 0.0.0.0. If you start the API with
prefect orion start --host 0.0.0.0
the container will be able to reach it. This is a difference in container networking on Linux (this is not an issue on macOS). We solved this in V1 of Prefect by including the container in a shared network with the server, but the server is not running in a container right now, so we cannot apply the same fix. I’m not sure what the long-term solution is; it’d be nice to be able to use it without exposing your API via 0.0.0.0.
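As a rough illustration of why the bind address matters, the sketch below classifies an IP literal with Python's standard `ipaddress` module. The function name `bind_scope` and its return labels are made up for this example; it only describes which interfaces a server bound to that address listens on, not anything Prefect does.

```python
import ipaddress


def bind_scope(host: str) -> str:
    """Classify an IP literal by which interfaces a server bound to it listens on."""
    addr = ipaddress.ip_address(host)  # raises ValueError for names like "localhost"
    if addr.is_unspecified:
        # 0.0.0.0 (or ::): every interface, including the Docker bridge.
        return "all interfaces"
    if addr.is_loopback:
        # 127.0.0.1 (or ::1): only processes on the host's own loopback.
        return "loopback only"
    # Any other address: only the interface that owns that address.
    return "single interface"
```

Here `bind_scope("127.0.0.1")` gives "loopback only", which is why the connection to 172.17.0.1:4200 in the traceback is refused, while `bind_scope("0.0.0.0")` gives "all interfaces".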
I believe you can also bind to the Docker host IP, 172.17.0.1, which will allow connections from containers but won’t allow connections from other sources.
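To verify either workaround, a small TCP probe can be run from inside the container (or from the host) before starting the agent. This is a sketch using only the standard library; the helper name `can_connect` is hypothetical, and the host and port you pass are whatever your API URL uses.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For the setup in this issue, calling `can_connect("host.docker.internal", 4200)` inside the flow-run container should return False while the API is bound to 127.0.0.1, and True once it is bound to an address the container can reach.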