prefect: ValueError: Path does not exist

First check

  • I added a descriptive title to this issue.
  • I used the GitHub search to find a similar issue and didn’t find it.
  • I searched the Prefect documentation for this issue.
  • I checked that this issue is related to Prefect and not one of its dependencies.

Bug summary

I frequently get the error raised by raise ValueError(f"Path {path} does not exist.") in prefect/filesystems.py, and nothing in the output tells me why it happens.

Reproduction

N/A

Error

Encountered exception during execution:
Traceback (most recent call last):
  File "./.venv/lib/python3.10/site-packages/prefect/engine.py", line 595, in orchestrate_flow_run
    result = await run_sync(flow_call)
  File "./.venv/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 57, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(call, cancellable=True)
  File "./.venv/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "./.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "./.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "././flows/etl/flow.py", line 81, in myflow
    val = myflow(
  File "./.venv/lib/python3.10/site-packages/prefect/flows.py", line 384, in __call__
    return enter_flow_run_engine_from_flow_call(
  File "./.venv/lib/python3.10/site-packages/prefect/engine.py", line 162, in enter_flow_run_engine_from_flow_call
    return run_async_from_worker_thread(begin_run)
  File "./.venv/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 137, in run_async_from_worker_thread
    return anyio.from_thread.run(call)
  File "./.venv/lib/python3.10/site-packages/anyio/from_thread.py", line 49, in run
    return asynclib.run_async_from_thread(func, *args)
  File "./.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 970, in run_async_from_thread
    return f.result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "./.venv/lib/python3.10/site-packages/prefect/client.py", line 103, in with_injected_client
    return await fn(*args, **kwargs)
  File "./.venv/lib/python3.10/site-packages/prefect/engine.py", line 442, in create_and_begin_subflow_run
    flow_run.state.data._cache_data(await _retrieve_result(flow_run.state))
  File "./.venv/lib/python3.10/site-packages/prefect/results.py", line 38, in _retrieve_result
    serialized_result = await _retrieve_serialized_result(state.data)
  File "./.venv/lib/python3.10/site-packages/prefect/client.py", line 103, in with_injected_client
    return await fn(*args, **kwargs)
  File "./.venv/lib/python3.10/site-packages/prefect/results.py", line 34, in _retrieve_serialized_result
    return await filesystem.read_path(result.key)
  File "./.venv/lib/python3.10/site-packages/prefect/filesystems.py", line 183, in read_path
    raise ValueError(f"Path {path} does not exist.")
ValueError: Path /root/.prefect/storage/6543f194d24f4ad89685f79e67227492 does not exist.

Versions

Version:             2.3.1
API version:         0.8.0
Python version:      3.10.7
Git commit:          1d485b1d
Built:               Thu, Sep 1, 2022 3:53 PM
OS/Arch:             linux/x86_64
Profile:             default
Server type:         cloud

Additional context

No response

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 3
  • Comments: 18 (7 by maintainers)

Most upvoted comments

I found one possible reproduction, related to the task cache.

Infrastructure

  • self-hosted orion API on k8s
  • self-hosted agent on k8s
  • Docker image containing the flow code

Flow (flow.py)

from prefect import task, flow, get_run_logger
from prefect.tasks import task_input_hash

@task(cache_key_fn=task_input_hash)
def cached_task():
    return "cached value"

@flow
def test_cache():
    logger = get_run_logger()

    val = cached_task()
    logger.info(val)

if __name__ == "__main__":
    test_cache()

Test1

If we deploy this flow (register it with the orion API and push the Docker image) and run it once, it succeeds. If we run it a second time, we hit the error:

...
raise ValueError(f"Path {path} does not exist.")
ValueError: Path /root/.prefect/storage/962e40aed4b6451a8f61a34b69e00c9c does not exist.

My guess is that Prefect fetches the cached task result's path from the API (the pod receives the PREFECT_API_URL env var) and then cannot resolve that path locally, because a fresh k8s pod does not have the file the earlier pod wrote.
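
To make the guess above concrete, here is a minimal stand-alone sketch of the suspected failure mode. It does not use Prefect at all: api_cache, read_path, run_cached_task and simulate are hypothetical stand-ins, and each temporary directory plays the role of one pod's local /root/.prefect/storage. It only assumes that the API remembers a local path for a cache key while the file itself stays on the pod that wrote it.

```python
import tempfile
import uuid
from pathlib import Path


def read_path(path):
    # Hypothetical stand-in for the check that raises in the traceback.
    if not path.exists():
        raise ValueError(f"Path {path} does not exist.")
    return path.read_text()


def run_cached_task(api_cache, storage_dir, cache_key="cached_task-input-hash"):
    """Simulate one flow run in a pod whose local storage is storage_dir."""
    if cache_key in api_cache:
        # Cache hit: the API hands back a path written by an earlier pod.
        return read_path(api_cache[cache_key])
    # Cache miss: compute the value, store it locally, record the path.
    result_path = Path(storage_dir) / uuid.uuid4().hex
    result_path.write_text("cached value")
    api_cache[cache_key] = result_path
    return "cached value"


def simulate(runs=2):
    # The dict outlives the pods, like state kept by the API server
    # (an assumption about where the cache record lives).
    api_cache = {}
    outcomes = []
    for _ in range(runs):
        # Each run gets fresh storage that is deleted afterwards,
        # like a new k8s pod or a new `docker run` container.
        with tempfile.TemporaryDirectory() as pod_storage:
            try:
                outcomes.append(run_cached_task(api_cache, pod_storage))
            except ValueError as exc:
                outcomes.append(f"{type(exc).__name__}: {exc}")
    return outcomes


if __name__ == "__main__":
    for outcome in simulate():
        print(outcome)
```

In this sketch the first run succeeds and the second fails with the same kind of message as above, which matches the observed behaviour. It says nothing about how Prefect actually stores or looks up cached results internally.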

Test2

I also ran the flow locally in Docker and hit the same error whenever PREFECT_API_URL is defined in the container.

Dockerfile used:

FROM python:3.8.13-slim-bullseye
RUN pip install prefect==2.4.1
WORKDIR /project
COPY flow.py /project/
ENV PREFECT_API_URL=https://my-public-hostname/api

Command used:

docker build --tag prefect-test-cache . && docker run prefect-test-cache python -m flow

If I remove the ENV instruction, the flow works every time.