uv: Repeated timeouts in GitHub Actions fetching wheel for large packages

In the few days since switching to uv, I have seen download errors that I never saw with pip.

I see:

error: Failed to download distributions
  Caused by: Failed to fetch wheel: torch==2.2.1
  Caused by: Failed to extract source distribution
  Caused by: request or response body error: operation timed out
  Caused by: operation timed out
Error: Process completed with exit code 2.

I see this on the CI for vws-python-mock, which requires installing 150 packages:

uv pip install --upgrade --editable .[dev]
...
Resolved 150 packages in 1.65s
Downloaded 141 packages in 21.41s
Installed 150 packages in 283ms

I do this in parallel across many jobs on GitHub Actions, mostly on ubuntu-latest.

This happened with torch 2.2.0 before the recent release of torch 2.2.1. It has not happened with any other dependencies. The wheels for torch are pretty huge: https://pypi.org/project/torch/#files.

uv is always at the latest version because I install it with curl -LsSf https://astral.sh/uv/install.sh | sh. In the most recent example, this was uv 0.1.9.
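
For reference, the relevant CI step is roughly the following (a sketch of how the commands above fit together; the actual vws-python-mock workflow may differ):

# A sketch of the CI step described above (not the exact vws-python-mock
# workflow): install the latest uv, then install the project with its dev
# extras. PATH and virtual-environment setup are omitted here.
curl -LsSf https://astral.sh/uv/install.sh | sh
uv pip install --upgrade --editable .[dev]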

Failures:

About this issue

  • State: closed
  • Created 4 months ago
  • Reactions: 7
  • Comments: 18 (7 by maintainers)

Most upvoted comments

I have changed the title of this issue so that it no longer references torch. It recently happened with nvidia-cudnn-cu12, another large download.

As another example, https://github.com/VWS-Python/vws-python-mock/actions/runs/8262236134 has 7 failures in one run.

Going to close this for now, but we can re-open it if this comes up again now that the timeout semantics have changed.
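
For anyone still hitting this, one workaround is to raise uv's request timeout explicitly. Recent uv versions read the UV_HTTP_TIMEOUT environment variable; the exact variable name and default may depend on the uv version, so treat this as a sketch:

# Workaround: raise uv's per-request timeout (value is in seconds).
# UV_HTTP_TIMEOUT is read by recent uv versions; check your version's docs.
export UV_HTTP_TIMEOUT=300
uv pip install --upgrade --editable .[dev]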

I have encountered this problem when using either uv or pip to download large wheels (for pip, see https://github.com/pypa/pip/issues/4796 and https://github.com/pypa/pip/issues/11153), so I think the root cause is the network. However, I wonder whether uv could be smarter and retry automatically, along the lines of https://github.com/pypa/pip/pull/11180.
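
As a stopgap until uv retries on its own, the install can be wrapped in a retry loop in the CI step. This is a sketch: the retry count and sleep are arbitrary choices of mine, not anything uv provides.

# Retry the install a few times to ride out transient download timeouts,
# and fail the job only if every attempt fails.
for attempt in 1 2 3; do
    uv pip install --upgrade --editable .[dev] && break
    if [ "${attempt}" -eq 3 ]; then
        echo "uv pip install failed after ${attempt} attempts" >&2
        exit 1
    fi
    echo "uv pip install failed (attempt ${attempt}); retrying in 10s..." >&2
    sleep 10
done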

It can happen on Read the Docs as well, not only GHA: https://beta.readthedocs.org/projects/kedro-datasets/builds/23790543/

I’ll close it for now; please feel free to reopen should it recur.

In #1921, my co-worker noted that this might be a bug in the way we’re specifying the timeout, so I’ll recategorize this one and leave it open.

Thanks for the feedback; I’ve opened issues for your requests.