uv: Small reads in uv_distribution::distribution_database::download producing very slow download performance when installing torch

Hey, this is a successor to https://github.com/astral-sh/uv/pull/1978 as I’m still trying to integrate uv with Modal’s internal mirror. But this time I don’t yet have a solution to this problem 🙂.

I’ve observed extremely degraded install performance when installing PyTorch against our mirror. The install command is uv pip install torch.

On my test machine, installing without our mirror is fast, taking around 7 seconds.

[screenshot: uv pip install torch timing without the mirror, around 7 seconds]

But when I get our mirror involved, performance tanks and an install takes over 5 minutes. The command I’m running is this:

RUST_LOG="uv=trace,uv_extract=trace,async_zip=debug" \
UV_NO_CACHE=true UV_INDEX_URL="http://172.21.0.1:5555/simple" \
uv --verbose pip install torch

Investigating…

Using debug logs on our mirror I see that there is a vast (~64x) difference in read sizes between uv pip install and pip install.

Our mirror serves .whl files off disk like this:

use hyper::{Body, Response};
use tokio_util::codec::{BytesCodec, FramedRead};

// ... 

let file = tokio::fs::File::open(&filepath).await?;
debug!("cache hit: serving {} from disk cache", filepath.display());
let size = file.metadata().await.ok().map(|m| m.len());
let stream = FramedRead::new(file, BytesCodec::new());
let body = Body::wrap_stream(stream);
let mut response = Response::new(body);
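
The size value computed above is presumably used further down in the handler (not shown here) to advertise the file's length. A hedged sketch of what that might look like with hyper, assuming the handler continues roughly like this:

use hyper::header::{HeaderValue, CONTENT_LENGTH};

// Hypothetical continuation of the handler above: if the file size is known,
// advertise it so clients see a plain Content-Length response.
if let Some(len) = size {
    response
        .headers_mut()
        .insert(CONTENT_LENGTH, HeaderValue::from(len));
}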

By enabling debug logs (RUST_LOG=debug) I can see that when using pip install, read chunks are large:

[screenshot: mirror debug logs during pip install, showing large read chunks]

We can see in the screenshot of the debug logs that reads of up to 128 KiB are happening. Performance of the pip install command is good; the .whl download takes around 3 seconds.

The pip command here was

PIP_TRUSTED_HOST="172.21.0.1:5555" \
PIP_NO_CACHE_DIR=off PIP_INDEX_URL="http://172.21.0.1:5555/simple" \
pip install torch

and pip --version is pip 23.3.1 from /home/ubuntu/modal/venv/lib/python3.11/site-packages/pip (python 3.11).

The read sizes of pip contrast strongly with uv pip install, where I’m seeing read sizes between 11 and 1120 bytes.

[screenshot: mirror debug logs during uv pip install, showing reads of 11–1120 bytes]

I believe these relatively tiny read sizes when downloading are contributing to the slow download performance on the 720.55 MiB torch .whl file.

Looking at the trace logs this is the region of code I’m in: https://github.com/astral-sh/uv/blob/043d72646d37a5d740edb2c68ebd26a78dc5eb08/crates/uv-distribution/src/distribution_database.rs#L161


uv version

uv --version
uv 0.1.14

OS

uname -a
Linux ip-10-1-1-198 5.15.0-1052-aws #57~20.04.1-Ubuntu SMP Mon Jan 15 17:04:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

About this issue

  • State: closed
  • Created 4 months ago
  • Comments: 28 (26 by maintainers)

Most upvoted comments

I tested the fix on my slow large wheel and it works great. Installation time is now fast for that one wheel, and I guess my index server had the same issue. Thanks to both of you for continuing to improve uv’s performance.

By the way, I can get this released pretty quickly.

Makes sense. Looks like I could even check this myself by reading BENCHMARKS.md and going from there?

Think this is enough to make a PR. It’d be two things:

  1. setting Accept-Encoding: identity, in the spirit of https://github.com/pypa/pip/pull/1688
  2. increasing the read buffer from 8 KiB to 128 KiB (see the sketch after this list).
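
A minimal sketch of the second point, assuming a reqwest-style byte stream being written to disk; the function and names here are illustrative, not uv’s actual download path:

use futures::StreamExt;
use tokio::io::{AsyncWriteExt, BufWriter};

// Illustrative only: copy the response body to disk through a 128 KiB buffer
// instead of tokio's 8 KiB BufWriter default, so each filesystem write is larger.
async fn save_wheel(
    response: reqwest::Response,
    dest: &std::path::Path,
) -> anyhow::Result<()> {
    let file = tokio::fs::File::create(dest).await?;
    let mut writer = BufWriter::with_capacity(128 * 1024, file);

    let mut stream = response.bytes_stream();
    while let Some(chunk) = stream.next().await {
        writer.write_all(&chunk?).await?;
    }
    writer.flush().await?;
    Ok(())
}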

I think this is another content-encoding related issue.

I added some extra trace! logs into hyper when it was running encode_headers and deciding on the encoding strategy.

uv

Server::encode head=MessageHead { version: HTTP/1.1, subject: 200, headers: {"content-encoding": "gzip"}, extensions: Extensions }, status=200, body=Some(Unknown), req_method=Some(GET)

pip

Server::encode head=MessageHead { version: HTTP/1.1, subject: 200, headers: {"content-length": "755552143"}, extensions: Extensions }, status=200, body=Some(Unknown), req_method=Some(GET)

Seeing gzip in the headers cued me to think this was just the same issue as https://github.com/astral-sh/uv/pull/1978 showing up in a different place.
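
For reference, a hedged sketch of the kind of header change in question, forcing Accept-Encoding: identity on the wheel request with a reqwest client; this is illustrative only, see the linked PR for uv’s actual change:

use reqwest::header::{HeaderValue, ACCEPT_ENCODING};

// Illustrative only: ask for the bytes as-is so nothing along the way gzips
// a .whl that is effectively already compressed.
async fn fetch_wheel(
    client: &reqwest::Client,
    url: &str,
) -> reqwest::Result<reqwest::Response> {
    client
        .get(url)
        .header(ACCEPT_ENCODING, HeaderValue::from_static("identity"))
        .send()
        .await?
        .error_for_status()
}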

By doing the same header change that’s done in https://github.com/astral-sh/uv/pull/1978, torch installs much, much faster against the mirror!

[screenshot: uv pip install torch against the mirror after the header change]

A 15.64-second download against the mirror is still slower than what uv gets against PyPI (around 10 seconds), so there may be some additional change I can make to bring our mirror’s performance to parity with PyPI. It really should be faster, as it’s serving over the local network.

Checking the pip repository for its history of dealing with content-encoding, I see this: https://github.com/pypa/pip/pull/1688

We don’t use range requests for the wheel download IIRC, just for fetching wheel metadata, so hopefully that’s not a factor here.
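
For completeness, a hedged sketch of what a ranged fetch looks like in general; the byte range below is a placeholder, and this is not uv’s actual metadata logic:

use reqwest::header::RANGE;

// Illustrative only: request just the tail of the archive, where a zip's
// end-of-central-directory record lives; the central directory is what's
// needed to locate entries such as the wheel's METADATA file.
async fn fetch_wheel_tail(
    client: &reqwest::Client,
    url: &str,
) -> reqwest::Result<bytes::Bytes> {
    client
        .get(url)
        .header(RANGE, "bytes=-16384") // placeholder tail size
        .send()
        .await?
        .error_for_status()?
        .bytes()
        .await
}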