pip: Skipping page https://pypi.org/simple/idna/ because of Content-Type: unknown

  • Pip version: 10.0.1
  • Python version: Python 3.5
  • Operating system: MacOS

Description:

We’ve been hitting an intermittent error when downloading packages from PyPI using pip 10.0.1. So far it seems to hit only our macOS, Python 3.5 CI system (not Python 3.6 on macOS, and not any version on other OSes). The package varies, but we always get an error like

  Could not find a version that satisfies the requirement pyOpenSSL (from -r test-requirements.txt (line 4)) (from versions: )

or

  Could not find a version that satisfies the requirement pytest>=3.3 (from -r test-requirements.txt (line 1)) (from versions: )

We finally managed to trigger the error with pip’s -vvv flag enabled, and it said:

  Looking up "https://pypi.org/simple/idna/" in the cache
  Current age based on date: 31182
  Freshness lifetime from max-age: 600
  Freshness lifetime from request max-age: 600
  https://pypi.org:443 "GET /simple/idna/ HTTP/1.1" 304 0
  Skipping page https://pypi.org/simple/idna/ because of Content-Type: unknown
  Could not find a version that satisfies the requirement idna (from trio==0.4.0+dev) (from versions: )

(Full log here)

So to my ignorant eye, it looks like:

  • pip found a cache entry for the https://pypi.org/simple/idna/ page
  • It asked PyPI if the cache was accurate, and PyPI said “304 Not Modified”, i.e., yeah, go ahead, use your cache
  • Then pip somehow decided that… the cached entry, maybe, had “Content-Type: unknown” and freaked out?
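The sequence in the bullets above can be sketched as a toy model. This is not pip's actual code: `cached_response_headers` and `content_type_for` are hypothetical names, and the sketch assumes that a 304 reply carries no body, so the Content-Type has to come from the cached entry, with "unknown" as the fallback when that header is missing.

```python
# Toy model of serving a page from cache after a 304 revalidation.
# All names here are hypothetical; this is not pip's implementation.

def cached_response_headers(cache, url, status):
    """Return the headers the client would act on for this response."""
    if status == 304:
        # 304 Not Modified: reuse whatever the cache stored for this URL.
        return cache.get(url, {})
    raise ValueError("this sketch only models the 304 path")


def content_type_for(headers):
    # Fall back to "unknown" when the header is absent, as in the log line.
    return headers.get("Content-Type", "unknown")


url = "https://pypi.org/simple/idna/"
good_cache = {url: {"Content-Type": "text/html"}}
bad_cache = {url: {}}  # assumed: a truncated or corrupted cache entry

for cache in (good_cache, bad_cache):
    ctype = content_type_for(cached_response_headers(cache, url, 304))
    verdict = "parse" if ctype.lower().startswith("text/html") else "skip"
    print(verdict, ctype)
```

If something like this is what happens, the interesting question is how the cached entry lost its headers in the first place.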

I don’t know how to debug further, so… help?

(our bug: https://github.com/python-trio/trio/issues/508)

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 3
  • Comments: 34 (28 by maintainers)

Most upvoted comments

I think I found a temporary workaround for avoiding this bug on Jenkins slaves with multiple workers: define XDG_CACHE_HOME=$HOME/.cache/$EXECUTOR_NUMBER in the master configuration.

This means that each executor gets its own cache directory, so executors no longer clash with each other. Because the number of executors is limited, the extra disk space needed stays bounded.
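For setups where the variable cannot be set in the Jenkins configuration, the same idea can be sketched in a wrapper script. This is only an illustration: `per_executor_env` and its `default_executor` fallback are hypothetical, and the sketch assumes Jenkins has exported EXECUTOR_NUMBER into the build environment.

```python
import os
from pathlib import Path


def per_executor_env(base_env, home, default_executor="0"):
    """Copy base_env and point XDG_CACHE_HOME at a per-executor directory."""
    env = dict(base_env)
    # EXECUTOR_NUMBER comes from Jenkins; default_executor is a hypothetical
    # fallback for runs outside Jenkins.
    executor = env.get("EXECUTOR_NUMBER", default_executor)
    env["XDG_CACHE_HOME"] = str(Path(home) / ".cache" / executor)
    return env


env = per_executor_env(os.environ, Path.home())
print(env["XDG_CACHE_HOME"])
```

The returned mapping could then be passed as `env=` to `subprocess.run` when invoking pip, so each executor reads and writes only its own cache.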

I hope this comment is not taken as a reason to delay fixing this issue, as this bug really does break CI/CD systems.

If I had a reliable way to reproduce this, I’d modify my pip so that when it happened, it dumped a lot more information about what exactly pip thinks it’s looking at (like, all the headers and body). I suspect that seeing the bad data would help us narrow down where it was coming from.
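A rough idea of the kind of dump meant here, as a standalone helper rather than a patch to pip. `describe_response` is a hypothetical name, and where it would be called from inside pip is left open; it only formats whatever status, headers, and body it is handed.

```python
def describe_response(url, status, headers, body, max_body=200):
    """Format status, all headers, and the start of the body for a debug log."""
    lines = ["URL: {}".format(url), "Status: {}".format(status)]
    if not headers:
        # An empty header set is exactly the case worth making visible.
        lines.append("Headers: (none)")
    for name in sorted(headers):
        lines.append("Header {}: {}".format(name, headers[name]))
    lines.append("Body length: {}".format(len(body)))
    lines.append("Body start: {!r}".format(body[:max_body]))
    return "\n".join(lines)


print(describe_response("https://pypi.org/simple/idna/", 304, {}, b""))
```

It avoids f-strings so that it would also run on the Python 3.5 system where the failure shows up.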