requests: Server redirect containing neither body nor content-length header can result in delays/timeouts

When requests v2.28.1 follows an HTTP redirect, it attempts to consume the content of the server response, and then closes the response, allowing the socket to be re-used for other requests.

This becomes a problem when a server sends an HTTP redirect (say, HTTP 302) with no body and no Content-Length header (seemingly an optional header, particularly for responses that carry no body). In that case requests can get stuck, or time out if a timeout is configured, while attempting to read the body (more specifically, while iterating over the body contents).

Expected Result

I’m not an expert in this area, so I’m not completely certain. It might be possible to detect the absence of a Content-Length header during the redirect logic and still attempt to read the content (to empty the socket buffer), but skip immediately when Content-Length is missing. In other words: the server didn’t say there is any body content, and a brief check doesn’t find any, so consider the content already consumed.
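The brief-check idea could be sketched roughly as follows. This is only an illustration of the proposal, not code from requests or urllib3: the function names are hypothetical, and a zero-wait `select` can only report that nothing has arrived yet, not that nothing ever will.

```python
import select
import socket


def has_pending_data(sock: socket.socket, wait_seconds: float = 0.0) -> bool:
    """Return True if the socket becomes readable within wait_seconds.

    Readable means bytes are buffered or the peer has closed the connection.
    """
    readable, _, _ = select.select([sock], [], [], wait_seconds)
    return bool(readable)


def redirect_body_can_be_skipped(headers: dict, sock: socket.socket) -> bool:
    """Hypothetical policy: skip the body when no length is declared
    and nothing is waiting on the socket right now."""
    names = {name.lower() for name in headers}
    declared = "content-length" in names or "transfer-encoding" in names
    return not declared and not has_pending_data(sock)
```

A caller would only treat the redirect body as consumed when `redirect_body_can_be_skipped` returns True, and fall back to a normal read otherwise.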

Actual Result

When a timeout is configured for the request, then that timeout will be reached.

If no timeout is configured, then the request may remain open indefinitely.

Reproduction Steps

# Sample code derived from:
#   https://docs.python.org/3/library/http.server.html#http.server.SimpleHTTPRequestHandler
#   and sphinx: https://github.com/sphinx-doc/sphinx.git
from contextlib import contextmanager
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from threading import Thread

import requests


class RedirectHandler(SimpleHTTPRequestHandler):
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        if self.path == "/origin":
            # Redirect with no body and, deliberately, no Content-Length header.
            self.send_response(302, "Found")
            self.send_header("Location", "http://127.0.0.1:8000/destination")
            self.end_headers()

        if self.path == "/destination":
            self.send_response(200, "OK")
            self.end_headers()


class RedirectServer(Thread):
    def __init__(self):
        super().__init__()
        self.server = ThreadingHTTPServer(("127.0.0.1", 8000), RedirectHandler)

    def run(self):
        self.server.serve_forever(poll_interval=0.001)

    def close(self):
        self.server.shutdown()


@contextmanager
def redirect_server():
    server = RedirectServer()
    server.start()
    try:
        yield server
    finally:
        server.close()


with redirect_server():
    requests.get(url="http://127.0.0.1:8000/origin", allow_redirects=True, timeout=1)
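
One server-side way to avoid the stall, in line with the maintainer comments further down, is to declare the empty body explicitly. The sketch below is a hypothetical variant of the reproduction handler (the class name `FixedRedirectHandler` and the relative `Location` value are assumptions, not part of the original report). It sends `Content-Length: 0` on both responses so that a client knows there is nothing left to read.

```python
from http.server import BaseHTTPRequestHandler


class FixedRedirectHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        if self.path == "/origin":
            self.send_response(302, "Found")
            # Relative target, so the handler does not depend on a fixed port.
            self.send_header("Location", "/destination")
            # Declare the empty body so clients do not wait for a close.
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.path == "/destination":
            self.send_response(200, "OK")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Suppress per-request logging to keep the output quiet.
        pass
```

Substituting this class for `RedirectHandler` in the script above should let the `requests.get` call return instead of waiting for the connection to close.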

System Information

$ python -m requests.help
{
  "chardet": {
    "version": "5.1.0"
  },
  "charset_normalizer": {
    "version": "3.0.1"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "3.3"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.11.2"
  },
  "platform": {
    "release": "6.1.0-7-amd64",
    "system": "Linux"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.28.1"
  },
  "system_ssl": {
    "version": "30000080"
  },
  "urllib3": {
    "version": "1.26.12"
  },
  "using_charset_normalizer": false,
  "using_pyopenssl": false
}

Edit: use a requests.Session object in the sample code

Edit #2: undo the previous edit

About this issue

  • State: closed
  • Created a year ago
  • Comments: 15 (5 by maintainers)

Most upvoted comments

Hi @jayaddison,

If the server doesn’t send an appropriate Content-Length or Transfer-Encoding header in the response, it’s difficult for clients to know when the response is complete.

What Requests is doing here is correct to spec. We will try to read until the server closes the connection. I don’t think we want to broadly change this as it’s likely to break existing cases.

Reference: RFC 7230, Section 3.3.3 (this should also be reflected in RFC 9110):

  1. Otherwise, this is a response message without a declared message body length, so the message body length is determined by the number of octets received prior to the server closing the connection.
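
To make the quoted rule concrete, here is a simplified sketch of how a client might pick the framing for a response body, loosely following the ordering in RFC 7230, Section 3.3.3. The function name and the returned labels are hypothetical, and this is not how requests or urllib3 implement it.

```python
def response_body_framing(method: str, status: int, headers: dict) -> str:
    """Return a label describing how the response body length is determined."""
    h = {name.lower(): value.strip() for name, value in headers.items()}
    method = method.upper()

    # Responses to HEAD, and 1xx/204/304 responses, never carry a body.
    if method == "HEAD" or 100 <= status < 200 or status in (204, 304):
        return "no-body"
    # A successful CONNECT switches the connection to a tunnel.
    if method == "CONNECT" and 200 <= status < 300:
        return "tunnel"
    # Transfer-Encoding takes precedence over Content-Length.
    if "transfer-encoding" in h:
        codings = [c.strip().lower() for c in h["transfer-encoding"].split(",")]
        return "chunked" if codings[-1] == "chunked" else "until-close"
    if "content-length" in h:
        value = h["content-length"]
        if not (value.isascii() and value.isdigit()):
            return "invalid"
        return f"content-length:{int(value)}"
    # No declared length: the body runs until the server closes the connection.
    return "until-close"
```

For the redirect in this report, `response_body_framing("GET", 302, {"Location": "/destination"})` falls through to the last branch, which is why the client keeps reading until the connection closes or a timeout fires.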

It’s a response, not a request message.

I agree with @nateprewitt heartily. As such I’m inclined to close this