urllib3: Retry doesn't kick in for DNS errors

I’m not sure if I’m doing something wrong or if I’ve misinterpreted the documentation.

I’m trying to handle intermittent failures where our AWS-hosted domain fails to resolve.

I’m running a suite of 30 Selenium tests with lots of moving parts: the app is hosted on AWS, and the tests are executed by Bitbucket CI (aka Pipelines) using docker-in-docker and docker-to-docker.

Intermittently, a couple of seemingly random tests fail with:

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='pr-5fdb8ad6.dev.dporganizer.com', port=8001): Max retries exceeded with url: /users/login (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7feb677b1be0>: Failed to establish a new connection: [Errno -2] Name does not resolve'))

Now I know the app is running and is accessible, because the other tests work fine (they all hit the same endpoint).

I also have a little function that runs before any tests, that makes sure that the app is accessible via all of AWS name servers (for our domain). https://gist.github.com/BeyondEvil/6f75a446cac23832808c8bbaa8186a90

So, I figured I would look into urllib3’s Retry object and logic, to at least solve the symptoms - since troubleshooting this is a nightmare given all the moving parts.

So, this is what I did:

from requests.sessions import Session
from requests.adapters import HTTPAdapter, Retry

with Session() as session:
    retry = Retry(connect=5, backoff_factor=1.0)
    session.mount(
        base_url.split(':')[0],  # for a URL like 'https://host:8001' this evaluates to just 'https'
        HTTPAdapter(
            max_retries=retry
        )
    )
    response = session.request(**signature)

Well, the code works and the tests run, but there don’t seem to be any retries happening. The exception happens, and I’m expecting it to now be wrapped in MaxRetryError, but it’s not.

Requests is at 2.20.1, urllib3 at 1.24.1, and Python at 3.7.0.

What am I doing wrong?

How can I validate that retries are actually happening?
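One way to check this, offered as a sketch rather than a definitive answer: urllib3 logs a warning beginning with "Retrying" from its `urllib3.connectionpool` logger each time it retries a request, so capturing that logger should show whether any retries ran. The `RetryLogCollector` and `watch_urllib3_retries` names below are hypothetical helpers, and the exact message prefix is an assumption that may differ between urllib3 versions.

```python
import logging


class RetryLogCollector(logging.Handler):
    """Hypothetical helper: keeps urllib3 log messages that announce a retry."""

    def __init__(self):
        super().__init__(level=logging.DEBUG)
        self.messages = []

    def emit(self, record):
        message = record.getMessage()
        # Assumption: retry warnings start with the word "Retrying".
        if message.startswith("Retrying"):
            self.messages.append(message)


def watch_urllib3_retries():
    """Attach a collector to urllib3's connection pool logger and return it."""
    collector = RetryLogCollector()
    logger = logging.getLogger("urllib3.connectionpool")
    logger.setLevel(logging.DEBUG)
    logger.addHandler(collector)
    return collector
```

Call `watch_urllib3_retries()` before `session.request(...)` and inspect `collector.messages` afterwards; an empty list would suggest that no retry was attempted.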

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 21 (9 by maintainers)

Most upvoted comments

DNS resolution happens inside create_connection(). From what I’m reading, the Retry class checks whether the exception is a ConnectTimeoutError (or a subclass of it). NewConnectionError subclasses ConnectTimeoutError, so in theory Retry treats NewConnectionError as retryable. There may be a test somewhere for this situation.

Yeah, they should be wrapped by that exception. Your code looks correct at first glance. Report back your findings.

This could also be the problem:

>>> import socket
>>> isinstance(socket.gaierror, socket.error)
False

Maybe we should be catching SocketError and socket.gaierror within HTTPConnection._new_conn and transforming both into NewConnectionError?
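A note on that check, as a small sketch: `isinstance()` there is given the class `socket.gaierror` itself rather than an exception instance, so it returns False regardless of the class hierarchy. On Python 3, `socket.error` is an alias of `OSError` and `socket.gaierror` is one of its subclasses, which `issubclass()` shows. The error number and message used to build the sample exception are illustrative only.

```python
import socket

# On Python 3.3+, socket.error is simply another name for OSError.
error_is_oserror = socket.error is OSError

# issubclass() compares the two classes, which is the hierarchy question.
gaierror_is_subclass = issubclass(socket.gaierror, socket.error)

# isinstance() only makes sense with an actual exception instance.
sample = socket.gaierror(-2, "Name does not resolve")  # illustrative values
sample_is_socket_error = isinstance(sample, socket.error)

print(error_is_oserror, gaierror_is_subclass, sample_is_socket_error)
```

So on Python 3 an `except socket.error` (or `except OSError`) clause would already catch `socket.gaierror`.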