openai-python: Reusing AsyncOpenAI client results in openai.APIConnectionError

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • This is an issue with the Python library

Describe the bug

Reusing an AsyncOpenAI client instance across multiple asyncio.gather batches results in an openai.APIConnectionError. Retried requests (whether via the openai library's built-in retries or a backoff decorator) succeed, but the first attempt of the second batch on a reused client always fails.

I suspect that this usage of AsyncOpenAI is not ideal, but the behavior nonetheless feels buggy. Even if reusing the client this way is expected to fail, I don't understand why retries succeed.

Bizarrely, in my application all retries after the initial openai.APIConnectionError result in unending openai.APITimeoutError instead of success, but I am unable to reproduce this outside of the application. However, I strongly suspect the issues are related, since recreating the client for each batch fixes both the initial error and the timeouts.
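
Distilled, the failing pattern is one client shared across two asyncio.run calls, i.e. two separate event loops (the model and prompt below are placeholders; the full snippets follow under Code snippets):

import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY; the connection pool is created lazily

async def batch():
    return await asyncio.gather(
        *(
            client.chat.completions.create(
                model="gpt-3.5-turbo-1106",
                messages=[{"role": "user", "content": "ping"}],
            )
            for _ in range(3)
        )
    )

asyncio.run(batch())  # first event loop: opens pooled connections, succeeds
asyncio.run(batch())  # second event loop: first attempt raises APIConnectionError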

To Reproduce

  1. Create an AsyncOpenAI instance with retries disabled (max_retries=0)
  2. Use client.chat.completions.create to build a list of coroutine objects (any number will do)
  3. Use asyncio.gather (driven by asyncio.run) to get the results of the API calls
  4. Execute steps 2 and 3 again with the same client - the second batch raises the error

Code snippets

import asyncio
import openai
from openai import AsyncOpenAI
import httpx
import backoff

print(f"OpenAI version: {openai.__version__}")

OPENAI_API_KEY = "redacted"

api_params = {
    "temperature": 0.2,
    "max_tokens": 500,
    "model": "gpt-3.5-turbo-1106",
}

messages = [{"role": "user", "content": "What is the capital of Quebec?"}]
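

# asyncio.run() expects a coroutine to drive, so wrap each gathered batch in one.
# (Helper added so the snippets below run as written.)
async def gather_requests(requests):
    return await asyncio.gather(*requests)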


@backoff.on_exception(
    backoff.expo,
    (
        openai.RateLimitError,
        openai.APIStatusError,
        openai.APIConnectionError,
        openai.APIError,
        openai.APITimeoutError,
        openai.InternalServerError,
    ),
)
async def create_request_retry(client, messages, api_params):
    return await client.chat.completions.create(messages=messages, **api_params)


async def create_request_no_retry(client, messages, api_params):
    return await client.chat.completions.create(messages=messages, **api_params)


# No retries, new client for each set of requests - succeeds
def succeed1():
    for i in range(2):
        client = AsyncOpenAI(
            api_key=OPENAI_API_KEY,
            timeout=10.0,
            http_client=httpx.AsyncClient(limits=httpx.Limits(max_keepalive_connections=500, max_connections=100)),
            max_retries=0,
        )
        arequests = []
        for _ in range(5):
            arequests.append(create_request_no_retry(client, messages, api_params))
        responses = asyncio.run(gather_requests(arequests))
        results = [response.choices[0].message.content for response in responses]
        print(f"{i}: {results}")


# Retry using backoff decorator, reuse client - succeeds
def succeed2():
    client = AsyncOpenAI(
        api_key=OPENAI_API_KEY,
        timeout=10.0,
        http_client=httpx.AsyncClient(limits=httpx.Limits(max_keepalive_connections=500, max_connections=100)),
        max_retries=0,
    )
    for i in range(2):
        arequests = []
        for _ in range(5):
            arequests.append(create_request_retry(client, messages, api_params))
        responses = asyncio.run(gather_requests(arequests))
        results = [response.choices[0].message.content for response in responses]
        print(f"{i}: {results}")


# Retry using openai library, reuse client - succeeds
def succeed3():
    client = AsyncOpenAI(
        api_key=OPENAI_API_KEY,
        timeout=10.0,
        http_client=httpx.AsyncClient(limits=httpx.Limits(max_keepalive_connections=500, max_connections=100)),
        max_retries=2,
    )
    for i in range(2):
        arequests = []
        for _ in range(5):
            arequests.append(create_request_no_retry(client, messages, api_params))
        responses = asyncio.run(gather_requests(arequests))
        results = [response.choices[0].message.content for response in responses]
        print(f"{i}: {results}")


# No retries, reuse client - fails
def error():
    client = AsyncOpenAI(
        api_key=OPENAI_API_KEY,
        timeout=10.0,
        http_client=httpx.AsyncClient(limits=httpx.Limits(max_keepalive_connections=500, max_connections=100)),
        max_retries=0,
    )
    for i in range(2):
        arequests = []
        for _ in range(5):
            arequests.append(create_request_no_retry(client, messages, api_params))
        responses = asyncio.run(gather_requests(arequests))
        results = [response.choices[0].message.content for response in responses]
        print(f"{i}: {results}")

OS

macOS

Python version

Python 3.11.5

Library version

openai v1.6.1

About this issue

  • Original URL
  • State: closed
  • Created 6 months ago
  • Reactions: 1
  • Comments: 17

Most upvoted comments

I can report the exact same issue. I instantiate a global AsyncOpenAI client, issue a number of calls asynchronously, and wait for the responses with a gather. Then I do the same thing again and the requests hang forever, never coming back. I have a timeout built into my code, but retries are futile.
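
If the root cause is pooled connections outliving the event loop they were opened on (each asyncio.run call creates and then closes its own loop), a pattern that should avoid both the connection errors and the hangs is to keep the client and every batch on a single loop. This is a sketch under that assumption, not a confirmed fix:

import asyncio
from openai import AsyncOpenAI

async def main():
    # Create the client inside the running loop and keep all batches on that
    # loop, so pooled connections never cross an asyncio.run() boundary.
    client = AsyncOpenAI()
    for i in range(2):
        responses = await asyncio.gather(
            *(
                client.chat.completions.create(
                    model="gpt-3.5-turbo-1106",
                    messages=[{"role": "user", "content": "What is the capital of Quebec?"}],
                )
                for _ in range(5)
            )
        )
        print(i, [r.choices[0].message.content for r in responses])

asyncio.run(main())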

Can confirm that encode/httpcore#880 fixes the issue. To verify:

pip3 uninstall httpcore && \
    pip3 install git+https://github.com/encode/httpcore.git@clean-state-cancellations

Then run the pytest tests above; they seem to pass.

@dumbPy Async is sometimes tricky with pytest, so it would indeed be nice to have this reproduced without pytest, just to have fewer moving parts and a more straightforward failing case.

Our suspicion is that this relates to https://github.com/encode/httpcore/issues/830, which httpx maintainer @tomchristie is currently working on.