full-stack-fastapi-template: sqlalchemy queue pool limit lockup/timeout

While doing some testing to see how this FastAPI + SQLAlchemy template would hold up, my server seemed to lock up when running 100 concurrent requests. If I ran the requests sequentially, it was totally fine.

sqlalchemy.exc.TimeoutError: QueuePool limit of size 5 overflow 10 reached,

Is it possible I’ve made some error while trying to mimic the structure of the code base? Or is it possible the main thread can lock up in the middleware with the SQLAlchemy session implementation?
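For reference, the load test amounted to something like this (a minimal httpx sketch, not the exact script; the endpoint path is a placeholder):

import asyncio

import httpx


async def main() -> None:
    # Fire 100 concurrent GETs at a single DB-backed endpoint; run
    # sequentially they all pass, run concurrently the pool exhausts.
    async with httpx.AsyncClient(base_url="http://localhost:8000") as client:
        responses = await asyncio.gather(
            *(client.get("/") for _ in range(100)),  # "/" is a placeholder path
            return_exceptions=True,
        )
    print([r.status_code if isinstance(r, httpx.Response) else r for r in responses])


asyncio.run(main())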

Has anyone tested the performance of this cookiecutter project?

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 18
  • Comments: 62 (3 by maintainers)


Most upvoted comments

@vumaasha can you please try the solution I described above and see if it helps?

I make heavy use of Depends, and getting rid of it would need a lot of refactoring for me, so it is not an option.

I only tried the part corresponding to setting pool size. It didn’t help.

@jklaw90

That’s interesting! I have a slightly different setup for my app and I couldn’t reproduce the error you have, but I ran into another one. I don’t have middleware and don’t use Depends to get a session. Instead, I create the session directly in the endpoint.

I refactored my app to use Depends and got exactly the same error you did. Strangely enough, with PYTHONASYNCIODEBUG=1 everything works fine.

So, I guess the problem might be in how Depends works, although I haven’t looked into it yet. You might want to try to get rid of the session middleware and Depends and use a custom context manager for the session instead, something like:

from contextlib import contextmanager
from sqlalchemy.orm import sessionmaker

Session = sessionmaker(...)

@contextmanager
def SessionManager():
    session = Session()
    try:
        yield session
    finally:
        session.close()

And then you can use it in your endpoint like this:

@router.get("/")
def hello_world():
    with SessionManager() as db_session:
        ...

So, with a setup like that I also tried to fire 100 concurrent requests and got this error:

sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL:  sorry, too many clients already

However, if I set WEB_CONCURRENCY=1 (this env var specifies the number of workers for gunicorn - in this case it will run gunicorn with a single worker), everything is fine.

I think I got this error because gunicorn spawns several workers (you can see how the number of workers is calculated here). Each worker thinks it has a pool size of 100 available to it, when it actually has db_pool_size / number_of_workers.

So as a fix I created the engine like this:

import os

from sqlalchemy import create_engine

# Divide the total DB pool budget across the gunicorn workers
DB_POOL_SIZE = int(os.getenv("DB_POOL_SIZE", "100"))
WEB_CONCURRENCY = int(os.getenv("WEB_CONCURRENCY", "2"))
POOL_SIZE = max(DB_POOL_SIZE // WEB_CONCURRENCY, 5)

engine = create_engine(config.DATABASE_DSN, pool_size=POOL_SIZE, max_overflow=0)

Note: in the project, WEB_CONCURRENCY is actually calculated in a slightly more complicated way.
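Roughly, the default gunicorn_conf.py derives it from the CPU count (a simplified reconstruction, assuming the defaults of the underlying Docker image; see the real file for details):

import multiprocessing
import os

# Simplified reconstruction of the default worker calculation; the
# project's gunicorn_conf.py is the authoritative version.
cores = multiprocessing.cpu_count()
workers_per_core = float(os.getenv("WORKERS_PER_CORE", "1"))
default_web_concurrency = max(int(workers_per_core * cores), 2)
web_concurrency = int(os.getenv("WEB_CONCURRENCY", default_web_concurrency))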

The problem is gone and everything seems to work fine.

Hi,

Thanks for the discussion everyone!

This was most probably solved in tiangolo/fastapi#5122, released as part of FastAPI 0.82.0 🎉

If this solves it for you, you could close the issue. 🤓

So, can we use the dependency injection itself (Depends(get_db)) if the version is updated to 0.82.0 or above?
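That is, the plain generator dependency again, with no @contextmanager decorator (a minimal sketch; the DSN is a placeholder):

from fastapi import Depends, FastAPI
from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session, sessionmaker

engine = create_engine("postgresql://user:pass@localhost/app")  # placeholder DSN
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)

app = FastAPI()


def get_db():
    # Plain generator dependency: a session is created per request
    # and closed after the response
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()


@app.get("/health")
def health(db: Session = Depends(get_db)):
    db.execute(text("SELECT 1"))
    return {"ok": True}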

@ebreton What was the default connection limit on that Postgres container, and did you also increase the pool size in SQLAlchemy? I understand that would fix this issue, but the main thread shouldn’t get blocked and time out after 30 seconds under any conditions with only 100 concurrent requests. Running all 100 requests sequentially works perfectly fine and finishes in no time.

I don’t know fastapi/starlette and sqlalchemy that well, so I’m a little confused: is the main thread grabbing a session but blocking when the QueuePool is full? If so, is it possible we could see better results by just changing the SQLAlchemy configuration?
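From the SQLAlchemy docs, it looks like the QueuePool does block when exhausted, but only for pool_timeout seconds (30 by default, which would explain the 30-second timeout) before raising the TimeoutError above. A sketch of tuning it (placeholder DSN), though that only changes how long callers wait:

from sqlalchemy import create_engine

# pool_timeout controls how long a checkout blocks on a full pool
# before raising TimeoutError; the 30s default matches the lockup above
engine = create_engine(
    "postgresql://user:pass@localhost/app",  # placeholder DSN
    pool_size=5,
    max_overflow=10,
    pool_timeout=30,
)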

@AlexanderPodorov and @sandys I’ve tried with SQLAlchemy and asyncio and it worked like a charm.

I’m going to write a small example and upload it to my GitHub account in the next few days.

I’ll let you know.

@unmade @stratosgear I have a theory

what if you keep Depends… but put it under a context manager? For example:


from typing import List
from contextlib import contextmanager

from fastapi import Depends, FastAPI, HTTPException
from sqlalchemy.orm import Session

from . import crud, models, schemas
from .database import SessionLocal, engine

models.Base.metadata.create_all(bind=engine)

app = FastAPI()


# Dependency
@contextmanager
def get_db():
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()


@app.post("/users/", response_model=schemas.User)
def create_user(user: schemas.UserCreate, db: Session = Depends(get_db)):
   with db:
       db_user = crud.get_user_by_email(db, email=user.email)
       if db_user:
         raise HTTPException(status_code=400, detail="Email already registered")
       return crud.create_user(db=db, user=user)

maybe the whole issue is happening because a context manager is needed to close the connection before the async coroutine exits.

Sweet! After rerunning my load testing scenarios, I have no extra overhead with the SessionManager.

One other configuration change I have is when I instantiate the SQLAlchemy engine:

engine = create_engine(
    config.SQLALCHEMY_DATABASE_URI, pool_pre_ping=True, pool_size=32, max_overflow=64
)

That ought to be able to utilize all 100 default Postgres connections, but my load tests rarely seem to utilize more than 25-30 connections (now with the SessionManager).

🍺 to @unmade (or 🍰, 🍵 or 🥃 or whatever rocks his boat…)

Ahh thank you for the resources and insight!

I am still a little confused about why, even with the default 100 connections plus a pool size of 5, we would get timeouts rather than just getting responses slowly.

If there is some simple fix I’d love to know; I’m really looking to switch from Falcon to FastAPI.


I have a fully working, SQLAlchemy and pydantic compatible gist here with both sync and async (asyncpg) drivers.

https://gist.github.com/sandys/671b8b86ba913e6436d4cb22d04b135f

You can try it and let me know. I have already benchmarked it, so it should work fine.

hi Anderson, this looks pretty cool. Did you try this - https://gist.github.com/sandys/671b8b86ba913e6436d4cb22d04b135f ?

The one big difference is that I use the new sqlalchemy dataclasses and avoid the double declaration of sqlalchemy vs pydantic models.

I also have included various complex model features like indexes, foreign keys, mixins, etc.
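For flavor, the mapped-dataclass style looks roughly like this (a generic SQLAlchemy 1.4 sketch, not code from the gist):

from dataclasses import dataclass, field

from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import registry

mapper_registry = registry()


@mapper_registry.mapped
@dataclass
class User:
    __tablename__ = "users"
    __sa_dataclass_metadata_key__ = "sa"

    # One class serves as both the ORM model and a plain dataclass,
    # avoiding a separate sqlalchemy/pydantic double declaration
    id: int = field(init=False, metadata={"sa": Column(Integer, primary_key=True)})
    email: str = field(default="", metadata={"sa": Column(String(255), unique=True, index=True)})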

regards sandeep


Hey guys, I made a benchmark using SQLAlchemy Sync and Async.

The code and the results are in the link below:

https://github.com/andersonrocha0/fastApiAndSqlAlchemySyncVsAsync

Sure…

So I got rid of the Depends(get_db) in the API method and I am using:

@router.get("/cleanup/{jobid}", response_model=schemas.Job, status_code=200)
def cleanup(response: Response, jobid: int):
    """ some comment"""

    with SessionManager() as db:
        jobRow = crud.job.get(db=db, id=jobid)
        ...

so my session is provided by SessionManager as db, and I use that from that point on.

SessionManager, pulled straight out of my repo, is defined externally as:

import warnings
from contextlib import contextmanager

from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

from rapidxmm.core import config

# How to echo all Sqlalchemy sql statements
# engine = create_engine(config.SQLALCHEMY_DATABASE_URI, pool_pre_ping=True, echo=True)

# Increase connection pool sizes to support more than 100 concurrent users.
# After some failed tests while testing 100 concurrent users on a DB with
# pointed at nside 15 and slew at nside 14.
# defaults were: pool_size=5, max_overflow=10
# https://docs.sqlalchemy.org/en/13/errors.html#error-3o7r
engine = create_engine(
    config.SQLALCHEMY_DATABASE_URI, pool_pre_ping=True, pool_size=32, max_overflow=64
)
db_session = scoped_session(
    sessionmaker(autocommit=False, autoflush=False, bind=engine)
)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)


# SessionManager is an alternative way to get a DB connection,
# instead of the FastAPI Depends that seems to quickly
# exhaust the Postgres open connections:
# https://github.com/tiangolo/full-stack-fastapi-postgresql/issues/104
@contextmanager
def SessionManager():
    db = SessionLocal()
    try:
        yield db
    except:
        # if we fail somehow rollback the connection
        warnings.warn("We somehow failed in a DB operation and auto-rollbacking...")
        db.rollback()
        raise
    finally:
        db.close()

Embarrassingly enough, I do not know the exact gunicorn settings. I am using the standard main.py file for instantiating my app, and I let it boot up with the default settings as they are defined in this cookiecutter skeleton repo… 😦

Hi @thebleucheese, thanks for your feedback on @sandys’ code. The documentation discourages decorating dependencies with @contextmanager/@asynccontextmanager (see the very last tip here). It looks like this is because it breaks dependency resolution, based on your experience.

Do you have any recommendations on how to set up an Asyncio SQLAlchemy Session in production, based on your experiences?

@thebleucheese did you try my code? Would love any feedback if you have any.

Yes, sorry it took me a while to get back to this. I had to move on to other issues I had with my project on the front-end.

I’ve encountered an issue with this approach when using multiple dependencies that rely on the DB session.

The following works fine in isolation.

@app.get("/", response_model=List[UserPyd])
async def foo(context_session: AsyncSession = Depends(get_db)):
    async with context_session as db:
        # code that uses db (AsyncSession)

If I add a dependency on another dependency that uses get_db I get this error:

File "/usr/lib/python3.8/contextlib.py", line 173, in __aenter__
    raise RuntimeError("generator didn't yield") from None

async def bar(db_session: AsyncSession = Depends(get_db)):
    async with db_session as db:
        # use db to check user permissions or some other action

@app.get("/", response_model=List[UserPyd])
async def foo(current_user: User = Depends(bar), context_session: AsyncSession = Depends(get_db)):
    async with context_session as db:
        # code that uses db (AsyncSession)

I’m not running the exact gist but the code is similar enough.

When I switch to a simplified setup without the contextmanager I don’t see this error:

from typing import AsyncIterator

from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine
from sqlalchemy.orm import sessionmaker

engine = create_async_engine(
    DATABASE_URL,  # defined elsewhere in my config
    pool_pre_ping=True,
    echo=True,
)
_async_session = sessionmaker(engine, class_=AsyncSession, expire_on_commit=False)

async def get_db() -> AsyncIterator[AsyncSession]:
    db: AsyncSession = _async_session()
    try:
        yield db
        await db.commit()
    except Exception:
        await db.rollback()
        raise  # re-raise so the endpoint still sees the failure
    finally:
        await db.close()
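With that, endpoints take the session straight from Depends, with no async with (continuing the snippet above; User and the query are illustrative):

from fastapi import Depends, FastAPI
from sqlalchemy import select

app = FastAPI()


@app.get("/users")
async def list_users(db: AsyncSession = Depends(get_db)):
    # db is already a live AsyncSession here, not a context manager
    result = await db.execute(select(User))  # User: illustrative ORM model
    return result.scalars().all()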

@abrichr, to your questions: In regards to background_tasks, this is a standard FastAPI feature. It allows you to add any function that will be executed after the response is sent. In my case I’ve added the close_session function and passed the session object as a positional argument. In regards to get_settings, this just returns my custom Settings class. In particular, this class has a session_maker attribute, actually returned by SQLAlchemy’s sessionmaker factory function, which I’m using to initiate new DB sessions. I found it useful to store SQLAlchemy’s Engine instance on my custom Settings class to take advantage of dependency injection features.
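Roughly, the wiring looks like this (a simplified sketch; Settings here is a stand-in for my real class, and the DSN is a placeholder):

from fastapi import BackgroundTasks, Depends, FastAPI
from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session, sessionmaker


class Settings:
    # Stand-in for the real Settings class described above
    engine = create_engine("postgresql://user:pass@localhost/app")  # placeholder DSN
    session_maker = sessionmaker(bind=engine)


def get_settings() -> Settings:
    return Settings()


app = FastAPI()


def close_session(session: Session) -> None:
    session.close()  # runs only after the response has been sent


@app.get("/ping-db")
def ping_db(
    background_tasks: BackgroundTasks,
    settings: Settings = Depends(get_settings),
):
    session = settings.session_maker()
    # Schedule the close so the connection returns to the pool post-response
    background_tasks.add_task(close_session, session)
    return {"result": session.execute(text("SELECT 1")).scalar()}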

@vumaasha Were you able to solve your issue? If yes, how? We are facing the same issue in our project.

The problem is that there is a lockup/timeout in SQLAlchemy. The session maker should block and wait, so slow responses would be acceptable over no responses. Updating the pool size just hides the underlying issue and isn’t a solution I would be willing to use in production.

@tiangolo we are also hitting the Depends issue in production. Is there a long-term, production-ready fix here?

@sandys these settings seem to help for me. I am using a higher number for max_overflow and a smaller number for the pool. Only pool_size connections are maintained; other connections are released as soon as possible. This does seem to help mitigate the problem:

engine = create_engine(config.SQLALCHEMY_DATABASE_URI, pool_pre_ping=True, pool_size=30, max_overflow=120)
