sqlalchemy: add execution option for asyncpg_timeout
Describe the use case
In our company we use PostgreSQL, and one of our tables is quite big, with over 1.5B records. We have a few queries constructed with SQLAlchemy Core that we run against it, and with the help of indexes they are usually quite fast (hundreds of milliseconds at most). Lately we discovered an issue with PostgreSQL's query planner: it would omit an index and run a plan that stalled the connection for multiple days. After a particular API request had run many times, our connection pool got exhausted and the application basically crashed.
A natural solution to prevent this type of issue would be to implement some sort of timeout, whose value would depend on the expected execution time of the given query. There are of course multiple ways to achieve this:
1. Set `statement_timeout` on the current connection, in the case of PostgreSQL.
2. Wrap `AsyncConnection.execute()` in `asyncio.wait_for()`.
3. Add an option to pass a `timeout` argument to the `asyncpg` driver, for example via the `execution_options` argument.
While 1 and 2 seem to be possible options, I think it would be very useful if option 3 were implemented in SQLAlchemy. A timeout mechanism is already built into the asyncpg driver; access to it would just need to be added.
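For context, a minimal sketch of workarounds 1 and 2 using the SQLAlchemy asyncio API; the DSN and table name are illustrative:

```python
import asyncio

from sqlalchemy import text
from sqlalchemy.ext.asyncio import create_async_engine

# Illustrative DSN and table name.
engine = create_async_engine("postgresql+asyncpg://user:pass@localhost/db")

async def main():
    async with engine.connect() as conn:
        # Workaround 1: server-side timeout, scoped to this PostgreSQL
        # session; the value is in milliseconds.
        await conn.execute(text("SET statement_timeout = 3000"))
        await conn.execute(text("SELECT count(*) FROM big_table"))

        # Workaround 2: client-side timeout. Caveat: asyncio.wait_for()
        # cancels the awaiting task, but the server may keep executing the
        # statement, and the connection can be left in an unusable state.
        try:
            await asyncio.wait_for(
                conn.execute(text("SELECT count(*) FROM big_table")),
                timeout=3,
            )
        except asyncio.TimeoutError:
            ...  # handle the timeout

asyncio.run(main())
```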
Databases / Backends / Drivers targeted
asyncpg for PostgreSQL
Example Use
```python
try:
    row = await conn.execute(query, execution_options={'timeout': 3})
except TimeoutError:
    ...  # handle case when query took too long to execute
```
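(Assuming the option were passed straight through to asyncpg, the exception raised would be asyncio.TimeoutError, which on Python 3.11+ is an alias of the builtin TimeoutError caught above.)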
Additional context
No response
If setting it globally is ok, you can use the classic `connect_args` to set any arg that's accepted by asyncpg, documented here: https://magicstack.github.io/asyncpg/current/api/index.html#asyncpg.connection.connect. So for example:

`create_async_engine(..., connect_args={"timeout": 42})`

we could have an internal feature that takes all "drivername_xyz" options and moves them into separate dictionaries like that. We do this already for the DDL arguments like `Index(postgresql_where=…)`. It would just be an extra few dictionary operations per execute call, which is likely not worth it unless it's in Cython.
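A note on the asyncpg parameters here, per the asyncpg connect() docs linked above: `timeout` governs connection establishment, while `command_timeout` is the default timeout, in seconds, applied to each operation on the connection. So a global per-statement default would look more like this sketch (DSN illustrative):

```python
from sqlalchemy.ext.asyncio import create_async_engine

# "command_timeout" is asyncpg's default per-operation timeout in seconds;
# "timeout" would instead limit connection establishment.
engine = create_async_engine(
    "postgresql+asyncpg://user:pass@localhost/db",
    connect_args={"command_timeout": 42},
)
```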
yes.
Your reasoning makes sense. Since something like this is not already implemented, chances are that there isn't a general need for it.
can I summarize your proposal as:
instead of
?
I haven't looked at psycopg3 yet, and maybe they are an outlier, but the general idea of per-execute kwargs is not really prevalent in the drivers I've dealt with. asyncpg here only has "timeout", that's it. All other drivers I've worked with have none, ever, since it's not part of the pep-249 interface.
This is not to say there can't be some kind of namespacing like the above for delivering lots of dialect-specific options, which might be fine. But the above is not going to solve the issue of a user wanting to use some new dialect feature X without us having to change any code, because those features are usually implemented in some idiosyncratic way; a cursor.execute() kwarg is the last place such things get added, since pep-249 doesn't include it, and for asyncpg we aren't using their "cursor" concept in most cases anyway.
That is, I think "timeout", especially in an async context, is one that makes sense; but as the other drivers don't even seem to have it, it seems like we should not overthink this one and just make the feature available. There are not really expected to be any other "execute" arguments like this (unless psycopg3 is really a game changer in this area).
Let me just emphasize that we aren't going to add any "timeout" mechanisms on our end using async workers, threads, etc.; someone else had asked about that. This is strictly about timeout params that the async (or sync) drivers accept explicitly in some way on a per-statement basis.
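For reference, the driver-level mechanism in question is asyncpg's per-statement `timeout` keyword on calls such as `Connection.fetch()`; a minimal sketch using asyncpg directly (DSN and table name are illustrative):

```python
import asyncio
import asyncpg

async def main():
    conn = await asyncpg.connect("postgresql://user:pass@localhost/db")
    try:
        # fetch()/execute() take a per-statement timeout in seconds and
        # raise asyncio.TimeoutError when it elapses.
        rows = await conn.fetch("SELECT count(*) FROM big_table", timeout=3)
    except asyncio.TimeoutError:
        ...  # the statement took too long; asyncpg attempts to cancel it
    finally:
        await conn.close()

asyncio.run(main())
```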