asyncpg: Slow introspection when using multiple custom types in a query
- asyncpg version: 0.20.1
- PostgreSQL version: 12.1
- Do you use a PostgreSQL SaaS? If so, which? Can you reproduce the issue with a local PostgreSQL install?: postgresql is installed locally on archlinux
- Python version: 3.7.6 and 3.8.1
- Platform: Linux 5.4.13-arch1-1
- Do you use pgbouncer?: no
- Did you install asyncpg with pip?: yes
- If you built asyncpg locally, which version of Cython did you use?: N/A
- Can the issue be reproduced under both asyncio and uvloop?: yes
I have a few custom types (CREATE TYPE ...
) in my database, and I’m running into issues where asyncpg’s introspection stage on queries using 2 or more of these types are taking > 1 second to complete.
e.g.
add log_min_duration_statement = 500
to the default postgresql.conf
Create a database test
with schema:
CREATE TYPE FOURBIGINTS AS (i0 BIGINT, i1 BIGINT, i2 BIGINT, i3 BIGINT);
CREATE TYPE NUMBERS_AS_WORDS AS ENUM (
'zero', 'one', 'two', 'three', 'four'
);
CREATE TABLE bigthings (
thing_id BIGSERIAL PRIMARY KEY,
num FOURBIGINTS,
words NUMBERS_AS_WORDS
);
and run the following code:
import asyncio
import asyncpg
async def main():
con_args = {
'dsn': 'postgres:///test',
'ssl': None,
'min_size': 10,
'max_size': 10
}
pool = await asyncpg.create_pool(**con_args)
async with pool.acquire() as con:
await con.execute(
"""\
INSERT INTO bigthings
(num, words)
VALUES ($1::FOURBIGINTS, $2::NUMBERS_AS_WORDS)
""",
(0, 0, 0, 0), 'one')
asyncio.run(main())
watching the logs (on my system sudo journalctl -u postgresql -f
, will show something like:
2020-01-21 14:45:49.066 CET [118544] LOG: duration: 1593.511 ms execute __asyncpg_stmt_2__: WITH RECURSIVE typeinfo_tree(
oid, ns, name, kind, basetype, has_bin_io, elemtype, elemdelim,
range_subtype, elem_has_bin_io, attrtypoids, attrnames, depth)
AS (
SELECT
...
2020-01-21 14:45:49.066 CET [118544] DETAIL: parameters: $1 = '{16584,16582}'
I’ve traced this back to the call to _introspect_types
in connection.py.
From a bit of testing, it only happens if there are multiple custom types used in the query. e.g. if i change the query to simply be INSERT INTO bigthings (num) VALUES ($1::FOURBIGINTS)
, then everything is nice and fast as expected, or if i change the bigthings.words
column to a VARCHAR
, then there is no problem. But as soon as I include two or more custom types (e.g. 2 enum types or 2 tuple types, or a mix) then I see the slow downs.
Is there anything I can do to either remove the need for this introspection (e.g. giving asyncpg some hints about these types), or maybe I’m doing something wrong that I can correct?
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 8
- Comments: 21 (6 by maintainers)
Commits related to this issue
- fix: excessive DB connection establishment delay and optimize GQL queries * refs MagicStack/asyncpg#530: apply "jit: off" option to DB connections - It is specified in `ai.backend.manager.models.ba... — committed to lablup/backend.ai-manager by achimnol 3 years ago
- feat: Upgrade SQLAlchemy v1.4 for native asyncio support (#406) * fix: a long-standing transaction error - It is now reproducible reliably with the new SQLAlchemy + asyncpg combination! - Also ... — committed to lablup/backend.ai-manager by achimnol 3 years ago
- feat: Upgrade SQLAlchemy v1.4 for native asyncio support (#406) * fix: a long-standing transaction error - It is now reproducible reliably with the new SQLAlchemy + asyncpg combination! - Also fi... — committed to lablup/backend.ai-manager by achimnol 3 years ago
- db: Add initial asyncio support In order to alleviate the serialisation caused by the single-threaded REST API frontend server we switch the respective database functions to asyncio. This allows the ... — committed to michaelweiser/PeekabooAV by michaelweiser 3 years ago
- db: Add initial asyncio support In order to alleviate the serialisation caused by the single-threaded REST API frontend server we switch the respective database functions to asyncio. This allows the ... — committed to michaelweiser/PeekabooAV by michaelweiser 3 years ago
- db: Add initial asyncio support In order to alleviate the serialisation caused by the single-threaded REST API frontend server we switch the respective database functions to asyncio. This allows the ... — committed to michaelweiser/PeekabooAV by michaelweiser 3 years ago
- db: Add initial asyncio support In order to alleviate the serialisation caused by the single-threaded REST API frontend server we switch the respective database functions to asyncio. This allows the ... — committed to michaelweiser/PeekabooAV by michaelweiser 3 years ago
- db: Add initial asyncio support In order to alleviate the serialisation caused by the single-threaded REST API frontend server we switch the respective database functions to asyncio. This allows the ... — committed to michaelweiser/PeekabooAV by michaelweiser 3 years ago
- db: Add initial asyncio support In order to alleviate the serialisation caused by the single-threaded REST API frontend server we switch the respective database functions to asyncio. This allows the ... — committed to scVENUS/PeekabooAV by michaelweiser 3 years ago
- Disable JIT while doing type introspection The misapplication of JIT to asyncpg introspection queries has been a constant source of user complaints. Closes: #530 Closes: #1078 Previously: #875, #794... — committed to MagicStack/asyncpg by elprans 3 years ago
- Disable JIT while doing type introspection (#1082) The misapplication of JIT to asyncpg introspection queries has been a constant source of user complaints. Closes: #530 Closes: #1078 Previousl... — committed to MagicStack/asyncpg by elprans 9 months ago
To turn of the jit with sqlalchemy 1.4.0b asyncpg …
Thanks @elprans that does seem to be the issue.
adding
jit = off
to thepostgresql.conf
made everything fast again.Since I can’t change the
postgresql.conf
in my deployed environment right now, I’ve worked around the issue in my own code for now by addingjit: off
to theserver_settings
when i create my pool:This is most likely the PostgreSQL’s new JIT that is slow on that query. Try turning it off (
jit = off
).I have a WIP patch to automatically disable JIT for the introspection query.
Yes, via
execute()
and without any arguments. Not sure if useful in your use case.Postgres does not send enough information to describe query output at the protocol level. In order to decode complex data types, like composite types, ranges, arrays, etc,
asyncpg
needs to introspect those types (recursively) to determine what they are. Normally, the introspection is done once per connection, so if you use a pool you only pay the cost once.I worked around this problem by registering all enums on connection creation:
Is it possible to perform introspection manually? Without waiting for the query to be performed? Right now we need to manually execute the query in order to cache introspection.
jit = off
does help with better performance of the introspection queries, but still there’s a significant delay for initial query.No,
introspect_type
is a very simple query and is not affected by the JIT pathology.or if you’re using GINO