mlflow: [BUG] No index on foreign keys, in postgres store
MLflow Roadmap Item
This is an MLflow Roadmap item that has been prioritized by the MLflow maintainers. We’re seeking help with the implementation of roadmap items tagged with the help wanted label.
For requirements clarifications and implementation questions, or to request a PR review, please tag @WeichenXu123 in your communications related to this issue.
System information
- Linux Ubuntu 18.04
- MLflow installed from pip
- MLFlow version 1.11:
- Python version 3.8.3
Describe the problem
This might not be true for other dialects, but Postgres doesn’t automatically index on FKs. It makes something as simple as querying all run details extremely slow, as the number of experiments / runs (the UI only allows to fetch results 100 at time).
Code to reproduce issue
The following has about 10k rows but essentially times out.
SELECT *
FROM runs
JOIN params on params.run_uuid = runs.run_uuid
JOIN metrics on metrics.run_uuid = params.run_uuid
WHERE runs.experiment_id = 30
Other info / logs
Query to get all the missing FKs (credit)
SELECT c.conrelid::regclass AS "table",
/* list of key column names in order */
string_agg(a.attname, ',' ORDER BY x.n) AS columns,
pg_catalog.pg_size_pretty(
pg_catalog.pg_relation_size(c.conrelid)
) AS size,
c.conname AS constraint,
c.confrelid::regclass AS referenced_table
FROM pg_catalog.pg_constraint c
/* enumerated key column numbers per foreign key */
CROSS JOIN LATERAL
unnest(c.conkey) WITH ORDINALITY AS x(attnum, n)
/* name for each key column */
JOIN pg_catalog.pg_attribute a
ON a.attnum = x.attnum
AND a.attrelid = c.conrelid
WHERE NOT EXISTS
/* is there a matching index for the constraint? */
(SELECT 1 FROM pg_catalog.pg_index i
WHERE i.indrelid = c.conrelid
/* the first index columns must be the same as the
key columns, but order doesn't matter */
AND (i.indkey::smallint[])[0:cardinality(c.conkey)-1]
OPERATOR(pg_catalog.@>) c.conkey)
AND c.contype = 'f'
GROUP BY c.conrelid, c.conname, c.confrelid
ORDER BY pg_catalog.pg_relation_size(c.conrelid) DESC;
What component(s), interfaces, languages, and integrations does this bug affect?
Components
-
area/artifacts: Artifact stores and artifact logging -
area/build: Build and test infrastructure for MLflow -
area/docs: MLflow documentation pages -
area/examples: Example code -
area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry -
area/models: MLmodel format, model serialization/deserialization, flavors -
area/projects: MLproject format, project running backends -
area/scoring: Local serving, model deployment tools, spark UDFs -
area/server-infra: MLflow server, JavaScript dev server - [ x]
area/tracking: Tracking Service, tracking client APIs, autologging
Interface
-
area/uiux: Front-end, user experience, JavaScript, plotting -
area/docker: Docker use across MLflow’s components, such as MLflow Projects and MLflow Models -
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry -
area/windows: Windows support
Language
-
language/r: R APIs and clients -
language/java: Java APIs and clients -
language/new: Proposals for new client languages
Integrations
-
integrations/azure: Azure and Azure ML integrations -
integrations/sagemaker: SageMaker integrations -
integrations/databricks: Databricks integrations
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 1
- Comments: 24 (11 by maintainers)
There is not much difference here, since the primary key’s index is used (cf. details below).
both
btree
hash tree
none
table schema