registry: Registry server: slow listing of artifacts under revisioned resources in large collections.

Quoting a benchmark result listed in #1068, listing of spec artifacts can be significantly slowed when there are large numbers of specs:

BenchmarkListArtifacts/ListApiArtifacts-16         	       3	  435709811 ns/op
BenchmarkListArtifacts/ListVersionArtifacts-16     	       3	  455032674 ns/op
BenchmarkListArtifacts/ListSpecArtifacts-16        	       1	72584223258 ns/op
BenchmarkListArtifacts/ListDeploymentArtifacts-16  	       1	17589723714 ns/op

This is with the N=1000 test set running with a server that already has a large collection of API specs (~13000).

The impact of this is that it is difficult to get or delete spec artifacts across a collection, eg. registry get apis/-/versions/-/specs/-/artifacts/summary -o yaml or registry delete apis/-/versions/-/specs/-/artifacts/vocabulary. Both of these are uncomfortably slow for large registries.

About this issue

Original URL
State: closed
Created a year ago
Comments: 15 (15 by maintainers)

Most upvoted comments

You’ve added api_id to deployments and specs… I wonder if adding api_id to artifacts as well could be useful as a pretty common join? Also, since we often sort on api_id - I wonder if adding sort to our index definitions might be useful.

theganyo on Mar 7, 2023

Also, let’s talk about (or note in a new issue) how adding more column indexes would help updates - I think updates are using primary keys (the resource names) that are already indexed, but that could be missing something.

I think we’re good. I wasn’t thinking about the name being used as primary key.

Also, can you show me (or write a note about) how to use explain, etc to evaluate these queries? I’m guessing that it invovles (but I haven’t figured out how to) configuring gorm to write the queries that it’s making and then running them directly against Postgres.

I believe you can just set our debug to trace level to get the SQL. Then, run the explain against postgres using pgadmin or whatever tool.

theganyo on Mar 6, 2023

This is with just api_id indexes on specs and deployments.

This first run was after I had created the indexes and run for a while, but I deleted the api_id index on specs and benchmarked. Without the api_id index on specs, the spec listing was clearly slow:

BenchmarkListArtifacts/ListApiArtifacts-16         	       2	 587547180 ns/op
BenchmarkListArtifacts/ListVersionArtifacts-16     	       2	 551491554 ns/op
BenchmarkListArtifacts/ListSpecArtifacts-16        	       1	42355723862 ns/op
BenchmarkListArtifacts/ListDeploymentArtifacts-16  	       2	 560716006 ns/op

Then I recreated the index with registry rpc admin migrate-database --follow. This returned immediately, but I think the index might have still been building. The immediate next benchmark was still slow, but the following two were good.

BenchmarkListArtifacts/ListApiArtifacts-16         	       2	 581270234 ns/op
BenchmarkListArtifacts/ListVersionArtifacts-16     	       2	 569786582 ns/op
BenchmarkListArtifacts/ListSpecArtifacts-16        	       1	28441528306 ns/op
BenchmarkListArtifacts/ListDeploymentArtifacts-16  	       2	 547090102 ns/op

BenchmarkListArtifacts/ListApiArtifacts-16         	       2	 530105451 ns/op
BenchmarkListArtifacts/ListVersionArtifacts-16     	       2	 841461730 ns/op
BenchmarkListArtifacts/ListSpecArtifacts-16        	       2	 590193080 ns/op
BenchmarkListArtifacts/ListDeploymentArtifacts-16  	       2	 556028792 ns/op

BenchmarkListArtifacts/ListApiArtifacts-16         	       2	 648811879 ns/op
BenchmarkListArtifacts/ListVersionArtifacts-16     	       3	 466749284 ns/op
BenchmarkListArtifacts/ListSpecArtifacts-16        	       2	 615581775 ns/op
BenchmarkListArtifacts/ListDeploymentArtifacts-16  	       2	 557018492 ns/op

This was with 8800 specs in the database.

timburks on Mar 6, 2023