registry: Registry server: slow listing of artifacts under revisioned resources in large collections.

Quoting a benchmark result listed in #1068, listing of spec artifacts can be significantly slowed when there are large numbers of specs:

BenchmarkListArtifacts/ListApiArtifacts-16         	       3	  435709811 ns/op
BenchmarkListArtifacts/ListVersionArtifacts-16     	       3	  455032674 ns/op
BenchmarkListArtifacts/ListSpecArtifacts-16        	       1	72584223258 ns/op
BenchmarkListArtifacts/ListDeploymentArtifacts-16  	       1	17589723714 ns/op

This is with the N=1000 test set running with a server that already has a large collection of API specs (~13000).

The impact of this is that it is difficult to get or delete spec artifacts across a collection, eg. registry get apis/-/versions/-/specs/-/artifacts/summary -o yaml or registry delete apis/-/versions/-/specs/-/artifacts/vocabulary. Both of these are uncomfortably slow for large registries.

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 15 (15 by maintainers)

Most upvoted comments

You’ve added api_id to deployments and specs… I wonder if adding api_id to artifacts as well could be useful as a pretty common join? Also, since we often sort on api_id - I wonder if adding sort to our index definitions might be useful.

Also, let’s talk about (or note in a new issue) how adding more column indexes would help updates - I think updates are using primary keys (the resource names) that are already indexed, but that could be missing something.

I think we’re good. I wasn’t thinking about the name being used as primary key.

Also, can you show me (or write a note about) how to use explain, etc to evaluate these queries? I’m guessing that it invovles (but I haven’t figured out how to) configuring gorm to write the queries that it’s making and then running them directly against Postgres.

I believe you can just set our debug to trace level to get the SQL. Then, run the explain against postgres using pgadmin or whatever tool.

This is with just api_id indexes on specs and deployments.

This first run was after I had created the indexes and run for a while, but I deleted the api_id index on specs and benchmarked. Without the api_id index on specs, the spec listing was clearly slow:

BenchmarkListArtifacts/ListApiArtifacts-16         	       2	 587547180 ns/op
BenchmarkListArtifacts/ListVersionArtifacts-16     	       2	 551491554 ns/op
BenchmarkListArtifacts/ListSpecArtifacts-16        	       1	42355723862 ns/op
BenchmarkListArtifacts/ListDeploymentArtifacts-16  	       2	 560716006 ns/op

Then I recreated the index with registry rpc admin migrate-database --follow. This returned immediately, but I think the index might have still been building. The immediate next benchmark was still slow, but the following two were good.

BenchmarkListArtifacts/ListApiArtifacts-16         	       2	 581270234 ns/op
BenchmarkListArtifacts/ListVersionArtifacts-16     	       2	 569786582 ns/op
BenchmarkListArtifacts/ListSpecArtifacts-16        	       1	28441528306 ns/op
BenchmarkListArtifacts/ListDeploymentArtifacts-16  	       2	 547090102 ns/op
BenchmarkListArtifacts/ListApiArtifacts-16         	       2	 530105451 ns/op
BenchmarkListArtifacts/ListVersionArtifacts-16     	       2	 841461730 ns/op
BenchmarkListArtifacts/ListSpecArtifacts-16        	       2	 590193080 ns/op
BenchmarkListArtifacts/ListDeploymentArtifacts-16  	       2	 556028792 ns/op
BenchmarkListArtifacts/ListApiArtifacts-16         	       2	 648811879 ns/op
BenchmarkListArtifacts/ListVersionArtifacts-16     	       3	 466749284 ns/op
BenchmarkListArtifacts/ListSpecArtifacts-16        	       2	 615581775 ns/op
BenchmarkListArtifacts/ListDeploymentArtifacts-16  	       2	 557018492 ns/op

This was with 8800 specs in the database.