efcore: Scaffolding takes a long time to retrieve Indexes on large databases which causes timeouts and scaffolding failure

Reverse-engineering a sufficiently large database utilizes an inefficient query to retrieve indexes which causes the reveng to fail. The query is emitted here

Please note that I have a resolution ready to PR if this issue is deemed valid and reasonable.

Steps to reproduce

- Obtain a sufficiently large database (note: not the easiest) - Attempt to scaffold the context - ??? - Timeout

Further technical details

EF Core version: 3.1.7 Database provider: Microsoft.EntityFrameworkCore.SqlServer Target framework: Net Core 3.1 Operating system: Win10-x64 IDE: VS 2019 16.7.1

The good stuff

Straight up, our database is a monolith, but I understand we’re not alone here. Here is a snapshot of where we stand - image

The query to get Indexes, in all its glory, takes over a minute. Anything over 30 seconds is unacceptable given that’s the error condition.

By locating the SQL ran by SqlServerDatabaseModelFactory.GetIndexes() through MS SQL Profiler and running it myself, the following time is taken: image

Inspecting the execution plan leads to: image

And the top-right step of the plan reveals image

Over 120 million records read for the tight loop here.

Resolution

In my mind I see three resolution paths.

Easiest

Extend the timeout of the DbCommand(s) created such that scaffolding is given a reasonable amount of time to complete (how long is reasonable?)

Easy

Refactor the query. I have already done so, with the following result, having compared the before/after result for equality. image

Medium

Refactor Microsoft.EntityFrameworkCore.Design.Internal.DatabaseOperations to use DbContextOperations to retrieve the IDesignTimeDbContextFactory<TContext> from the user.* This allows the user to set arbitrary timeouts by utilizing the DbConnection created by their context, and allow users to specify startup assemblies instead of a connection string. Note, I have tried to grock through this code within a reasonable amount of time, but to the best of my understanding it does not already take into account the user’s design-time factory (source), while MigrationsOperations does here.


If you guys feel like this is a valid enough issue, I can submit a PR that modifies the query issued by GetIndexes(), which should speed up this process immensely for larger databases while leaving smaller databases unaffected.

About this issue

  • Original URL
  • State: open
  • Created 4 years ago
  • Comments: 34 (26 by maintainers)

Most upvoted comments

Thanks for the additional informaton, sadly 40 seconds is still above the default of 30 seconds.