efcore: Eager Loading Many to Many Relationship Doesn't Use the Index
File a bug
Okay, so I’m using SQLite. That might matter because DBMS’s are pointlessly different just for the sake of it.
I have a many to many relationship declared using List<Entity> in each class. That generally works, but I know it’s newer.
EF Core generates a query that SQLite doesn’t use an index search for, and instead it uses a scan, which is slow.
Include your code
public class Image
{
public int ImageId { get; set; }
public int Width { get; set; }
public int Height { get; set; }
public virtual List<ImageTag> Tags { get; set; }
}
public class ImageTag
{
public int ImageTagId { get; set; }
public string Name { get; set; }
public string Description { get; set; }
public string Type { get; set; }
[IgnoreDataMember]
public virtual List<Image> Images { get; set; }
}
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
// nullability
modelBuilder.Entity<Image>().Property(a => a.Width).IsRequired();
modelBuilder.Entity<Image>().Property(a => a.Height).IsRequired();
modelBuilder.Entity<ImageTag>().Property(a => a.Name).IsRequired();
// indexes
modelBuilder.Entity<Image>().HasIndex(a => a.Width);
modelBuilder.Entity<Image>().HasIndex(a => a.Height);
modelBuilder.Entity<ImageTag>().HasIndex(a => a.Name).IsUnique();
modelBuilder.Entity<ImageTag>().HasIndex(a => a.Type);
}
Query
var namesToLookup = new List<string> { "a", "b", "c" };
var existingTags = await Set<ImageTag>().Include(a => a.Images)
.Where(a => namesToLookup.Contains(a.Name)).ToListAsync(token);
So that generates a query like so:
SELECT "i"."ImageTagId", "i"."Description", "i"."Name", "i"."Type", "t"."ImagesImageId", "t"."TagsImageTagId", "t"."ImageId", "t"."Height", "t"."Width"
FROM "ImageTags" AS "i"
LEFT JOIN (
SELECT "i0"."ImagesImageId", "i0"."TagsImageTagId", "i1"."ImageId", "i1"."Height", "i1"."Width"
FROM "ImageImageTag" AS "i0"
INNER JOIN "Images" AS "i1" ON "i0"."ImagesImageId" = "i1"."ImageId"
) AS "t" ON "i"."ImageTagId" = "t"."TagsImageTagId"
WHERE "i"."Name" IN ('a', 'b', 'c')
ORDER BY "i"."ImageTagId", "t"."ImagesImageId", "t"."TagsImageTagId", "t"."ImageId"
Which explains to this
MATERIALIZE 1
SCAN TABLE ImageImageTag AS i0
SEARCH TABLE Images AS i1 USING INTEGER PRIMARY KEY (rowid=?)
SEARCH TABLE ImageTags AS i USING INDEX IX_ImageTags_Name (Name=?)
SEARCH SUBQUERY 1 AS t USING AUTOMATIC COVERING INDEX (TagsImageTagId=?)
USE TEMP B-TREE FOR ORDER BY
And because of the SCAN TABLE, it’s really slow, like minutes, which would be okay if it wasn’t a really easy fix.
SELECT "i"."ImageTagId", "i"."Description", "i"."Name", "i"."Type", "i0"."ImagesImageId", "i0"."TagsImageTagId", "i1"."ImageId", "i1"."Height", "i1"."Width"
FROM "ImageTags" AS "i"
INNER JOIN "ImageImageTag" AS "i0" ON "i0"."TagsImageTagId" = "i"."ImageTagId"
INNER JOIN "Images" AS "i1" ON "i0"."ImagesImageId" = "i1"."ImageId"
WHERE "i"."Name" IN ('a', 'b', 'c')
ORDER BY "i"."ImageTagId", "i0"."ImagesImageId", "i0"."TagsImageTagId", "i1"."ImageId"
Explains to
SEARCH TABLE ImageTags AS i USING INDEX IX_ImageTags_Name (Name=?)
SEARCH TABLE ImageImageTag AS i0 USING INDEX IX_ImageImageTag_TagsImageTagId (TagsImageTagId=?)
SEARCH TABLE Images AS i1 USING INTEGER PRIMARY KEY (rowid=?)
USE TEMP B-TREE FOR ORDER BY
That’s fast. SQLite Studio just says [00:28:36] Query finished in 0.000 second(s).
So this issue is twofold.
- Is it like that for a (good) reason?
- Is there a workaround that doesn’t require me to manually define my intermediate tables or execute raw sql?
I have a several hundred gig database already built, so I’m not running a migration to define the tables myself to then try to manually build a better query in LINQ. I’m not running raw sql because that defeats the purpose. If I was willing to hardcode sql, I wouldn’t be using an ORM with so many known quirks. No offense, it’s still by far the best option.
Sorry for not following the template exactly, but I’ve seen this reproduced in other issues. They just didn’t complain about it, which makes me think it may be an SQLite <-> EFCore quirk. It’s not a specific bug, but more of a logic issue.
Include provider and version information
EF Core version: 6.0.0-preview4… System.Data.Sqlite.Core is the same version Database provider: SQLite Target framework: net5.0
I would be willing to use net6.0, but this app runs on linux, and installing the net6.0 preview tools for linux is still a pita.
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 19 (8 by maintainers)
This is something out-of-scope for EF to do automatically as determining whether this optimization would be appropriate for a given query and model is not trivial and might depend on the data and user’s intent.