graphql-platform: Overfetching problem when querying related entities and using intermediary projections

Following https://github.com/ChilliCream/hotchocolate/issues/2365#issuecomment-697727508 I have Book and Author domain classes + BookDto and AuthorDto classes, I want to expose an IQueryable<BookDto> for the books field on my query root type. So here is the GetBooks method on the Query class: It simply projects Book into BookDto and returns an IQueryable<BookDto> so that Hot Chocolate could then do its magic on it.

[UseSelection]
public IQueryable<BookDto> GetBooks([Service]AppDbContext dbContext)
{
    return dbContext.Books.Select(book => new BookDto
    {
        Name = book.Name,
        ...
        Author = new AuthorDto
        {
            Id = book.Author.Id,
            Name = book.Author.Name,
        }
    });
}

Now, when you run the following query:

{
  books {
    name
    author {
      name
    }
  }
}

You would probably expect the following SQL to ultimately get generated:

SELECT [b].[Name], [a].[Name]
      FROM [Books] AS [b]
      INNER JOIN [Authors] AS [a] ON [b].[AuthorId] = [a].[Id]

However, instead, Hot Chocolate causes this one to get generated by EF Core:

SELECT [b].[Name], [a].[Id], [a].[Name], [a].[Age]
      FROM [Books] AS [b]
      INNER JOIN [Authors] AS [a] ON [b].[AuthorId] = [a].[Id]

Which retrieves all the fields on the author (including Age and Id in this case), even though the query requested only the Name.

At first, I thought this must be an EF Core issue, but then I executed the following code:

var queryableDto = dbContext.Books.Select(book => new BookDto
{
    Name = book.Name,
    Author = new AuthorDto
    {
        Id = book.Author.Id,
        Name = book.Author.Name,
        Age = book.Author.Age
    }
});
// The code that you would expect Hot Chocolate will generate for the above GraphQL query:
queryableDto.Select(bDto => new
{
    Name = bDto.Name,
    Author = new
    {
        Name = bDto.Author.Name,
    }
}).ToList();

And I realized it works as expected, and generates the following SQL which only selects the requested column (author name in this case):

SELECT [b].[Name], [a].[Name]
      FROM [Books] AS [b]
      INNER JOIN [Authors] AS [a] ON [b].[AuthorId] = [a].[Id]

Therefore, this must be related to Hot Chocolate, I believe. This is really a serious problem for situations like mine when we don’t want to expose IQueryable<TheEntityFrameworkModelClass> but rather a different type.

I’m using Hot Chocolate version 10.5.2 And I’ve tried this with both EF Core version 3.1.8 and 5.0.0 RC.

About this issue

  • Original URL
  • State: open
  • Created 4 years ago
  • Reactions: 4
  • Comments: 67 (51 by maintainers)

Commits related to this issue

Most upvoted comments

@PascalSenn @michaelstaib has there been any progress on this issue?

It looks like there’s still an issue on deeply nested entities.

[UseProjection]
public IQueryable<BookDto> GetBooks([Service]AppDbContext dbContext)
{
    return dbContext.Books.Select(book => new BookDto
    {
        Name = book.Name,
        ...
        Author = new AuthorDto
        {
            Id = book.Author.Id,
            Name = book.Author.Name,
            Address = new AddressDto
            {
                Street = book.Author.Address.Street
            }
        }
    });
}

where the Address is still joined when we only query like so

{
  books {
    name
    author {
      name
    }
  }
}

@michaelstaib any plans to get this one on the roadmap now that the v13 is live? 😃

Slack channel is a good place … we can have a chat together with pascal and charter a way forward.

@dotarj See https://gist.github.com/aradalvand/9b70e8bb455f5398affba610f2a25625

You would then need to do .AsTranslatable() before returning your IQueryable from your resolvers.

Hi @kresimirlesic, see the latest dicussion at https://github.com/ChilliCream/graphql-platform/issues/6002 and specifically this comment of mine. I managed to build a plugin-esque thing that accomplishes all of this, which I’m actually really satisfied with — you can check out the prototype here.

Everything you mentioned about computed fields and ignoring sensitive members is already solved by Hot Chocolate’s type system. You can easily add new computed fields using Type Extensions and hide members from your entities using various APIs in Hot Chocolate. This does not require intermediary objects that need to be involved in the projection.

@tobias-tengler Regarding ignores, you’re right. But computed fields, not really. Because with type extensions, the new fields that you write are not added to the projection expression, which means they would result in additional SQL roundtrips, a significant disadvantage, obviously, especially if you have a relatively large number of computed fields. With DTOs and intermediary projections, however, you won’t have this problem.

There’s also the issue of developer ergonomics, I would argue it’s cleaner to have classes that directly correspond to your resulting GraphQL types. Again, with DTOs, you’ll get this benefit as well. But with type extensions and ignores, this becomes a bit more complex to look at and harder to immediately parse in your head. But this point, of course, can be subjective.

Tell me if I’m missing something.

The problem here is that we do not know the key. The key could also be e.Author.SomeForeignKeyId

@PascalSenn I think a simple method on IObjectFieldDescriptor would do:

descriptor.Field(o => o.Id)
    .Key();

Whenever the key is specified like this, Hot Chocolate projection engine would do the null checks on this property, entirely solving this problem. What do you think?

Everything you mentioned about computed fields and ignoring sensitive members is already solved by Hot Chocolate’s type system. You can easily add new computed fields using Type Extensions and hide members from your entities using various APIs in Hot Chocolate. This does not require intermediary objects that need to be involved in the projection.

@PascalSenn

The reason why this check is there, is to ensure correct nullability in the projection. If this check is not there the projection will fail if you run the query in memory.

Sure, I understand the reason for that check. I’m not saying it should be removed.

It is actually not. Most of the time you do not need these intermediary objects. Normally the Types are your DTO.

I beg to differ, Pascal. The cleanness of the underlying entity classes has nothing to do with whether or not you would need intermediary objects. Consider the following entity structure:

public class Product
{
    public int Id { get; set; }
    public string Title { get; set; }
    public ICollection<ProductRating> Ratings { get; set; }
}

public class ProductRating
{
    public int ProductId { get; set; }
    public Product Product { get; set; }
    public byte Rating { get; set; }
}

In this case, you might want the type Product in your GraphQL schema to have a field called averageRating. You don’t have such a field on your entity classes and you shouldn’t, as this is a “computed” field.

This is where intermediary projections would be necessary:

dbContext.Products.Select(p => new ProductDto
{
    AverageRating = p.Ratings.Average(r => r.Rating),
});

Apart from computed fields, there’s also a lot of low-level details in your entity structure that you would most likely want to hide from the user of your GraphQL API. An example of that would be foreign key properties like ProductId, etc. You don’t want these to appear in your GraphQL schema.

So, wouldn’t you agree with me on this?

Back to the overfetching issue as a result of the null check in the projection: What I’ve personally done is I’ve written an expression visitor that replaces every occurrence of those null checks, with null a check on the ID of the object. So it turns the following (which is what HC generates):

Author = e.Author == null ? default(AuthorDto) : new AuthorDto() { ... }

into:

Author = e.Author.Id == null ? default(AuthorDto) : new AuthorDto() { ... }

With this, the overfetching issue is sovled and EF Core generates exactly the right SQL.

I am getting the same issue where it is fetching all the columns in child instead of what is selected.

query (im using serial execution)

public class Query
{
    [UseProjection]
    public IQueryable<PersonDto> GetPersons([Service] SomeDbContext dbContext)
    {
        return dbContext.Persons.Select(person => new PersonDto
        {
            Id = person.Id,
            FirstName = person.FirstName,
            LastName = person.LastName,
            Guardian = person.Guardian != null ? new GuardianDto
            {
                Id = person.Guardian.Id,
                FirstName = person.Guardian.FirstName,
                LastName = person.Guardian.LastName
            } : default
        });
    }
}

dtos (objects)

public class PersonDto
{
    public int Id { get; set; }
    public string FirstName { get; set; } = null!;
    public string LastName { get; set; } = null!;
    public GuardianDto? Guardian { get; set; } = null!;
}

public class GuardianDto
{
    public int Id { get; set; }
    public string FirstName { get; set; } = null!;
    public string LastName { get; set; } = null!;
}

selections

query {
  persons {
    id
    firstName
    guardian {
      id
      firstName
    }
  }
}

sql

SELECT "p"."Id", "p"."FirstName", "g"."Id" IS NOT NULL, "g"."Id", "g"."FirstName", "g"."LastName"
      FROM "Persons" AS "p"
      LEFT JOIN "Guardians" AS "g" ON "p"."GuardianId" = "g"."Id"

here’s a repo if you need the fullest context: https://github.com/nollidnosnhoj/testchocolate

EDIT: another thing interesting is when i did this with a collection of children dtos, it projected correctly.