gqlgen: Optimized batch loading without blocking

I’m trying to understand how gqlgen can optimize batch loading lazily/concurrently, given that all resolvers are synchronous — i.e. they return the value itself, instead of a thunk (future).

I was reading through the code example for data loading to try to understand how dataloaden works. In its code, I noticed that Load(key) simply calls LoadThunk(key)(). Then I noticed this line in the readme: “This method will block for a short amount of time, waiting for any other similar requests to come in”. Is this how gqlgen assumes it will work?

If you look at the graphql-go project, all resolvers can return either values or thunks. This means the entire GraphQL tree can be resolved to thunks, which can then queue up all the keys they need so that when the thunks are called, all the objects can be fetched at once.

Can someone fill me in on how gqlgen is supposed to handle this? A resolver like this:

func (r *rootQueryResolver) Things(ctx context.Context, uids []*string, limit *int) ([]*models.Thing, error) {
	// ...
}

…cannot defer any work at all; it has to return. The only way to batch this is to execute all the resolvers concurrently in goroutines, then have some kind of queue that keeps track of when they’ve all resolved their keys — or just wait for an arbitrary interval. That seems wrong to me, and all the attendant synchronization sounds like it would be super inefficient.
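For contrast, the thunk approach described above needs no goroutines, timers, or synchronization at all; a generic illustration (not tied to any library, the batcher type and its methods are made-up names):

```go
package main

import "fmt"

type Thing struct{ UID string }

// batcher collects keys during tree traversal and fetches them in one go.
type batcher struct {
	keys    []string
	results map[string]*Thing
}

// load enqueues a key and returns a thunk to be forced after dispatch().
func (b *batcher) load(uid string) func() *Thing {
	b.keys = append(b.keys, uid)
	return func() *Thing { return b.results[uid] }
}

// dispatch fetches every queued key in a single round trip.
func (b *batcher) dispatch() {
	b.results = make(map[string]*Thing, len(b.keys))
	for _, k := range b.keys { // stand-in for one "WHERE uid IN (...)" query
		b.results[k] = &Thing{UID: k}
	}
}

func main() {
	b := &batcher{}
	// Phase 1: walk the query tree, collecting thunks.
	thunks := []func() *Thing{b.load("a"), b.load("b"), b.load("c")}
	// Phase 2: one batched fetch, then force the thunks.
	b.dispatch()
	for _, t := range thunks {
		fmt.Println(t().UID)
	}
}
```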

About this issue

  • State: open
  • Created 5 years ago
  • Reactions: 15
  • Comments: 28

Most upvoted comments

By safe I mostly mean having a single correct resolver signature that can be put into the interface.

The simplest solution (which at this point would of course break existing users) would be to always return a function, which is not particularly annoying for non-thunk loaders:

func (r *orderResolver) Items(ctx context.Context, obj *Order) (func() ([]Item, error), error) {
	return func() ([]Item, error) {
		// ...
	}, nil
}

When a thunk panics, the stack trace comes from whatever is dispatching thunks; when a goroutine panics, the stack is what you expect: each resolver that led to the error.

Yet both will have a stack trace pointing within gqlgen’s resolving hierarchy. I honestly don’t see the difference, but maybe I’m not seeing something.

That said, if you are hitting performance issues and have some benchmarks to share…

Well, I’m merely evaluating gqlgen at this point. I’ve started writing an app that uses graphql-go/graphql, but I like the “schema first” approach and code generation, since it means both the GraphQL code and the Go types can be generated at the same time. I’m not overly concerned with performance for this app, but I don’t like the added complexity: using goroutines instead of thunks adds a lot of synchronization at the data-loading level. I’ve already written my batch loading using the graph-gophers/dataloader library, which I’m quite happy with. It’s simple and easy to understand.

Having looked at a few of these GraphQL libraries, I’m of the opinion that a server-side GraphQL implementation like gqlgen should absolutely have batch loading built in, because it is in the nature of GraphQL to express many fetches in a single query. An app that doesn’t need batch loading can choose not to batch-load, but those will be in the minority.

Not stale.

It’s been some time since this has been discussed. I would very much like this feature added.

@vektah Any ideas on how to handle this? It’s becoming a growing concern for us.

Hi, I’m evaluating gqlgen as well for our internal use.

notifying dataloaden when all goroutines have been dispatched and the scheduler has had a chance to run. This is enough to avoid the sleep.

I agree, and this is what other implementations that do not use promises (e.g. PHP) are doing.

By safe I mostly mean having a single correct resolver signature that can be put into the interface. Go’s type system isn’t powerful enough to allow multiple different return types and maintain type safety.

Since gqlgen itself already requires a code generation step, if the data-loading mechanism (dataloaden) can also be generated at the same time (possibly in the same configuration file, etc.), then this should be less of an issue.

Having looked at a few of these GraphQL libraries, I’m of the opinion that a server-side GraphQL implementation like gqlgen should absolutely have batch loading built in, because it is in the nature of GraphQL to express many fetches in a single query.

+1. This is also what the PHP implementation above does, as well as other solid implementations like Sangria do.

the stack traces if something goes wrong gets exceedingly messy

How so? With the proposed solution, fetching a field that is marked as batched would simply be divided into two phases - fetching the IDs, and fetching the data using the IDs. If one of those steps goes wrong, you would get the exact same stack traces as if you had a single resolver.
Just to clarify what this would look like from the library users’ perspective:

  1. Mark a field as batched in gqlgen.yml
models:
  Foo:
    model: ...
    batch:
        # Defining both model and type allows
        # us to define custom ID types.
        - field: field1
          model: string
        - field: field2
          model: github.com/99designs/gqlgen/graphql.Int64
  2. Two resolvers are generated for you instead of one:
type FooResolver interface {
  // The ID resolver lets us return only the ID type we defined, and defer
  // loading of the actual data.
  Field1ID(ctx context.Context, obj *model.Foo) (string, error)
  Field2ID(ctx context.Context, obj *model.Foo) (graphql.Int64, error)

  // The second part is the data resolver, which hands us IDs and expects
  // us to resolve them to actual entities.
  Field1Data(ctx context.Context, ids []string) (map[string]*model.Field1, error)
  Field2Data(ctx context.Context, ids []graphql.Int64) (map[graphql.Int64]*model.Field2, error)
}
  3. Now, for each entity where a batch resolver is present, all requests for one “level” will automatically be collected and only resolved once.

Am I missing something?

The only way to batch this is to execute all the resolvers concurrently in goroutines [… and] just wait for an arbitrary interval.

This is what gqlgen does. We eliminate a bunch of places where creating goroutines is unnecessary (binding to fields directly, or methods that don’t take a context), and goroutines are cheap enough (benchmarks). There are still a few optimization targets here.

There are optimizations we can make on both sides:

  • a. pooling goroutines to avoid morestack (see https://github.com/99designs/gqlgen/pull/465)
  • b. notifying dataloaden when all goroutines have been dispatched and the scheduler has had a chance to run. This is enough to avoid the sleep.
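Option (b) can be sketched with a counter instead of a sleep: the executor tells the loader how many loads the level will dispatch, and the last arrival triggers the flush. This is a rough illustration, not dataloaden’s or gqlgen’s actual API:

```go
package main

import (
	"fmt"
	"sync"
)

// loader batches keys and flushes once the expected number of loads arrive,
// instead of sleeping for a fixed window.
type loader struct {
	mu       sync.Mutex
	expected int
	keys     []string
	flushed  chan struct{}
	results  map[string]string
}

func newLoader(expected int) *loader {
	return &loader{expected: expected, flushed: make(chan struct{})}
}

// Load queues a key and blocks until the whole level has been dispatched.
func (l *loader) Load(key string) string {
	l.mu.Lock()
	l.keys = append(l.keys, key)
	if len(l.keys) == l.expected { // last dispatched load triggers the flush
		l.results = make(map[string]string, len(l.keys))
		for _, k := range l.keys { // stand-in for one batched fetch
			l.results[k] = "value-" + k
		}
		close(l.flushed) // release every waiting Load call
	}
	l.mu.Unlock()
	<-l.flushed
	return l.results[key]
}

func main() {
	l := newLoader(3)
	var wg sync.WaitGroup
	for _, k := range []string{"a", "b", "c"} {
		wg.Add(1)
		go func(k string) {
			defer wg.Done()
			fmt.Println(l.Load(k))
		}(k)
	}
	wg.Wait()
}
```

A real executor knows how many resolver goroutines it spawned for a level, which is exactly the count this sketch relies on.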

The overarching goal for gqlgen is clean, readable, safe code. I don’t think trading clean stack traces and return values for a few extra milliseconds (way less if b is actioned) is worth it.