quarkus: SmallRye GraphQL source fields metrics not working

Describe the bug

Good day to You!

I try use GraphQL and achieve Micrometer metrics for my endpoints. I read Docs and see base metrics implementation in SmallRye GraphQL repository.

But, when i test my sandbox project metrics and events (both EventingService implementation and Observes) behaviour looks strange.

Please, could you help me explain and resolve 3 things?

  1. Metrics for @Source fields always 0 it is add on startup with other operations, but then don’t calculate because afterDataFetch observer method never execute for Source fields;
  2. I create batch @Source operation and in code this method execute once, but in same time beforeDataFetch execute lots of times for each record in my batch list;
  3. Because afterDataFetch not execute for @Source fields context not removed from map (for metrics calculate) (Long startTime = startTimes.remove(context);) and map increment size.

Maybe i have a wrong view, but do we have a way for calculate execution metrics for source fields in same event base way like for general operations?

I try this in both default (Blocking) and NonBlocking way.

Expected behavior

I execute query 1:

query findHeroById {
  selectedHero: hero(heroId: "100") {
    __typename
    id
    name
    slug
    stats: statsInfo {
      __typename
      intelligence
      power
      combat
      speed
      strength
      durability
    }
  }
}

and query 2:

query heroesList {
  heroes: heroCollection {
    __typename
    id
    name
    slug
    stats: statsInfo {
      __typename
      intelligence
      power
      combat
      speed
      strength
      durability
    }
  }
}

And metrics should calculate:

mp_graphql_seconds_count{name="statsInfo",source="true",type="QUERY",} 0.0
mp_graphql_seconds_sum{name="statsInfo",source="true",type="QUERY",} 0.0

And afterDataFetch should fire for source fields and i see in logs:

2343534452 - Start fetch data: statsInfo
2343534452 - Complete fetch data: statsInfo

And “metrics map” should not increase size.

Actual behavior

Metrics have zero values:

mp_graphql_seconds_count{name="statsInfo",source="true",type="QUERY",} 0.0
mp_graphql_seconds_sum{name="statsInfo",source="true",type="QUERY",} 0.0

Only beforeDataFetch message in logs.

2343534452 - Start fetch data: statsInfo

And “metrics map” increase size after each execution.

== startTimes map size - 10 ==

How to Reproduce?

Reproducer example: https://github.com/akHelix4444/Quarkus-Playground/tree/r-gql-events-metrics Branch: r-gql-events-metrics

Execute queries:

query findHeroById {
  selectedHero: hero(heroId: "100") {
    __typename
    id
    name
    slug
    stats: statsInfo {
      __typename
      intelligence
      power
      combat
      speed
      strength
      durability
    }
  }
}
query heroesList {
  heroes: heroCollection {
    __typename
    id
    name
    slug
    stats: statsInfo {
      __typename
      intelligence
      power
      combat
      speed
      strength
      durability
    }
  }
}

See logs and metrics values as describe above.

Output of uname -a or ver

No response

Output of java -version

openjdk version “17.0.7” 2023-04-18 OpenJDK Runtime Environment Temurin-17.0.7+7 (build 17.0.7+7) OpenJDK 64-Bit Server VM Temurin-17.0.7+7 (build 17.0.7+7, mixed mode, sharing)

GraalVM version (if different from Java)

No response

Quarkus version or git rev

2.16.7.Final (2.13.8.Final - same behaviour)

Build tool (ie. output of mvnw --version or gradlew --version)

Gradle 8.1.1 Build time: 2023-04-21 12:31:26 UTC Revision: 1cf537a851c635c364a4214885f8b9798051175b Kotlin: 1.8.10 Groovy: 3.0.15 Ant: Apache Ant™ version 1.10.11 compiled on July 10 2021 JVM: 17.0.7 (Eclipse Adoptium 17.0.7+7) OS: Windows 11 10.0 amd64

Additional information

No response

About this issue

  • Original URL
  • State: open
  • Created a year ago
  • Comments: 18 (9 by maintainers)

Most upvoted comments

I confirm it’s not something I want to backport.

@akHelix4444 I’m afraid the internal changes are quite substantial so I’m hesitant to backport them into a stream where we should be really conservative 😦

FYI this will (hopefully) be resolved via https://github.com/smallrye/smallrye-graphql/pull/1875 in Quarkus 3.3

It is possible to write a CDI interceptor that handles asynchronous methods correctly, but it needs work. Here’s a simple example, in the real world, it will likely be more complex (e.g. will need to handle Uni in addition to / instead of CompletionStage):

@MyInterceptorBinding
@Interceptor
@Priority(1)
public class MyInterceptor {
    @AroundInvoke
    Object intercept(InvocationContext ctx) {
        if (isAsync(ctx.getMethod())) {
            // do something before proceeding
            CompletableFuture result = new CompletableFuture();
            CompletionStage future = (CompletionStage) ctx.proceed(); // may need `try` / `catch` to handle exceptions
            future.whenComplete((value, error) -> {
                // do something after proceeding
                result.complete(...); // or `completeExceptionally(...)`
            });
            return result;
        } else {
            // do something before proceeding
            Object result = ctx.proceed(); // may need `try` / `catch` to handle exceptions
            // do something after proceeding
            return result;
        }
    }

    boolean isAsync(Method method) {
        return CompletionStage.class.isAssignableFrom(method.getReturnType());
    }
}

I think reactive code (asynchronous return types) will be a problem when using CDI interceptors, these are called right after the method itself returns, not when the asynchronous code finishes.

We don’t make any difference between successful and failed operations for metric purposes. And with the current eventing model, it may not be possible at all. In fact, I think the current eventing model is unsuitable for collecting metrics in general. We will probably need to rethink metrics completely, and collect them by instrumenting data fetchers instead.