quarkus: SmallRye GraphQL source fields metrics not working
Describe the bug
Good day to You!
I try use GraphQL and achieve Micrometer metrics for my endpoints. I read Docs and see base metrics implementation in SmallRye GraphQL repository.
But, when i test my sandbox project metrics and events (both EventingService implementation and Observes) behaviour looks strange.
Please, could you help me explain and resolve 3 things?
- Metrics for
@Sourcefields always 0 it is add on startup with other operations, but then don’t calculate becauseafterDataFetchobserver method never execute for Source fields; - I create batch
@Sourceoperation and in code this method execute once, but in same timebeforeDataFetchexecute lots of times for each record in my batch list; - Because
afterDataFetchnot execute for@Sourcefields context not removed from map (for metrics calculate) (Long startTime = startTimes.remove(context);) and map increment size.
Maybe i have a wrong view, but do we have a way for calculate execution metrics for source fields in same event base way like for general operations?
I try this in both default (Blocking) and NonBlocking way.
Expected behavior
I execute query 1:
query findHeroById {
selectedHero: hero(heroId: "100") {
__typename
id
name
slug
stats: statsInfo {
__typename
intelligence
power
combat
speed
strength
durability
}
}
}
and query 2:
query heroesList {
heroes: heroCollection {
__typename
id
name
slug
stats: statsInfo {
__typename
intelligence
power
combat
speed
strength
durability
}
}
}
And metrics should calculate:
mp_graphql_seconds_count{name="statsInfo",source="true",type="QUERY",} 0.0
mp_graphql_seconds_sum{name="statsInfo",source="true",type="QUERY",} 0.0
And afterDataFetch should fire for source fields and i see in logs:
2343534452 - Start fetch data: statsInfo
2343534452 - Complete fetch data: statsInfo
And “metrics map” should not increase size.
Actual behavior
Metrics have zero values:
mp_graphql_seconds_count{name="statsInfo",source="true",type="QUERY",} 0.0
mp_graphql_seconds_sum{name="statsInfo",source="true",type="QUERY",} 0.0
Only beforeDataFetch message in logs.
2343534452 - Start fetch data: statsInfo
And “metrics map” increase size after each execution.
== startTimes map size - 10 ==
How to Reproduce?
Reproducer example: https://github.com/akHelix4444/Quarkus-Playground/tree/r-gql-events-metrics Branch: r-gql-events-metrics
Execute queries:
query findHeroById {
selectedHero: hero(heroId: "100") {
__typename
id
name
slug
stats: statsInfo {
__typename
intelligence
power
combat
speed
strength
durability
}
}
}
query heroesList {
heroes: heroCollection {
__typename
id
name
slug
stats: statsInfo {
__typename
intelligence
power
combat
speed
strength
durability
}
}
}
See logs and metrics values as describe above.
Output of uname -a or ver
No response
Output of java -version
openjdk version “17.0.7” 2023-04-18 OpenJDK Runtime Environment Temurin-17.0.7+7 (build 17.0.7+7) OpenJDK 64-Bit Server VM Temurin-17.0.7+7 (build 17.0.7+7, mixed mode, sharing)
GraalVM version (if different from Java)
No response
Quarkus version or git rev
2.16.7.Final (2.13.8.Final - same behaviour)
Build tool (ie. output of mvnw --version or gradlew --version)
Gradle 8.1.1 Build time: 2023-04-21 12:31:26 UTC Revision: 1cf537a851c635c364a4214885f8b9798051175b Kotlin: 1.8.10 Groovy: 3.0.15 Ant: Apache Ant™ version 1.10.11 compiled on July 10 2021 JVM: 17.0.7 (Eclipse Adoptium 17.0.7+7) OS: Windows 11 10.0 amd64
Additional information
No response
About this issue
- Original URL
- State: open
- Created a year ago
- Comments: 18 (9 by maintainers)
I confirm it’s not something I want to backport.
@akHelix4444 I’m afraid the internal changes are quite substantial so I’m hesitant to backport them into a stream where we should be really conservative 😦
FYI this will (hopefully) be resolved via https://github.com/smallrye/smallrye-graphql/pull/1875 in Quarkus 3.3
It is possible to write a CDI interceptor that handles asynchronous methods correctly, but it needs work. Here’s a simple example, in the real world, it will likely be more complex (e.g. will need to handle
Uniin addition to / instead ofCompletionStage):I think reactive code (asynchronous return types) will be a problem when using CDI interceptors, these are called right after the method itself returns, not when the asynchronous code finishes.
We don’t make any difference between successful and failed operations for metric purposes. And with the current eventing model, it may not be possible at all. In fact, I think the current eventing model is unsuitable for collecting metrics in general. We will probably need to rethink metrics completely, and collect them by instrumenting data fetchers instead.