tracing: Traceid and spanid wrong when using Kotlin Reactor Coroutines
Hello. When using Kotlin Reactor Coroutines like the following:
return mono {
val text = "Greetings from Spring Boot!"
logger.error(text)
text
}
then the traceId and spanId that are being included in the log statement above look like the following:
ERROR [00000000000000000000000000000000,0000000000000000] 80834 --- [atcher-worker-1] com.example.demo.HelloController : Greetings from Spring Boot!
The traceId and spanId in the logs are always 0 (zeros) .
This happens only for log statements inside such mono {}
blocks. Other log statements outside of these blocks work as expected.
I have created a sample reproducer project here:
https://github.com/akoufa/springbootissues
Just start the Spring Boot app and use the following curl request:
curl --location 'localhost:8080/'
About this issue
- Original URL
- State: open
- Created a year ago
- Reactions: 3
- Comments: 52 (2 by maintainers)
Commits related to this issue
- Adds distributed tracing but with bug in coroutines context: https://github.com/micrometer-metrics/tracing/issues/174 — committed to omprakashsridharan/spring-boot-playground by omprakashsridharan a year ago
Hey, I am starting working on that and may ask guidance to Kotin team on the best way to fix that. This kind of issue seems to impact both Micrometer and Spring as soon as
kotlinx.coroutines.reactor.mono
is involved.Hint: you can also use
spring.reactor.context-propagation=auto
property instead of callingHooks.enableAutomaticContextPropagation()
by hand since Spring Boot 3.2.0.FYI, on the Spring side, for Controllers, the coroutine entry point is provided by
InvocableHandlerMethod
class.For Web MVC:
https://github.com/spring-projects/spring-framework/blob/e5ee369e70e0a8ca04437f502649b11171a1c741/spring-web/src/main/java/org/springframework/web/method/support/InvocableHandlerMethod.java#L201
Its
doInvoke(...)
checks if the function is a Kotlin suspend function and, if so, ultimately callsCoroutinesUtils.invokeSuspendingFunction(...)
Notably, it does not give a
CoroutineContext
object, so the coroutine is started with no context.For WebFlux:
https://github.com/spring-projects/spring-framework/blob/c3e28728ce5c11655831cbc20f828e956e1c42a1/spring-webflux/src/main/java/org/springframework/web/reactive/result/method/InvocableHandlerMethod.java#L133
Similar to Web MVC, it calls
CoroutinesUtils.invokeSuspendingFunction(...)
without giving aCoroutineContext
.Maybe Spring should take whatever is in Reactor Context at this point and copy it into Coroutine Context?
EDIT: I don’t have a suggested workaround, a fix probably needs to happen in Spring itself.
EDIT 2: There are two
InvocableHandlerMethod
implementations, one for Web MVC and one of WebFlux.@meberhard This works for me so please double check:
If you can reproduce, please create a Spring Framework issue with a self contained repro as an attached archive or a link to a repository.
@grassehh is correct -
enableAutomaticContextPropagation()
needs to be the first thing you do @meberhardI will have a deeper look to the 2 issues raised on Spring Framework side, and to this Stackoverflow question.
@meberhard looks like the
coRouter
implementation does not pick up the context that is stored byCoWebFilter
in an exchange attribute. In fact I suspect that it’s better to do the following:The only place in Spring Framework that is handling this attribute otherwise is this: https://github.com/spring-projects/spring-framework/blob/main/spring-webflux/src/main/java/org/springframework/web/reactive/result/method/InvocableHandlerMethod.java#L296 which is not getting called for
coRouter
handlers (just checked with a debugger).@sdeleuze I’ve got a question on the current
CoWebFilter
implementation. What happens if I actually have multipleCoWebFilter
implementations in my project and they all manipulate CoroutineContext somehow?Edit: looks like my question about CoWebFilter was already asked here: https://github.com/spring-projects/spring-framework/issues/27522#issuecomment-1734248442 . I’ll raise a separate issue to follow up.
For me, the tracing works as expected using the
CoWebFilter
(basically the same code that @grassehh shared.What I don’t get: Why doesn’t this work when I use spring’s
coRouter
instead of spring’s@RestController
? I can see that thefilter
function of theCoWebFilter
is also invoked when using thecoRouter
. But: The trace/mdccontext is dropped if I usedelay
withcoRouter
.On some additional research I found the following: https://github.com/spring-projects/spring-framework/issues/27010
So when I add the context to my
coRouter
, the tracing and mdcContext is working (also afterdelay
):What I don’t get:
CoWebFilter
is also invoked and sets the context - why is this not enough and I also have to set the context on the level of thecoRouter
?context
for thecoRouter
globally -> I like it better to have it set in a central way as inCoWebFilter
, otherwise I would have to add it to everycoRouter
.@grassehh here are the changes I made to your repo - basically just upgrading versions to get to Spring Framework 6.1 snapshots and the code in https://github.com/micrometer-metrics/tracing/issues/174#issuecomment-1727691192. With that I was able to get tracing context in MDC in controllers using coroutines: https://github.com/grassehh/spring-boot-3-tracing-coroutine/compare/main...shakuzen:CoWebFilter?expand=1
I did not look more into the logbook stuff.
I have just added the last part of Spring Framework 6.1
CoroutineContext
refinement: it is now possible to propagate it viaCoWebFilter
. https://github.com/spring-projects/spring-framework/issues/27522I have fixed the nested route + context issue via https://github.com/spring-projects/spring-framework/issues/31831, so please test and report any remaining problem with Spring Framework
6.1.2-SNAPSHOT
(release planned tomorrow).I tried removing the two lines
-> same result, no context/traceId in the
onError
method.Also tried that -> moved this to the WebConfig, same result (no context/traceId in the
onError
method).I tried it both - with the
ThreadLocal
stuff that you have in your example and without.Just FYI - with
@RestController
only the lineHooks.enableAutomaticContextPropagation()
is required - so probably we can drop the other lines also.good point -> I don’t even know why it worked at all with the
@RestController
like that. I anyway also moved this line before the run - same result: no context/traceId in theonError
call.Awesome! Thanks a lot for this 🙏
@marcingrzejszczak I am trying to use
observationRegistry.asContextElement()
but it seems to be losing my logging… a small snippet of how I am using itI am using spring boot and injecting observationRegistry, not sure if that is proper usage.
Also I can get the “has no context” logging properly if I change
withContext(observationRegistry.asContextElement()) {
withwithContext(MDCContext(context)) {
whereval context = MDC.getCopyOfContextMap()
is set before CoroutineScope. However that last logger.info always loses contextI didn’t say that. I’m just replying to @vooft comment