spring-cloud-sleuth: Correlation scope misaligned error in a Spring Cloud Gateway application
Use case
I’m trying to integrate Spring Cloud Sleuth into a Spring Cloud Gateway application. My final goal is to add a custom field to the trace context, propagate it to downstream services, and log that custom field through the MDC.
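(For reference, a typical way to wire such a field with Sleuth 3.x on the Brave bridge looks roughly like the sketch below. This is an illustrative assumption, not code from the sample repository; the field name customId comes from the repro URL, and it relies on Sleuth picking up BaggagePropagationCustomizer and CorrelationScopeCustomizer beans.)

import brave.baggage.BaggageField;
import brave.baggage.BaggagePropagationConfig.SingleBaggageField;
import brave.baggage.BaggagePropagationCustomizer;
import brave.baggage.CorrelationScopeConfig.SingleCorrelationField;
import brave.baggage.CorrelationScopeCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class CustomIdBaggageConfiguration {

    // The custom field that should travel with the trace context.
    private static final BaggageField CUSTOM_ID = BaggageField.create("customId");

    // Propagate the field to downstream services as a remote baggage field.
    @Bean
    BaggagePropagationCustomizer customIdBaggageCustomizer() {
        return builder -> builder.add(SingleBaggageField.remote(CUSTOM_ID));
    }

    // Mirror the field into the MDC and flush the MDC whenever the value changes;
    // flushOnUpdate() is the "flush to MDC on field updates" behaviour mentioned below.
    @Bean
    CorrelationScopeCustomizer customIdCorrelationCustomizer() {
        return builder -> builder.add(
                SingleCorrelationField.newBuilder(CUSTOM_ID).flushOnUpdate().build());
    }
}

A gateway filter can then set the value from the incoming request with CUSTOM_ID.updateValue(...).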
Describe the bug
When the application receives an HTTP request, an assertion error is thrown and the application returns a 500:
java.lang.AssertionError: Misalignment: popped updateScope false != expected false
at brave.baggage.CorrelationFlushScope.popCurrentUpdateScope(CorrelationFlushScope.java:86) ~[brave-5.13.2.jar!/:na]
Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Error has been observed at the following site(s):
|_ checkpoint ⇢ org.springframework.cloud.sleuth.instrument.web.TraceWebFilter [DefaultWebFilterChain]
|_ checkpoint ⇢ HTTP GET "/get?customId=1" [ExceptionHandlingWebHandler]
Stack trace:
at brave.baggage.CorrelationFlushScope.popCurrentUpdateScope(CorrelationFlushScope.java:86) ~[brave-5.13.2.jar!/:na]
at brave.baggage.CorrelationFlushScope.close(CorrelationFlushScope.java:37) ~[brave-5.13.2.jar!/:na]
at brave.baggage.CorrelationUpdateScope$Multiple.close(CorrelationUpdateScope.java:108) ~[brave-5.13.2.jar!/:na]
at org.springframework.cloud.sleuth.brave.bridge.BraveScope.close(BraveCurrentTraceContext.java:99) ~[spring-cloud-sleuth-brave-3.0.3.jar!/:3.0.3]
at org.springframework.cloud.sleuth.instrument.web.TraceWebFilter$MonoWebFilterTrace.subscribe(TraceWebFilter.java:185) ~[spring-cloud-sleuth-instrumentation-3.0.3.jar!/:3.0.3]
at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:64) [reactor-core-3.4.9.jar!/:3.4.9]
at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52) ~[reactor-core-3.4.9.jar!/:3.4.9]
at reactor.core.publisher.Mono.subscribe(Mono.java:4338) ~[reactor-core-3.4.9.jar!/:3.4.9]
at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.subscribeNext(MonoIgnoreThen.java:255) ~[reactor-core-3.4.9.jar!/:3.4.9]
at reactor.core.publisher.MonoIgnoreThen.subscribe(MonoIgnoreThen.java:51) ~[reactor-core-3.4.9.jar!/:3.4.9]
at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:64) [reactor-core-3.4.9.jar!/:3.4.9]
at reactor.netty.http.server.HttpServer$HttpServerHandle.onStateChange(HttpServer.java:915) ~[reactor-netty-http-1.0.10.jar!/:1.0.10]
at reactor.netty.ReactorNetty$CompositeConnectionObserver.onStateChange(ReactorNetty.java:654) ~[reactor-netty-core-1.0.10.jar!/:1.0.10]
at reactor.netty.transport.ServerTransport$ChildObserver.onStateChange(ServerTransport.java:478) ~[reactor-netty-core-1.0.10.jar!/:1.0.10]
at reactor.netty.http.server.HttpServerOperations.onInboundNext(HttpServerOperations.java:517) ~[reactor-netty-http-1.0.10.jar!/:1.0.10]
at reactor.netty.channel.ChannelOperationsHandler.channelRead(ChannelOperationsHandler.java:93) ~[reactor-netty-core-1.0.10.jar!/:1.0.10]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at reactor.netty.http.server.HttpTrafficHandler.channelRead(HttpTrafficHandler.java:208) ~[reactor-netty-http-1.0.10.jar!/:1.0.10]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) ~[netty-codec-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) ~[netty-codec-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) ~[netty-transport-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) ~[netty-common-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.67.Final.jar!/:4.1.67.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.67.Final.jar!/:4.1.67.Final]
at java.lang.Thread.run(Unknown Source) ~[na:1.8.0_201]
The error seems to occur when spring.sleuth.reactor.instrumentation-type=decorate_queues and when the application is configured to flush to MDC on field updates. The error does not occur when spring.sleuth.reactor.instrumentation-type=decorate_on_each.
Sample
A test case is provided in this repo: https://github.com/ntuveri/mdc-scope-misaligned. The error can be reproduced with the following commands:
.\gradlew.bat bootJar
java -enableassertions -jar .\build\libs\mdc-scope-misaligned-0.0.1-SNAPSHOT.jar
- Request the URL http://localhost:8080/get?customId=1
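(As a quick check of the dependence on the instrumentation type described above, the sample can also be started with the property overridden on the command line, using Spring Boot's standard command-line property override; this is not part of the original repro steps.)

java -enableassertions -jar .\build\libs\mdc-scope-misaligned-0.0.1-SNAPSHOT.jar --spring.sleuth.reactor.instrumentation-type=decorate_on_each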
I will be glad to provide further information.
About this issue
- State: closed
- Created 3 years ago
- Reactions: 8
- Comments: 22 (8 by maintainers)
Commits related to this issue
- Fixing invalid scoping in TraceHandlerFunction; fixes gh-2026 — committed to spring-cloud/spring-cloud-sleuth by marcingrzejszczak 2 years ago
Yeah that’s possible. I’ll try to look into that ASAP. The Reactor instrumentation is… problematic 😬
Thanks @marcingrzejszczak! Yes, I can file a PR with the solution between now and tomorrow.
Hi @marcingrzejszczak @ntuveri
I have the same problem in my project when I run it with the decorate_queues instrumentation type. I first debugged my own application and then confirmed that the same problem happens in @ntuveri's example. Looking at the HookRegisteringBeanDefinitionRegistryPostProcessor class (in the TraceReactorAutoConfiguration.java file), the restoreTheContext() method re-creates a scope that is never closed. That is the problem: a CorrelationUpdateScope instance is created in the decorateScope(…) method of the CorrelationScopeDecorator class, and CorrelationFlushScope pushes that instance onto its updateScopeStack, but it is never popped because the scope object created in restoreTheContext() is never closed.
When the error happens (Misalignment: popped updateScope false != expected false), I have verified that the popped CorrelationUpdateScope instance is the one created from the scope generated in the restoreTheContext() method. The instrumentation tries to close a scope, but its associated CorrelationUpdateScope is not the last one on the stack, so the close fails.
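To illustrate the mechanism, here is a simplified stand-in for the LIFO invariant that CorrelationFlushScope enforces (these are not Brave's actual classes, just a model): a scope that is opened but never closed leaves an entry on the stack, so the next properly paired close pops the wrong entry and trips the assertion.

import java.util.ArrayDeque;
import java.util.Deque;

// Simplified model of the update-scope stack. Run with -enableassertions to see the failure.
public class ScopeStackDemo {

    private static final Deque<String> updateScopeStack = new ArrayDeque<>();

    // Opening a scope pushes an entry; closing it must pop back that same entry.
    static AutoCloseable openScope(String name) {
        updateScopeStack.push(name);
        return () -> {
            String popped = updateScopeStack.pop();
            assert popped.equals(name) : "Misalignment: popped " + popped + " != expected " + name;
        };
    }

    public static void main(String[] args) throws Exception {
        AutoCloseable request = openScope("request");

        // Stand-in for the scope re-created in restoreTheContext(): it is opened
        // but never closed, so its entry stays on the stack.
        openScope("restored-but-never-closed");

        // The request scope now pops the leaked entry instead of its own one,
        // which corresponds to the "Misalignment" AssertionError in the report.
        request.close();
    }
}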
Sorry if my explanation is not clear enough. If additional information is needed, I can try to explain the problem in more detail.
Thanks for your time!
Most likely this was written before I wrote the proper Reactor instrumentation and I forgot to remove it. Can you file a PR with a fix? We could then see whether things still work fine.
Unless the "on each" or "on queue wrapping" instrumentation works OK for you, the solution for now would be to go with manual instrumentation.
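A sketch of what the manual route can look like (adapted from the manual-instrumentation approach described in the Sleuth 3.x docs; the controller, route and log message are illustrative only, and the WebFluxSleuthOperators signatures should be checked against the Sleuth version in use). It assumes spring.sleuth.reactor.instrumentation-type=manual:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.cloud.sleuth.instrument.web.WebFluxSleuthOperators;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Mono;
import reactor.core.publisher.SignalType;

@RestController
class ManualTracingController {

    private static final Logger log = LoggerFactory.getLogger(ManualTracingController.class);

    @GetMapping("/manual")
    Mono<String> manual() {
        return Mono.just("hello")
                .map(String::toUpperCase)
                // Explicitly bring the current span (and its MDC fields) into scope
                // for this signal instead of relying on queue decoration.
                .doOnEach(WebFluxSleuthOperators.withSpanInScope(SignalType.ON_NEXT,
                        signal -> log.info("handling [{}]", signal.get())));
    }
}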