spring-boot: Memory Leak in WebFlux 3.0 + micrometer-tracing-bridge-brave + enableAutomaticContextPropagation

Hi,

In production, I upgraded a Spring Boot WebFlux + Zipkin app from 2.7.x to 3.0.4 and noticed a memory leak in the heap.

I’ve distilled the problem down: the memory leak occurs with the combination of WebFlux 3.0 + micrometer-tracing-bridge-brave + enableAutomaticContextPropagation. https://github.com/davidmelia/spring-cloud-function-zipkin/tree/memory_leak shows an example. If you run the app and repeatedly hit the health check, the heap fills up over time:

#!/bin/bash
for i in {1..500}
do
   curl http://localhost:8080/actuator/health
done

You will notice brave.baggage.CorrelationUpdateScope$Multiple is never totally reclaimed by garbage collection and builds up over time:

[image]

I’ve loaded the heap dump into the Eclipse Memory Analyzer, and it looks like reactor.netty.resources.DefaultLoopResources$EventLoop is holding on to thread locals:

[image]

Drilling into the thread local, there is a huge array of brave.baggage.CorrelationUpdateScope$Multiple:

[image]
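To illustrate the kind of retention the heap dump suggests, here is a minimal, self-contained sketch. It is a simplified model and not Brave's or Reactor Netty's actual code: the class `ScopeLeakSketch`, the `SCOPES` thread local, and the `run` method are hypothetical names invented for this example. It assumes the pattern is "a long-lived worker thread pushes a scope object into a thread-local collection per request and never removes it".

```java
import java.util.ArrayDeque;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ScopeLeakSketch {
    // Hypothetical stand-in for a per-thread stack of correlation scopes.
    private static final ThreadLocal<ArrayDeque<Object>> SCOPES =
            ThreadLocal.withInitial(ArrayDeque::new);

    /**
     * Simulates `requests` requests on one long-lived worker thread and
     * returns how many scope objects that thread's thread local still holds.
     */
    public static int run(int requests, boolean closeScopes) throws Exception {
        // A single reused thread plays the role of an event loop thread.
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        try {
            for (int i = 0; i < requests; i++) {
                Runnable request = () -> {
                    Object scope = new Object();
                    SCOPES.get().push(scope);       // "open" a scope
                    if (closeScopes) {
                        SCOPES.get().remove(scope); // "close" it again
                    }
                };
                eventLoop.submit(request).get();
            }
            Callable<Integer> retained = () -> SCOPES.get().size();
            return eventLoop.submit(retained).get();
        } finally {
            eventLoop.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("scopes never closed: " + run(500, false)); // prints 500
        System.out.println("scopes closed:       " + run(500, true));  // prints 0
    }
}
```

Because the worker thread outlives every request, anything left in its thread local stays strongly reachable and cannot be garbage collected, which would be consistent with the growing array seen in the heap dump.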

N.B. I appreciate this looks like a problem with either Hooks.enableAutomaticContextPropagation or micrometer-tracing-bridge-brave; however, it manifests in Spring Boot WebFlux.

(relates to https://github.com/spring-projects/spring-boot/issues/34201)

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 5
  • Comments: 24 (14 by maintainers)

Most upvoted comments

I have a workaround to get rid of the memory leak. Set this property:

management:
  tracing:
    baggage:
      correlation:
        enabled: false

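and define this bean:

import java.util.List;

import brave.baggage.BaggageField;
import brave.baggage.CorrelationScopeConfig;
import brave.baggage.CorrelationScopeCustomizer;
import org.springframework.boot.actuate.autoconfigure.tracing.TracingProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.core.annotation.Order;

// Registers each configured correlation field without flush-on-update.
@Bean
@Order(1)
CorrelationScopeCustomizer myCorrelationFieldsCorrelationScopeCustomizer(TracingProperties tracingProperties) {
  return (builder) -> {
    List<String> correlationFields = tracingProperties.getBaggage().getCorrelation().getFields();
    for (String field : correlationFields) {
      builder.add(CorrelationScopeConfig.SingleCorrelationField.newBuilder(BaggageField.create(field)).build());
    }
  };
}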

In essence, it removes the flush-on-update feature from Brave that is causing the memory leak. We’re still investigating the whole problem.

BTW @wilkinsona, I think this bug applies to all WebFlux projects, and we should disable the flush-on-update feature by default for now.

Until we release Micrometer 1.10.10 and all the forward-merged branches (that release will happen some time within the next month), the problem should go away if you add this snippet:


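import io.micrometer.observation.ObservationRegistry;
import io.micrometer.observation.contextpropagation.ObservationThreadLocalAccessor;
import jakarta.annotation.PostConstruct;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Configuration;

@Configuration
class Config {

   @Autowired ObservationRegistry or;

   // Point the shared accessor at the application's ObservationRegistry.
   @PostConstruct
   void setup() {
      ObservationThreadLocalAccessor.getInstance().setObservationRegistry(or);
   }
}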

This code can be removed after we release the next patch versions of micrometer projects.

Micrometer Tracing 1.0.6 and 1.1.1 were released. Boot is already using these versions, and the next release should contain them. You can:

  1. Upgrade your dependencies and force the fixed Micrometer Tracing version.
  2. Wait for the Boot release (planned tomorrow).
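For option 1, a minimal sketch of forcing the version could look like the fragment below. It assumes a Maven build that inherits from spring-boot-starter-parent and that Boot's dependency management exposes the `micrometer-tracing.version` property; please check this against your own build before relying on it. Use 1.1.1 instead if you are on the 1.1.x line.

```xml
<!-- pom.xml: override the Micrometer Tracing version managed by Spring Boot (assumed property name) -->
<properties>
  <micrometer-tracing.version>1.0.6</micrometer-tracing.version>
</properties>
```

The override can be removed once you move to a Boot release that already manages a fixed version.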

We had an issue with the release of Micrometer Tracing. Once we’ve fixed it, we will let you know.

Sorry @marcingrzejszczak, to be clear: when I say "same issue", I mean that it now works (i.e. no stack trace), but I still get the original memory leak.