spring-cloud-sleuth: o.s.c.s.a.z.ZipkinAutoConfiguration : Check result of the [WebClientSender{https://my-zipkin:443/api/v2/spans}] contains an error [CheckResult{ok=false, error=java.util.concurrent.TimeoutException: WebClientSender{https://my-zipkin:443/api/v2/spans} check() timed out after 1000ms}]
Hello Spring Cloud Sleuth Team,
I am reaching out because I am observing a strange issue and would like to report it.
Setup: The application uses Spring Boot 2.6.4 + Spring Cloud Jubilee 2021.0.1. We also have a Zipkin server running fine; in my example, I call it https://my-zipkin:443. Here is the Sleuth/Zipkin part of my pom:
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.6.4</version>
<relativePath/>
</parent>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-dependencies</artifactId>
<version>2021.0.1</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-webflux</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-cassandra-reactive</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-security</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-kubernetes-fabric8</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-config-client</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-sleuth-zipkin</artifactId>
</dependency>
<dependency>
<groupId>io.projectreactor.netty</groupId>
<artifactId>reactor-netty-http-brave</artifactId>
</dependency>
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
<dependency>
<groupId>io.github.resilience4j</groupId>
<artifactId>resilience4j-micrometer</artifactId>
</dependency>
<dependency>
<groupId>io.github.resilience4j</groupId>
<artifactId>resilience4j-reactor</artifactId>
</dependency>
<dependency>
<groupId>io.github.resilience4j</groupId>
<artifactId>resilience4j-spring-boot2</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-contract-verifier</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-core</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.lmax</groupId>
<artifactId>disruptor</artifactId>
<version>3.4.4</version>
</dependency>
<dependency>
<groupId>org.springframework.statemachine</groupId>
<artifactId>spring-statemachine-core</artifactId>
<version>3.0.1</version>
</dependency>
<dependency>
<groupId>net.logstash.logback</groupId>
<artifactId>logstash-logback-encoder</artifactId>
<version>7.0.1</version>
</dependency>
<dependency>
<groupId>de.codecentric</groupId>
<artifactId>spring-boot-admin-starter-client</artifactId>
<version>2.6.2</version>
</dependency>
<dependency>
<groupId>org.springdoc</groupId>
<artifactId>springdoc-openapi-webflux-ui</artifactId>
<version>1.6.6</version>
</dependency>
</dependencies>
This is 100% reproducible. On each application startup, I see:
2022-03-05 12:16:35.477 WARN [myservice,,] 10 --- [ main] o.s.c.s.a.z.ZipkinAutoConfiguration : Check result of the [WebClientSender{https://my-zipkin:443/api/v2/spans}] contains an error [CheckResult{ok=false, error=java.util.concurrent.TimeoutException: WebClientSender{https://my-zipkin:443/api/v2/spans} check() timed out after 1000ms}]
The thing is, I am monitoring the Zipkin server in parallel, and it is up and running fine! Other, non-Spring Boot applications connect to it without problems at the same timestamps.
What is also strange is that later, the same app (the one logging the warning) receives requests, and I can search for and see the full traces in the Zipkin server, which proves that it successfully connected to the Zipkin server.
Therefore, I would like to report this issue, as I do not understand what is going on. Why is there a timeout when the server is apparently working fine and the client is able to send traces to it?
Is there a way to increase this timeout from 1000ms to something a bit higher?
Thank you
About this issue
- State: closed
- Created 2 years ago
- Reactions: 2
- Comments: 17 (8 by maintainers)
Hi, I also encountered this problem: the error appears when using Reactor. How did you solve the timeout problem?
As I said previously, I’m using Jaeger (docker command in my first post), which provides Zipkin compatibility.
Indeed, Jaeger returns an HTTP 400 error if I don’t provide the “Content-Type” header, whereas the Zipkin server doesn’t seem to care about that header.
It feels like the header should be present, as we send a POST request with a JSON body. Note also that the RestTemplate implementation explicitly sets this header, and the WebClient one doesn’t.
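To make the header point concrete, here is a minimal sketch of a POST request that carries a JSON body together with an explicit “Content-Type” header. It uses the JDK's `java.net.http` API rather than Sleuth's sender classes; the class name `SpanPostRequest`, the helper `emptySpanListPost`, and the empty-array payload `[]` are assumptions made for illustration and are not taken from the Sleuth code base. The request is only built and inspected, never sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

class SpanPostRequest {

    // Hypothetical helper (not part of Sleuth): builds a POST request with a JSON
    // body and an explicit Content-Type header, without sending it.
    static HttpRequest emptySpanListPost(String endpoint) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                // The header that the comment above reports as required by Jaeger.
                .header("Content-Type", "application/json")
                // Assumption: an empty JSON array is used as a harmless probe payload.
                .POST(HttpRequest.BodyPublishers.ofString("[]", StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = emptySpanListPost("https://my-zipkin:443/api/v2/spans");
        System.out.println(request.method() + " " + request.uri());
        System.out.println("Content-Type: "
                + request.headers().firstValue("Content-Type").orElse("<missing>"));
    }
}
```

Dropping the `.header(...)` call gives a request without the header, which corresponds to the situation in which Jaeger is reported to answer with HTTP 400.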
So that clears up the first problem, the HTTP error. Still, the WebClient implementation seems to be buggy: it clearly shows a timeout exception, but my curl command responds in ~5ms. Have you tested my example app?
It times out in both cases (Zipkin or Jaeger); I’ve just tested it.
If I start with the “spring-boot-starter-web” dependency, there are no issues. If I start with the “spring-boot-starter-webflux” dependency instead, I get the timeout.
So it’s clearly related to the WebClient implementation of the checkResult.
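For readers unfamiliar with deadline-based checks, the following sketch shows the general pattern: a probe runs asynchronously and the caller waits for a fixed time, so a timeout is reported whenever the probe has not completed by the deadline, whether or not the server behind it is healthy. This is not Sleuth's implementation; `DeadlineCheck`, `check`, and `slowProbe` are hypothetical names, and the sleep only simulates a probe that takes longer than the deadline.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

class DeadlineCheck {

    // Hypothetical helper: runs the probe on another thread and waits at most
    // timeoutMillis for its result.
    static String check(Supplier<String> probe, long timeoutMillis) {
        CompletableFuture<String> result = CompletableFuture.supplyAsync(probe);
        try {
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The probe may still finish later; only the wait was abandoned.
            return "timed out after " + timeoutMillis + "ms";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } catch (ExecutionException e) {
            return "failed: " + e.getCause();
        }
    }

    // Simulates a probe that needs delayMillis before it can answer "ok".
    static String slowProbe(long delayMillis) {
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "ok";
    }

    public static void main(String[] args) {
        System.out.println(check(() -> "ok", 1000));
        System.out.println(check(() -> slowProbe(500), 50));
    }
}
```

The second call reports a timeout even though the probe would eventually have returned "ok", which is the same shape of result as the warning in this issue: the check gives up, while later span reporting can still succeed.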
Just saw this today with the latest 2021.0.2. Wanted to say thank you @marcingrzejszczak and thanks everyone on this thread