spring-kafka: RetryableTopic - breaking changes in the error handler

In what version(s) of Spring for Apache Kafka are you seeing this issue?

2.8.4.

Describe the bug

ListenerContainerFactoryConfigurer now creates its error handler here:

https://github.com/spring-projects/spring-kafka/blob/93efeb4f604b5c96810a57bc584c18c194b7c321/spring-kafka/src/main/java/org/springframework/kafka/retrytopic/ListenerContainerFactoryConfigurer.java#L235-L246

There are two breaking changes/regressions compared to version 2.8.3:

  • createDefaultErrorHandlerInstance no longer provides a “no retry” back off strategy by default. As a result, a to-be-recovered record is retried 10 times instead of being sent to the next recovery topic immediately.
  • errorHandler.defaultFalse completely disables the default classification of the fatal exceptions (such as DeserializationException.class). In fact, every classification request now returns false, regardless of the exception type.
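To make the second point concrete, here is a minimal, self-contained sketch of how I understand the classification change. It is a simplified model and not the spring-kafka code: BinaryClassifier, FatalException, withFatalDefaults and afterDefaultFalse are hypothetical names, and I am assuming that a result of true means "retryable" while false means "hand the record to the recoverer right away".

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ClassifierSketch {

    /** Hypothetical stand-in for a binary exception classifier; true means "retryable". */
    public static final class BinaryClassifier {
        private final Map<Class<? extends Throwable>, Boolean> rules;
        private final boolean defaultValue;

        public BinaryClassifier(Map<Class<? extends Throwable>, Boolean> rules, boolean defaultValue) {
            this.rules = new LinkedHashMap<>(rules);
            this.defaultValue = defaultValue;
        }

        /** Returns the value of the first matching rule, or the default when no rule matches. */
        public boolean classify(Throwable t) {
            for (Map.Entry<Class<? extends Throwable>, Boolean> rule : rules.entrySet()) {
                if (rule.getKey().isInstance(t)) {
                    return rule.getValue();
                }
            }
            return defaultValue;
        }
    }

    /** Hypothetical stand-in for a fatal exception such as DeserializationException. */
    public static final class FatalException extends RuntimeException { }

    /** Assumed previous behaviour: fatal exceptions are explicitly not retryable, the rest are. */
    public static BinaryClassifier withFatalDefaults() {
        Map<Class<? extends Throwable>, Boolean> rules = new LinkedHashMap<>();
        rules.put(FatalException.class, false);
        return new BinaryClassifier(rules, true);
    }

    /** Model of the reported behaviour: no rules are left and the default is false. */
    public static BinaryClassifier afterDefaultFalse() {
        return new BinaryClassifier(new LinkedHashMap<>(), false);
    }

    public static void main(String[] args) {
        Exception ordinary = new IllegalStateException("boom");
        Exception fatal = new FatalException();

        System.out.println(withFatalDefaults().classify(ordinary)); // true
        System.out.println(withFatalDefaults().classify(fatal));    // false
        System.out.println(afterDefaultFalse().classify(ordinary)); // false
        System.out.println(afterDefaultFalse().classify(fatal));    // false
    }
}
```

In this model the classifier can no longer tell a fatal exception from an ordinary one once the defaults are replaced, which is the behaviour described above.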

The second change causes every back off exception to be printed at error level due to the code here:

https://github.com/spring-projects/spring-kafka/blob/f0ad7b0c0b2005eb2fd58fd361eacbfe007c7284/spring-kafka/src/main/java/org/springframework/kafka/listener/FailedRecordProcessor.java#L143-L164

getClassifier().classify(thrownException) now always returns false, so this.failureTracker.getRecoverer().accept is invoked directly. That call may throw a back off exception, which is caught and logged there and therefore no longer reaches the special back off exception handler here:

https://github.com/spring-projects/spring-kafka/blob/93efeb4f604b5c96810a57bc584c18c194b7c321/spring-kafka/src/main/java/org/springframework/kafka/listener/SeekUtils.java#L109-L113
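The flow I am describing can be modelled roughly as follows. This is only a sketch of my reading of the linked code, not a copy of it: RecoveryFlowSketch, BackOffSignal and handle are hypothetical names, and the returned strings stand in for what the real code does (track the failure, recover, or log at error level).

```java
import java.util.function.Consumer;
import java.util.function.Predicate;

public class RecoveryFlowSketch {

    /** Hypothetical stand-in for the back off exception thrown by the recoverer. */
    public static final class BackOffSignal extends RuntimeException {
        public BackOffSignal(String message) {
            super(message);
        }
    }

    /**
     * Simplified model of the decision: retryable failures go to the failure tracking
     * path, everything else is passed to the recoverer immediately.
     */
    public static String handle(Predicate<Exception> retryable,
                                Consumer<Exception> recoverer,
                                Exception failure) {
        if (retryable.test(failure)) {
            return "RETRY";
        }
        try {
            recoverer.accept(failure);
            return "RECOVERED";
        } catch (RuntimeException ex) {
            // Generic catch: a back off signal is reported like any other recovery failure
            // and never propagates to a handler that would treat it specially.
            return "ERROR: recovery failed: " + ex.getMessage();
        }
    }

    public static void main(String[] args) {
        // Recoverer that always signals a back off, as when the retry delay has not elapsed.
        Consumer<Exception> backingOff = e -> {
            throw new BackOffSignal("partition is backing off");
        };
        Exception failure = new IllegalStateException("boom");

        // Classifier that always answers false, as reported for 2.8.4.
        System.out.println(handle(e -> false, backingOff, failure));
        // Classifier that answers true: the recoverer is not called at this point.
        System.out.println(handle(e -> true, backingOff, failure));
    }
}
```

With the always-false classifier, every back off signal ends up in the generic catch, which matches the error-level log entries I am seeing.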

To Reproduce

Unfortunately, I do not have a sample right now, but I will try to prepare one this week. I believe that any application that uses the retryable topics recovery mechanism is affected.

Expected behavior

I expect the default behaviour of the retryable topics recovery system to be identical to version 2.8.3.

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 20 (12 by maintainers)

Most upvoted comments

No problem, I totally understand that starting to see ERROR logs everywhere would lead to worries.

It was really good that you brought this up early - now we have had a chance to double-check the behavior and confirm that the changes we made are OK functionality-wise, and we can also start looking for a workaround and a fix for the logs. So thanks a lot for that.

Especially because, well, if that’s really the case, people should be seeing these logs everywhere, so I expect to see this issue popping up again here and on StackOverflow.

It also occurred to me that the KafkaBackOffException may get in the way of the blocking retries and count against them, so I’ll also look into that.