kotlinx.coroutines: Version 1.1.0 breaks referential equality of thrown exceptions

When the flag kotlinx.coroutines.stacktrace.recovery is set to true, as per the default, exceptions thrown within the library are mutated or cloned so they lose referential equality properties.

We found this regression in Arrow when updating our adapter module to 1.1.0. We run a test suite that included some referential equality tests in this version. These referential equality tests were removed in the update process, as you can see in master. At least another library was affected, Vert.x, as stated here.

Even if this one fix could be applied to keep the Arrow adapter and Vert.x releases going, there is merit to the discussion of avoiding different behaviors between debug and production, and breaking a property that’s respected by other concurrency libraries such as Project Reactor, RxJava, Scalaz, and Scala/Cats.

An alternative like Throwable#addSuppressed doesn’t break referential equality, even if it mutates the exception. Defaulting to false and making the environment variable more visible in the documentation is another viable alternative 😄

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Reactions: 10
  • Comments: 16 (9 by maintainers)

Most upvoted comments

@pakoito Type Classes have to be linked at build time, it cannot become a pluggable solution.

As alternative solution, it is possible to provide an external component to rebuild the right stack trace, so we does not have to reinstantiate the Throwable. Having this external mechanism, it is possible to log the right stack trace when coroutine fails, then throwing the original exception. This solution does not alter the execution flow and it fills the log with debugging information. Finally it is possible to provide something like fun Throwable.coroutineStackTrace() for developing purpose.

What we can do is to provide some way to indicate that exception is non-copyable or to instruct kotlinx.coroutines how to copy particular exception class.

We can either introduce a global settable state (global functions like disableCloning(exceptionClazz) etc.) or marker interfaces that exception class should implement. The former introduces a global settable state and may have conflicts between multiple libraries, but can be used with classes that you cannot change, while the latter is more intrusive for the application/library code.

I am open to discuss other solutions or concerns about proposed ones.

Fixed in 1.2.0-alpha

This behaviour is opt-out rather than opt-in to help newbies with exception handling. Making this behaviour opt-in is much less discoverable and useful.

The period of being a newbie is short, while there are many medior/experts (and probably some of those newbies too) that’ll get bitten by this issue and will need to go google it up, potentially hurting their product and losing valuable engineering time. Having it opt-in with documentation prevents this, and still helps newbies if instead you improve the discoverability and documentation of the parameter for them.

Not talking about the fact that broken stacktraces and complicated debugging are one of the most common complains about Reactor and RxJava.

Complicated debugging because it’s noisy and it doesn’t record thread jumps, not because there is anything broken about them nor they cause problems in client code. A completely different problem than the one described by this report.

We can either introduce a global settable state (global functions like disableCloning(exceptionClazz) etc.) or marker interfaces that exception class should implement. The former introduces a global settable state and may have conflicts between multiple libraries, but can be used with classes that you cannot change, while the latter is more intrusive for the application/library code.

With typeclasses as per KEEP-87 you could get the best of both 😃

In this case it’d have to be the first to work for any exception because that’s precisely the worst and most common case where you’ll wish it existed, and have documentation somewhere about the tradeoff and the fix.

I am having exactly the same issue. I rely on some error codes (int based) provided inside an Exception, and whatever the error code is originally, it always ends up being 0 once it has gone through “resumeWithException” inside my “suspendCoroutine” block. This change breaks all the error handling in my application.