distributed: Exception chaining (raise Exception() from ...)

distributed 2.6.0 discards the traceback of all but the latest exception in a chain. Run locally, the following function reports both exceptions:

def f():
    try:
        raise Exception("foo")
    except Exception as e:
        raise Exception("bar") from e

f()

Output:

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-2-707a8f24b7c1> in f()
      2     try:
----> 3         raise Exception("foo")
      4     except Exception as e:

Exception: foo

The above exception was the direct cause of the following exception:

Exception                                 Traceback (most recent call last)
<ipython-input-2-707a8f24b7c1> in <module>
      5         raise Exception("bar") from e
      6 
----> 7 f()

<ipython-input-2-707a8f24b7c1> in f()
      3         raise Exception("foo")
      4     except Exception as e:
----> 5         raise Exception("bar") from e
      6 
      7 f()

Exception: bar

Submitted through distributed, only the last exception is reported:

import distributed
client = distributed.Client()
client.submit(f).result()

Output:

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-3-5d89c629add7> in <module>
----> 1 client.submit(f).result()

distributed/client.py in result(self, timeout)
    220         if self.status == "error":
    221             typ, exc, tb = result
--> 222             raise exc.with_traceback(tb)
    223         elif self.status == "cancelled":
    224             raise result

<ipython-input-2-707a8f24b7c1> in f()
      3         raise Exception("foo")
      4     except Exception as e:
----> 5         raise Exception("bar") from e
      6 
      7 f()

Exception: bar

Without looking at the source code of dask distributed, I can guess what the issue is:

import pickle
import sys

try:
    f()
except Exception:
    exc_type, exc, tb = sys.exc_info()

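# Note: plain pickle cannot serialize traceback objects; this round trip
# presumably works only because tblib's pickling support is active in the session.
exc, tb = pickle.loads(pickle.dumps((exc, tb)))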
raise exc.with_traceback(tb)

Output:

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-76-1fc21a18f772> in <module>
      8 
      9 exc, tb = pickle.loads(pickle.dumps((exc, tb)))
---> 10 raise exc.with_traceback(tb)

<ipython-input-76-1fc21a18f772> in <module>
      3 
      4 try:
----> 5     f()
      6 except Exception:
      7     exc_type, exc, tb = sys.exc_info()

<ipython-input-2-707a8f24b7c1> in f()
      3         raise Exception("foo")
      4     except Exception as e:
----> 5         raise Exception("bar") from e
      6 
      7 f()

Exception: bar
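
The guess can be checked with the standard library alone. The following is a minimal sketch, not distributed's actual code path: it pickles only the exception object (no traceback) and inspects what survives the round trip. The names `original` and `restored` are introduced here for illustration.

```python
import pickle


def f():
    try:
        raise Exception("foo")
    except Exception as e:
        raise Exception("bar") from e


try:
    f()
except Exception as exc:
    # Keep a reference; `exc` itself is unbound when the except block ends.
    original = exc

# Default exception pickling rebuilds the object from its type and args only.
restored = pickle.loads(pickle.dumps(original))

print(repr(original.__cause__))  # Exception('foo')
print(repr(restored.__cause__))  # None
print(restored.__traceback__)    # None
```

The restored exception has lost both `__cause__` and `__traceback__`, so anything that re-raises it can only show the last link of the chain.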

Solution:

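class ExceptionChainPickler:
    """Wrap an exception so that its explicit cause chain survives pickling."""

    def __init__(self, exc: BaseException):
        self.exc = exc

    def __reduce__(self):
        # Collect (exception, traceback) pairs by following __cause__.
        # Only explicit "raise ... from ..." links are followed;
        # implicit __context__ links are not.
        cur_exc = self.exc
        exc_chain = []
        while cur_exc:
            exc_chain.append((cur_exc, cur_exc.__traceback__))
            cur_exc = cur_exc.__cause__
        return self.expand, tuple(exc_chain)

    @staticmethod
    def expand(*exc_chain) -> BaseException:
        # Relink each exception to its traceback and to its cause.
        for (exc, tb), (cause, _) in zip(exc_chain, exc_chain[1:]):
            exc.__traceback__ = tb
            exc.__cause__ = cause
        # The last exception in the chain has no cause, only a traceback.
        exc_chain[-1][0].__traceback__ = exc_chain[-1][1]
        return exc_chain[0][0]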


try:
    f()
except Exception as exc:
    ep = ExceptionChainPickler(exc)

exc = pickle.loads(pickle.dumps(ep))
raise exc

Output:

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-2-707a8f24b7c1> in f()
      2     try:
----> 3         raise Exception("foo")
      4     except Exception as e:

Exception: foo

The above exception was the direct cause of the following exception:

Exception                                 Traceback (most recent call last)
<ipython-input-81-f7743f5d6c40> in <module>
     26 
     27 exc = pickle.loads(pickle.dumps(ep))
---> 28 raise exc

<ipython-input-81-f7743f5d6c40> in <module>
     21 
     22 try:
---> 23     f()
     24 except Exception as exc:
     25     ep = ExceptionChainPickler(exc)

<ipython-input-2-707a8f24b7c1> in f()
      3         raise Exception("foo")
      4     except Exception as e:
----> 5         raise Exception("bar") from e
      6 
      7 f()

Exception: bar

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 23 (11 by maintainers)

Most upvoted comments

If you work on a project with 10k lines of code that relies on third-party libraries worth another 200k lines, you very much don’t want your whole program to bomb out with a mysterious KeyError: "John Doe". Reproducing it will take a lot of work, even with a stack trace, because the stack trace says nothing about the context of the error, that is, all the other local and global variables. And if the exception was dumped to a log on a production server in semi-unknown conditions, good luck to you. You know exactly which line of code triggered the exception, but you have no idea how that unexpected key ended up there.

If instead you could read in the log:

KeyError: "John Doe"

The above exception was the direct cause of the following exception:

UserDatabaseError: failed to load user credentials for user="John Doe"

The above exception was the direct cause of the following exception:

PageRenderError: failed to render /foo/bar/baz.html?user=John%20Doe

all while retaining the full stack trace, you’d save A LOT of time reproducing and debugging the issue!
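
A minimal sketch of how such a log entry could be produced with `raise ... from ...`. The exception names come from the example above; `USERS`, `load_credentials` and `render_page` are hypothetical stand-ins, not part of any real library.

```python
import traceback


class UserDatabaseError(Exception):
    pass


class PageRenderError(Exception):
    pass


USERS = {}  # hypothetical user store; empty, so every lookup fails


def load_credentials(user):
    try:
        return USERS[user]
    except KeyError as e:
        # Add context about what was being attempted, keep the KeyError as cause.
        raise UserDatabaseError(
            f'failed to load user credentials for user="{user}"'
        ) from e


def render_page(url, user):
    try:
        return load_credentials(user)
    except UserDatabaseError as e:
        raise PageRenderError(f"failed to render {url}") from e


try:
    render_page("/foo/bar/baz.html?user=John%20Doe", "John Doe")
except PageRenderError as exc:
    # format_exception follows __cause__ and renders the whole chain.
    report = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))

print(report)
```

The formatted report lists all three exceptions, each with its own stack trace, joined by the "direct cause" separator, so a single log entry carries the full context.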

Sure, without exception chaining, you’ll see from the stack that your KeyError: "John Doe" was caused by, say, htmlrender.py. So you know it’s triggered by rendering a page. Which of the 500 pages of your site? With which rendering parameters? God knows.

A lot of software that doesn’t use exception chaining emits either the very first, nebulous error, KeyError: "John Doe", which says nothing about the context, or the very last one, PageRenderError: failed to render /foo/bar/baz.html?user=John%20Doe, which won’t have a meaningful stack trace and will likely not contain enough information to reproduce or debug the issue. The latter is, in fact, very close to the dreaded “This application has executed an invalid operation and will be terminated.”

Some software has intermediate checkpoints that emit to the log file and then re-raise, e.g.

try:
    user = UserDatabase.get_user(username)
except Exception as e:
    logger.error(f"UserDatabase.get_user({username}) failed: {e}")
    raise

which is better than nothing, but you still have to hunt through the log file. And if your application was doing more unrelated things at the same time (e.g. a web server), you’ll have to go through A LOT of noise. Exception chaining, on the other hand, will carry the whole thing to the very bottom for you to read at once.