ray: [Core] log_to_driver=False does not suppress worker errors in ipython

What happened + What you expected to happen

When a Ray task raises an error in an ipython session, an “Unhandled error” message appears in my session output after a short delay (about 1 second). Initializing Ray with log_to_driver=False does not suppress it. Setting the environment variable RAY_IGNORE_UNHANDLED_ERRORS=1 does suppress the message, but I think log_to_driver=False should suppress all Ray task errors as well. I know it already suppresses warnings.

Error message:

2022-08-31 14:39:17,467	ERROR worker.py:399 -- Unhandled error (suppress with 'RAY_IGNORE_UNHANDLED_ERRORS=1'): ray::raise_error() (pid=55178, ip=127.0.0.1)
  File "<ipython-input-1-4cb397045297>", line 8, in raise_error
RuntimeError

This error is relevant to Modin, which saves object IDs from remote function calls. Modin users in ipython get spammed with Ray task errors in addition to the nicer exception raised in the main thread when Modin finally calls ray.get on the object ID of the task that raised the error. It would be excellent for the Modin user experience if there were a way to suppress these messages. RAY_IGNORE_UNHANDLED_ERRORS=1 is not an option for us because we shouldn’t update the user’s environment. Suppressing the errors with log_to_driver=False would be great.

Versions / Dependencies

ray 2.0.0, macOS Monterey 12.4, python 3.10.4, ipython 8.4.0

Reproduction script

import ray
import time

ray.init(log_to_driver=False)

@ray.remote
def raise_error():
  raise RuntimeError()

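# Hold the object ref without calling ray.get on it, then wait.
# In ipython, the "Unhandled error" message shows up during the sleep.
oid = raise_error.remote()
time.sleep(5)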

Issue Severity

Low: It annoys or frustrates me.

About this issue

  • State: open
  • Created 2 years ago
  • Comments: 16 (10 by maintainers)

Most upvoted comments

Case 1: an exception object is garbage-collected before it is accessed or handled. This counts as an unhandled error, so the error message appears in both interactive and script mode. (@scv119 this is the case you described.)

Case 2: an exception object is accessed or handled within 5s of its creation. The error message does not appear in either mode.

Case 3: an exception object is accessed or handled more than 5s after its creation. Interactive mode shows the error message but script mode does not. (This is the case @mvashishtha encountered, and we should see how to fix it.)
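
To make the three cases easier to compare, here is a toy model of them in plain Python. It is only a sketch of the behavior described above, not Ray's implementation: the function name, its parameters, and the constant are hypothetical, and treating an access at exactly 5s as "within 5s" is an assumption.

```python
# Toy model of the three cases above. This does not call Ray; it only
# encodes the described outcomes so they can be checked side by side.

# Threshold quoted in the cases above (assumed to be a fixed 5 seconds).
UNHANDLED_ERROR_DELAY_S = 5.0


def shows_unhandled_error(
    gc_before_access: bool,
    access_delay_s: float,
    interactive: bool,
) -> bool:
    """Return True if the "Unhandled error" message would be shown.

    gc_before_access: the exception object is garbage-collected before
        anything accesses or handles it (Case 1).
    access_delay_s: seconds between the exception object's creation and
        the first access (ignored when gc_before_access is True).
    interactive: True for an interactive session such as ipython,
        False for a plain script.
    """
    # Case 1: never handled, so the message appears in both modes.
    if gc_before_access:
        return True
    # Case 2: handled within the delay, so no message in either mode.
    # (The boundary at exactly 5s is an assumption of this sketch.)
    if access_delay_s <= UNHANDLED_ERROR_DELAY_S:
        return False
    # Case 3: handled late, so only interactive mode shows the message.
    return interactive


if __name__ == "__main__":
    for mode in (True, False):
        label = "interactive" if mode else "script"
        print(label, "case 1:", shows_unhandled_error(True, 0.0, mode))
        print(label, "case 2:", shows_unhandled_error(False, 1.0, mode))
        print(label, "case 3:", shows_unhandled_error(False, 10.0, mode))
```

In this model only Case 3 depends on the mode, which is the inconsistency this issue is about.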