ray: [Core] Getting/creating an actor from multiple threads raises an error

What happened + What you expected to happen

Creating/getting an actor from multiple threads like this:

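import ray
import threading
import time
import random


@ray.remote
class bar:
    pass


def foo():
    # Random delay so the threads hit get-or-create at slightly different times.
    time.sleep(random.random())
    # Every thread asks for the same named, detached actor.
    bar.options(name="bar", namespace="bar_name", get_if_exists=True, lifetime="detached").remote()


# Start 1000 threads that all race to get or create the actor.
threads = []
for i in range(1000):
    threads.append(threading.Thread(target=foo))

for thread in threads:
    thread.start()

for thread in threads:
    thread.join()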

sometimes results in an error:

Traceback (most recent call last):
  File "/Users/andrewxue/anaconda3/envs/ray/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/Users/andrewxue/anaconda3/envs/ray/lib/python3.9/threading.py", line 917, in run
    return ray.get_actor(name, namespace=namespace)
  File "/Users/andrewxue/fork/ray/python/ray/_private/auto_init_hook.py", line 24, in auto_init_wrapper
    self._target(*self._args, **self._kwargs)
  File "/Users/andrewxue/fork/ray/python/ray/data/tests/test.py", line 13, in foo
    bar.options(name="bar", namespace="bar_name", get_if_exists=True, lifetime="detached").remote()
  File "/Users/andrewxue/fork/ray/python/ray/actor.py", line 687, in remote
    return fn(*args, **kwargs)
  File "/Users/andrewxue/fork/ray/python/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/Users/andrewxue/fork/ray/python/ray/_private/worker.py", line 2845, in get_actor
    return actor_cls._remote(args=args, kwargs=kwargs, **updated_options)
  File "/Users/andrewxue/fork/ray/python/ray/_private/auto_init_hook.py", line 24, in auto_init_wrapper
    return worker.core_worker.get_named_actor_handle(name, namespace or "")
  File "python/ray/_raylet.pyx", line 4021, in ray._raylet.CoreWorker.get_named_actor_handle
    return fn(*args, **kwargs)
  File "/Users/andrewxue/fork/ray/python/ray/util/tracing/tracing_helper.py", line 388, in _invocation_actor_class_remote_span
    return method(self, args, kwargs, *_args, **_kwargs)
  File "/Users/andrewxue/fork/ray/python/ray/actor.py", line 781, in _remote
    return ray.get_actor(name, namespace=namespace)
  File "/Users/andrewxue/fork/ray/python/ray/_private/auto_init_hook.py", line 24, in auto_init_wrapper
  File "python/ray/_raylet.pyx", line 453, in ray._raylet.check_status
    return fn(*args, **kwargs)
  File "/Users/andrewxue/fork/ray/python/ray/_private/client_mode_hook.py", line 103, in wrapper
ValueError: Failed to look up actor with name 'bar'. This could because 1. You are trying to look up a named actor you didn't create. 2. The named actor died. 3. You did not use a namespace matching the namespace of the actor.

Versions / Dependencies

latest master

Reproduction script

script given above

Issue Severity

Medium: It is a significant difficulty but I can work around it.
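
The report does not say which workaround was used. One possible approach, sketched here as an assumption rather than a confirmed fix, is to retry the get-or-create call when it raises the ValueError shown in the traceback. The helper name retry_on_value_error and its parameters are hypothetical and not part of Ray. Catching ValueError is deliberately broad in this sketch and may also hide unrelated errors.

```python
import random
import time


def retry_on_value_error(fn, attempts=5, base_delay=0.05):
    """Call fn(); on ValueError, back off with jitter and try again.

    Hypothetical helper (not a Ray API). The last failure is re-raised
    once `attempts` calls have been made.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except ValueError:
            if attempt == attempts - 1:
                raise
            # Jittered exponential backoff so threads do not retry in lockstep.
            time.sleep(base_delay * (2 ** attempt) * random.random())


# Assumed usage with the reproduction script's actor class:
#   handle = retry_on_value_error(
#       lambda: bar.options(name="bar", namespace="bar_name",
#                           get_if_exists=True, lifetime="detached").remote()
#   )
```

Whether a retry actually succeeds depends on the cause of the race, which this thread does not establish, so treat this as a stopgap to experiment with.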

About this issue

  • State: closed
  • Created 7 months ago
  • Comments: 16 (15 by maintainers)

Most upvoted comments

I was able to reproduce the issue. Looking further.

Can you link the GH issue when you get a chance, @jobh?

@anyscalesam I think this is the one: https://github.com/ray-project/ray/issues/44083

Btw, it would be good to think about how to safely make these APIs thread-safe. Right now, my impression is that it just happens to work (and is prone to breaking).
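
Until the APIs themselves are thread-safe, callers can serialize the get-or-create call on their own side. The following is a minimal sketch under that assumption: get_or_create_once, _actor_lock, and _actor_cache are hypothetical names, not Ray APIs, and fake_factory stands in for the real actor creation call so the example runs without a cluster. It only coordinates threads inside one process and does nothing for races between separate processes.

```python
import threading

# Hypothetical process-local guard (not part of Ray).
_actor_lock = threading.Lock()
_actor_cache = {}


def get_or_create_once(key, factory):
    """Return the cached handle for `key`, calling factory() at most once."""
    with _actor_lock:
        if key not in _actor_cache:
            _actor_cache[key] = factory()
        return _actor_cache[key]


# Stand-in for the real call. With Ray, the factory is assumed to look like:
#   lambda: bar.options(name="bar", namespace="bar_name",
#                       get_if_exists=True, lifetime="detached").remote()
calls = []


def fake_factory():
    calls.append(1)
    return object()


def demo(num_threads=50):
    """Run many threads through the guarded helper and collect the handles."""
    results = []

    def worker():
        results.append(get_or_create_once(("bar_name", "bar"), fake_factory))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because every thread goes through the same lock, only the first one runs the factory and the rest reuse its handle, so the underlying API is never called concurrently from this process.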

@jobh Thanks for reporting. Can you open a separate issue and link this one? We can combine them when investigations provide more evidence they are the same.

Discussed internally within Core. We'll possibly plan for this in ray210.