ray: [core] "soft" option for NodeAffinitySchedulingStrategy doesn't seem to work
What happened + What you expected to happen
The problem
The following script should be able to utilize all CPU slots on a cluster, but when you run it it will only use CPUs on the head node.
This is a blocker for Dataset streaming, which uses this scheduling hint for best-effort locality optimization. Not supporting “soft” properly means locality scheduling can severely degrade performance.
Versions / Dependencies
master / py39
Reproduction script
import time
import ray
from ray.util.scheduling_strategies import NodeAffinitySchedulingStrategy
ray.init()
cur_node = ray.get_runtime_context().get_node_id()
@ray.remote
def f():
time.sleep(1)
print("done")
start = time.time()
ray.get([
f.options(scheduling_strategy=NodeAffinitySchedulingStrategy(cur_node, soft=True)).remote()
for _ in range(1000)
])
print(time.time() - start)
Reproduction workspace: https://console.anyscale-staging.com/o/anyscale-internal/workspaces/expwrk_1ncv7mqbfapj4ax5sa1679seep/ses_msttcvnbvzbgzefa2hia2kui9h?command-history-section=head_start_up_log
Issue Severity
High: It blocks me from completing my task.
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 24 (19 by maintainers)
Commits related to this issue
- [core] node affinity soft chooses another node if it is not available (#33856) Addressed https://github.com/ray-project/ray/issues/33835 to enable spreading of dataset tasks Co-authored-by: Claren... — committed to ray-project/ray by clarng a year ago
- [core] node affinity soft chooses another node if it is not available (#33856) Addressed https://github.com/ray-project/ray/issues/33835 to enable spreading of dataset tasks Co-authored-by: Clarence... — committed to joncarter1/ray by clarng a year ago
- [core] node affinity soft chooses another node if it is not available (#33856) Addressed https://github.com/ray-project/ray/issues/33835 to enable spreading of dataset tasks Co-authored-by: Clarence... — committed to elliottower/ray by clarng a year ago
- [core] node affinity soft chooses another node if it is not available (#33856) Addressed https://github.com/ray-project/ray/issues/33835 to enable spreading of dataset tasks Co-authored-by: Clarence... — committed to ProjectsByJackHe/ray by clarng a year ago
I think the ideal case would be for soft to just have one meaning, “spill on busy or failure”, but we should only do this if the implementation for “on busy” is good enough. What @jjyao said about the shuffle use case is right, but I don’t believe we necessarily need “spill on failure” semantics; I think it’s just that the performance is more predictable with this policy. I’m not sure if there are other use cases that need “spill on failure” but not “spill on busy”.
I’d say we should try out “spill on busy or failure” on the push-based shuffle release test and also add some CI tests for the behavior. If these results are good, we can change the semantics. Otherwise, we should probably just add a second flag.
If we want well defined semantics, I think we should have explicit flags about the behaviors, such as spill_on_failure=True, spill_on_busy=True. Otherwise, I’d favor soft providing best-effort scheduling, defined as “try to put it here, but if not possible, fall back to other nodes. Your code should be able to tolerate the task getting placed on other nodes”.