ray: [Ray Serve] Unable to connect to GCS with ray start --head, but works from inside python

What happened + What you expected to happen

When I follow this tutorial for deploying on a single node and start a Ray head node using ray start --head, it fails to start (see the error below).

However, when I start a server from inside a Python script, it works as expected (see below). I want to use the former approach so that I can make use of Serve's ability to dynamically update running deployments.

Versions / Dependencies

ray, version 1.12.1
Redis server v=6.0.15 sha=00000000:0 malloc=jemalloc-5.2.1 bits=64 build=d583da279d383435

Reproduction script

ray start --head

Observe the following

2022-05-18 10:13:12,091 WARNING utils.py:1254 -- Unable to connect to GCS at 10.0.0.105:6379. Check that (1) Ray GCS with matching version started successfully at the specified address, and (2) there is no firewall setting preventing access.

However, this works:

import ray
from ray import serve

serve.start()

while True:
  pass

Observe:

2022-05-18 10:34:37,061	INFO services.py:1456 -- View the Ray dashboard at http://127.0.0.1:8265
(ServeController pid=10417) 2022-05-18 10:34:40,010	INFO checkpoint_path.py:15 -- Using RayInternalKVStore for controller checkpoint and recovery.
(ServeController pid=10417) 2022-05-18 10:34:40,118	INFO http_state.py:106 -- Starting HTTP proxy with name 'SERVE_CONTROLLER_ACTOR:yZdKhI:SERVE_PROXY_ACTOR-node:10.0.0.105-0' on node 'node:10.0.0.105-0' listening on '127.0.0.1:8000'
2022-05-18 10:34:40,986	INFO api.py:794 -- Started Serve instance in namespace 'serve'.

Issue Severity

High: It blocks me from completing my task.

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 21 (11 by maintainers)

Most upvoted comments

Great that it works now!

There were no actors:

$ ray stop
Did not find any active Ray processes.

What worked was stopping the redis service:

$ sudo service redis-server stop

Then ray start --head worked.
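
The fix above suggests that the system redis-server was already listening on port 6379, the same port that appears in the GCS warning. Here is a minimal sketch, assuming that a port conflict is the cause, for checking whether something is already accepting TCP connections on that port before running ray start --head. The helper name port_in_use is hypothetical and is not part of Ray; the address and port are taken from the warning in the report.

```python
import socket


def port_in_use(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper for illustration; not part of Ray.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Address and port copied from the "Unable to connect to GCS" warning.
    host, port = "10.0.0.105", 6379
    if port_in_use(host, port):
        print(f"{host}:{port} is already taken; another service may be using it.")
    else:
        print(f"{host}:{port} looks free.")
```

If the port turns out to be taken, stopping the other service (as in the comment above) is one option. Another option, which was not tried in this thread, would be to pass a different port to the head node with ray start --head --port=<port>.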