ray: [Bug][RLlib] Gym environment registration does not work when using Ray Client and ray.init

Search before asking

  • I searched the issues and found no similar issues.

Ray Component

RLlib

What happened + What you expected to happen

When using RLlib with Ray Client, an error (see below) is raised when connecting via ray.init("ray://127.0.0.1:10001"), whereas everything works when the address is supplied via export RAY_ADDRESS="ray://127.0.0.1:10001" instead (both connection styles are sketched after the summary list below).

In particular, the error only occurs when the environment is specified via a default gym-registered string. With a custom registration via register_env, the code runs as expected.

So:

  • gym-string + ray.init -> error
  • gym-string + RAY_ADDRESS -> works
  • self-registration + ray.init -> works
  • self-registration + RAY_ADDRESS -> works
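
For reference, a minimal sketch of the two connection styles being compared (run one style at a time; ray.init() can only be called once per process):

import os
import ray

# Style 1: pass the Ray Client address directly to ray.init()
# (this is the combination that fails with built-in gym strings):
ray.init("ray://127.0.0.1:10001")

# Style 2: provide the address via the RAY_ADDRESS environment
# variable, which ray.init() picks up when called without an address
# argument (equivalent to `export RAY_ADDRESS="ray://127.0.0.1:10001"`
# in the shell before starting the script); this combination works:
os.environ["RAY_ADDRESS"] = "ray://127.0.0.1:10001"
ray.init()

The output and traceback from the failing combination: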
2022-01-20 03:24:32,339 INFO trainer.py:2054 -- Your framework setting is 'tf', meaning you are using static-graph mode. Set framework='tf2' to enable eager execution with tf2.x. You may also then want to set eager_tracing=True in order to reach similar execution speed as with static-graph mode.
Traceback (most recent call last):
  File "rllib4.py", line 28, in <module>
    trainer = PPOTrainer(config=config)
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 728, in __init__
    super().__init__(config, logger_creator, remote_checkpoint_dir,
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/tune/trainable.py", line 122, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 754, in setup
    self.env_creator = _global_registry.get(ENV_CREATOR, env)
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/tune/registry.py", line 168, in get
    return pickle.loads(value)
EOFError: Ran out of input
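
The lookup fails inside Tune's global registry: EOFError: Ran out of input is what pickle.loads raises on empty input, suggesting the registry entry resolved through Ray Client is empty for the built-in gym string. As the matrix above shows, explicitly registering the environment avoids this. A minimal sketch of that workaround, using gym.make rather than the direct CartPoleEnv import from the reproduction script:

import gym
from ray.tune.registry import register_env

# Register the built-in gym environment under a custom name so the
# creator function is pickled into the Tune registry and can be
# resolved on the client side:
register_env("cartpole_v1", lambda env_config: gym.make("CartPole-v1"))

# Then point the trainer config at the registered name:
# config = {"env": "cartpole_v1", ...}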

Versions / Dependencies

Ray 1.10.0-py38 Docker image with TensorFlow installed.

>>> ray.__commit__
'1583379dce891e96e9721bb958e80d485753aed7'
>>> ray.__version__
'1.10.0'

Reproduction script

import ray

ray.init("ray://127.0.0.1:10001")  # Comment out (or use RAY_ADDRESS instead) to make this work.

# Import the RL algorithm (Trainer) we would like to use.
from ray.rllib.agents.ppo import PPOTrainer
from ray.tune.registry import register_env
from gym.envs.classic_control.cartpole import CartPoleEnv

def env_creator(config):
    return CartPoleEnv()

register_env("my_env", env_creator)


# Configure the algorithm.
config = {
    # Environment (RLlib understands OpenAI gym registered strings).
    "env": "CartPole-v1",  # <-- Fails
    # "env": "my_env",  # <-- Works
    "num_workers": 2,
    "framework": "tf",
}

trainer = PPOTrainer(config=config)
for _ in range(3):
    print(trainer.train())
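
The script assumes a cluster whose head node is reachable at 127.0.0.1, with the Ray Client server on its default port 10001, e.g. one started locally with:

ray start --head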


Anything else

This happens consistently, not intermittently.

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 17 (17 by maintainers)

Most upvoted comments

This is a P0 issue from our side. CC @ericl