ray: [RLlib] Using Dict state space throws an exception (not supported yet?)
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
- Ray installed from (source or binary): Source
- Ray version: 0.5.3
- Python version: 3.6.6
- Exact command to reproduce:
Describe the problem
I am trying to use a Dict observation (state) space with PPO, but it throws an exception. Details below.
Source code / logs
# (imports and constants such as img_height, img_width, SERVER_ADDRESS, and
# SERVER_PORT are defined earlier in the full script)
class MyServing(ServingEnv):
    def __init__(self):
        ServingEnv.__init__(
            self, spaces.Box(-1.0, 1.0, (1,), dtype=np.float32),
            spaces.Dict({
                "image": spaces.Box(0.0, 1.0, (img_height, img_width, 3),
                                    dtype=np.float32),
                "speed": spaces.Box(0.0, 1.0, (1,), dtype=np.float32)}))

    def run(self):
        print("Starting policy server at {}:{}".format(SERVER_ADDRESS,
                                                       SERVER_PORT))
        server = PolicyServer(self, SERVER_ADDRESS, SERVER_PORT)
        server.serve_forever()


if __name__ == "__main__":
    register_my_model()
    ray.init(num_gpus=1)
    register_env("srv", lambda _: MyServing())

    # The original serving example uses DQN, but here we configure PPO.
    ppo = PPOAgent(
        env="srv",
        config={
            # Use a single process to avoid needing to set up a load balancer
            "num_workers": 0,
            "num_gpus": 1,
            "batch_mode": "complete_episodes",
            "train_batch_size": 2000,
            "model": {
                "custom_model": "mymodel"
            },
            # Leftover settings from the DQN serving example, kept commented out
            #"exploration_fraction": 0.01,
            #"learning_starts": 100,
            #"timesteps_per_iteration": 200,
            #"schedule_max_timesteps": 100000,
            #"gamma": 0.8,
            "tf_session_args": {
                "gpu_options": {"allow_growth": True},
            },
        })
File "/RL/ray-master/ray/python/ray/rllib/my_scripts/ppo/udacity_server_ppo.py", line 142, in <module>
"gpu_options": {"allow_growth": True},
File "/RL/ray-master/ray/python/ray/rllib/agents/agent.py", line 216, in __init__
Trainable.__init__(self, config, logger_creator)
File "/RL/ray-master/ray/python/ray/tune/trainable.py", line 86, in __init__
self._setup()
File "/RL/ray-master/ray/python/ray/rllib/agents/agent.py", line 258, in _setup
self._init()
File "/RL/ray-master/ray/python/ray/rllib/agents/ppo/ppo.py", line 85, in _init
self.env_creator, self._policy_graph)
File "/RL/ray-master/ray/python/ray/rllib/agents/agent.py", line 131, in make_local_evaluator
"inter_op_parallelism_threads": None,
File "/RL/ray-master/ray/python/ray/rllib/agents/agent.py", line 171, in _make_evaluator
monitor_path=self.logdir if config["monitor"] else None)
File "/RL/ray-master/ray/python/ray/rllib/evaluation/policy_evaluator.py", line 228, in __init__
policy_dict, policy_config)
File "/RL/ray-master/ray/python/ray/rllib/evaluation/policy_evaluator.py", line 286, in _build_policy_map
policy_map[name] = cls(obs_space, act_space, merged_conf)
File "/RL/ray-master/ray/python/ray/rllib/agents/ppo/ppo_policy_graph.py", line 123, in __init__
shape=(None, ) + observation_space.shape)
TypeError: can only concatenate tuple (not "NoneType") to tuple
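For context on the TypeError: a gym Dict space has no flat shape, so observation_space.shape is None and the (None, ) + observation_space.shape concatenation in ppo_policy_graph.py cannot work. A minimal standalone check (image dimensions are placeholders, not from the report):

import numpy as np
from gym import spaces

obs_space = spaces.Dict({
    "image": spaces.Box(0.0, 1.0, (84, 84, 3), dtype=np.float32),
    "speed": spaces.Box(0.0, 1.0, (1,), dtype=np.float32),
})
print(obs_space.shape)     # None for Dict spaces
(None,) + obs_space.shape  # raises: TypeError: can only concatenate tuple (not "NoneType") to tuple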
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Reactions: 2
- Comments: 15 (1 by maintainers)
@ericl I can confirm that the Dict state space works now. Thanks for the fix.