Gymnasium: [Bug Report] max_episode_steps is not passed to the env's spec attribute anymore
Describe the bug
In previous versions of gym, an env registered with max_episode_steps=N would have its env.spec.max_episode_steps reflect this value.
Now this attribute is automatically set to None even if the env is explicitly registered with this value.
Would it make sense to keep the value from the registration in the env spec, or set it to None only if max_episode_steps is passed when make is called, i.e.:
# max_episode_steps is proper to the env
register(envname0, max_episode_steps=N)
make(envname0) # env.unwrapped.spec.max_episode_steps == N
# max_episode_steps is just there to tell us to wrap it in a TimeLimit
register(envname1, max_episode_steps=None)
make(envname0, max_episode_steps=None) # env.unwrapped.spec.max_episode_steps == None
Otherwise, it’s hard for us to know what the env horizon is (we don’t need a TimeLimit; the env terminates at max_episode_steps regardless of that).
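One possible reading of the rule proposed above, written as a toy model rather than Gymnasium's actual registry code: the spec keeps the registered horizon unless make is called with an explicit max_episode_steps, in which case the spec reports None. ToySpec, spec_after_make, and the _UNSET sentinel are hypothetical names introduced only for this sketch.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Sentinel to tell "argument not passed" apart from an explicit None.
_UNSET = object()


@dataclass(frozen=True)
class ToySpec:
    """Hypothetical stand-in for an env spec; not Gymnasium's EnvSpec."""
    id: str
    max_episode_steps: Optional[int] = None


def spec_after_make(registered: ToySpec, max_episode_steps=_UNSET) -> ToySpec:
    """Return the spec an env would carry after make(), under the proposed rule."""
    if max_episode_steps is _UNSET:
        # Nothing passed to make(): the registered horizon is intrinsic, keep it.
        return registered
    # A value passed to make() only requests a time-limit wrapper,
    # so the spec itself reports no intrinsic horizon.
    return replace(registered, max_episode_steps=None)


intrinsic = spec_after_make(ToySpec("envname0", max_episode_steps=100))
wrapped_only = spec_after_make(ToySpec("envname1"), max_episode_steps=100)
```

The sentinel matters because None is itself a meaningful value here, so a default of None could not distinguish "not passed" from "explicitly no limit".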
Happy to make a PR to solve this issue
cc @vikashplus
Code example
No response
System info
No response
Additional context
No response
Checklist
- I have checked that there is no similar issue in the repo
About this issue
- State: closed
- Created 6 months ago
- Comments: 17 (12 by maintainers)
I have been dealing with the same bug for the past week, across different envs.
I guess the question is: if it’s in the spec during registration, why isn’t it in the unwrapped env.spec? I get why it can’t be found if it’s passed during a call to gym.make, but if you pass it during register, to me you’re saying that this is an intrinsic attribute of your env, and hence something that you may want to access at runtime. We’ve found a convoluted way of recovering that info, but I think the semantics here aren’t super clear for users (especially given that it used to work and was pretty convenient, and many of us would think this feature made sense).
I’m dealing with the same issue. Here is a small example
The situation gets worse when an env has multiple wrappers. The env fails to access its own horizon via self.spec.max_episode_steps. All of the following (ee.env.spec.max_episode_steps, ee.env.env.env.spec.max_episode_steps, ee.env.env.env.unwrapped.spec.max_episode_steps) end up returning None.
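As a rough illustration of the kind of workaround mentioned in this thread, the sketch below walks a wrapper chain through each layer's env attribute and returns the first horizon it finds. It assumes the time-limit wrapper stores its limit in a private _max_episode_steps attribute, which may not hold in every gym or Gymnasium version. find_horizon is a hypothetical helper, and the SimpleNamespace objects are stand-ins for real envs so that the snippet runs without gym installed.

```python
from types import SimpleNamespace
from typing import Any, Optional


def find_horizon(env: Any) -> Optional[int]:
    """Walk env -> env.env -> ... and return the first horizon found, else None."""
    current = env
    seen = set()  # guards against accidental cycles in the chain
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        # Assumption: a time-limit wrapper keeps its limit in `_max_episode_steps`.
        limit = getattr(current, "_max_episode_steps", None)
        if limit is not None:
            return limit
        # Otherwise fall back to whatever this layer's spec reports.
        spec = getattr(current, "spec", None)
        limit = getattr(spec, "max_episode_steps", None)
        if limit is not None:
            return limit
        # Wrappers expose the wrapped env as `.env`; a base env has none.
        current = getattr(current, "env", None)
    return None


# Stand-in objects mimicking the situation described: every spec reports None.
empty_spec = SimpleNamespace(max_episode_steps=None)
base_env = SimpleNamespace(spec=empty_spec)
time_limit = SimpleNamespace(env=base_env, spec=empty_spec, _max_episode_steps=200)
outer = SimpleNamespace(env=time_limit, spec=empty_spec)

horizon = find_horizon(outer)
```

This only recovers a limit imposed by a wrapper. If the env terminates on its own at a fixed step count and no wrapper is applied, nothing in the chain carries the value, which is the gap this issue describes.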