mani_skill.vector.wrappers.gymnasium#
Classes#
ManiSkillVectorEnv: Gymnasium Vector Env implementation for ManiSkill environments running on the GPU for parallel simulation and optionally parallel rendering
Module Contents#
- class mani_skill.vector.wrappers.gymnasium.ManiSkillVectorEnv(env, num_envs=1, auto_reset=True, ignore_terminations=False, record_metrics=False, **kwargs)[source]#
Bases: gymnasium.vector.VectorEnv
Gymnasium Vector Env implementation for ManiSkill environments running on the GPU for parallel simulation and optionally parallel rendering
- Parameters:
env (Union[gymnasium.Env, str]) – The environment created via gym.make, after any wrappers are applied. If a string is given, gym.make(env) is used to create the environment.
num_envs (int) – The number of parallel environments. This is only used if the env argument is a string
env_kwargs – Environment kwargs to pass to gym.make. This is only used if the env argument is a string
auto_reset (bool) – Whether this wrapper will auto reset the environment (following the same API/conventions as Gymnasium). Default is True (recommended as most ML/RL libraries use auto reset)
ignore_terminations (bool) – Whether this wrapper ignores terminations when deciding when to auto reset. Terminations can be caused by the task reaching a success or fail state as defined in a task’s evaluation function. Default is False, meaning episode rollouts stop early on termination. Setting this to True is generally useful when you want to model a task as infinite horizon, where an episode stops only because of the time limit.
record_metrics (bool) – If True, the returned info objects will contain the metrics: return, length, success_once, success_at_end, fail_once, fail_at_end. success/fail metrics are recorded only when the environment has success/fail criteria. success/fail_at_end are recorded only when ignore_terminations is True.
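The metrics named under record_metrics can be illustrated with a small, self-contained sketch for a single episode. This is not the wrapper's implementation: the EpisodeMetrics class and summarize helper are hypothetical names, and the sketch assumes that success_once means "success was reported at some step of the episode" and success_at_end means "success was reported at the final step".

```python
from dataclasses import dataclass


@dataclass
class EpisodeMetrics:
    """Hypothetical container mirroring the metric names listed above."""
    episode_return: float = 0.0   # sum of rewards over the episode
    length: int = 0               # number of steps taken
    success_once: bool = False    # success reported at any step (assumed meaning)
    success_at_end: bool = False  # success reported at the last step (assumed meaning)

    def update(self, reward: float, success: bool) -> None:
        # Accumulate one step of the episode.
        self.episode_return += reward
        self.length += 1
        self.success_once = self.success_once or success
        self.success_at_end = success


def summarize(rewards, successes) -> EpisodeMetrics:
    """Fold per-step rewards and success flags into episode metrics."""
    metrics = EpisodeMetrics()
    for reward, success in zip(rewards, successes):
        metrics.update(reward, success)
    return metrics


if __name__ == "__main__":
    print(summarize([0.5, 1.0, 0.25], [False, True, False]))
```

In this example the task succeeds at the second step but not at the last one, so the two success metrics differ. The fail_once and fail_at_end metrics would follow the same pattern with a fail flag.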
- render()[source]#
Returns the rendered frames from the parallel environments.
- Returns:
A tuple of rendered frames from the parallel environments
- reset(*, seed=None, options=None)[source]#
Reset all parallel environments and return a batch of initial observations and info.
- Parameters:
seed (Optional[Union[int, list[int]]]) – The environment reset seed
options (Optional[dict]) – Optional dictionary of reset options, if any
- Returns:
A batch of observations and info from the vectorized environment.
Example
>>> import gymnasium as gym
>>> envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="sync")
>>> observations, infos = envs.reset(seed=42)
>>> observations
array([[ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ],
       [ 0.01522993, -0.04562247, -0.04799704,  0.03392126],
       [-0.03774345, -0.02418869, -0.00942293,  0.0469184 ]],
      dtype=float32)
>>> infos
{}
- step(actions)[source]#
Take an action for each parallel environment.
- Parameters:
actions (Union[mani_skill.utils.structs.types.Array, dict]) – Batch of actions with the action_space shape.
- Returns:
Batch of (observations, rewards, terminations, truncations, infos)
- Return type:
Tuple[mani_skill.utils.structs.types.Array, mani_skill.utils.structs.types.Array, mani_skill.utils.structs.types.Array, mani_skill.utils.structs.types.Array, dict]
Note
As the vector environment autoresets terminating and truncating sub-environments, the reset occurs on the next step after terminated or truncated is True.
Example
>>> import gymnasium as gym
>>> import numpy as np
>>> envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="sync")
>>> _ = envs.reset(seed=42)
>>> actions = np.array([1, 0, 1], dtype=np.int32)
>>> observations, rewards, terminations, truncations, infos = envs.step(actions)
>>> observations
array([[ 0.02727336,  0.18847767,  0.03625453, -0.26141977],
       [ 0.01431748, -0.24002443, -0.04731862,  0.3110827 ],
       [-0.03822722,  0.1710671 , -0.00848456, -0.2487226 ]],
      dtype=float32)
>>> rewards
array([1., 1., 1.])
>>> terminations
array([False, False, False])
>>> truncations
array([False, False, False])
>>> infos
{}
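How auto_reset and ignore_terminations combine to select which sub-environments are reset can be sketched in plain Python. This is an illustration of the documented parameter semantics only, not the wrapper's code: reset_mask is a hypothetical helper, and the sketch says nothing about when the reset is actually carried out.

```python
def reset_mask(terminations, truncations, auto_reset=True, ignore_terminations=False):
    """Return one boolean per sub-environment: True if it should be auto reset.

    Assumed semantics, taken from the parameter descriptions:
    - auto_reset=False: nothing is reset automatically.
    - ignore_terminations=True: only truncations (e.g. the time limit) count.
    - otherwise: a termination or a truncation triggers a reset.
    """
    if not auto_reset:
        return [False] * len(terminations)
    if ignore_terminations:
        return [bool(trunc) for trunc in truncations]
    return [bool(term or trunc) for term, trunc in zip(terminations, truncations)]


if __name__ == "__main__":
    terminations = [True, False, False]
    truncations = [False, True, False]
    print(reset_mask(terminations, truncations))
    print(reset_mask(terminations, truncations, ignore_terminations=True))
```

With ignore_terminations=True the first sub-environment keeps running even though its task reported a terminal state, which matches the infinite-horizon use case described for that parameter.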
- property base_env: mani_skill.envs.sapien_env.BaseEnv[source]#
- Return type: