mani_skill.vector.wrappers.gymnasium
====================================

.. py:module:: mani_skill.vector.wrappers.gymnasium


Classes
-------

.. autoapisummary::

   mani_skill.vector.wrappers.gymnasium.ManiSkillVectorEnv


Module Contents
---------------

.. py:class:: ManiSkillVectorEnv(env, num_envs = 1, auto_reset = True, ignore_terminations = False, record_metrics = False, **kwargs)

   Bases: :py:obj:`gymnasium.vector.VectorEnv`


   Gymnasium ``VectorEnv`` implementation for ManiSkill environments running on the GPU for parallel simulation and, optionally, parallel rendering.

   :param env: The environment created via ``gym.make``, after any wrappers are applied. If a string is given, ``gym.make(env)`` is used to create the environment.
   :param num_envs: The number of parallel environments. This is only used if the ``env`` argument is a string.
   :param env_kwargs: Environment kwargs to pass to ``gym.make``. This is only used if the ``env`` argument is a string.
   :param auto_reset: Whether this wrapper auto resets the environment (following the same API/conventions as Gymnasium).
                      Default is True (recommended, as most ML/RL libraries use auto reset).
   :type auto_reset: bool
   :param ignore_terminations: Whether this wrapper ignores terminations when deciding when to auto reset. Terminations can be caused by
                               the task reaching a success or fail state as defined in a task's evaluation function. Default is False, meaning
                               episode rollouts stop early on termination. Setting this to True is generally useful when you want to model a task
                               as infinite horizon, where an episode stops only due to the time limit.
   :type ignore_terminations: bool
   :param record_metrics: If True, the returned info objects will contain the metrics ``return``, ``length``, ``success_once``, ``success_at_end``, ``fail_once``, and ``fail_at_end``.
                          Success/fail metrics are recorded only when the environment has success/fail criteria. ``success_at_end`` and ``fail_at_end`` are recorded only when ``ignore_terminations`` is True.
   :type record_metrics: bool
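
   As a rough illustration of the metrics listed under ``record_metrics``, the sketch below accumulates them for a single episode in plain Python. It is not ManiSkill's implementation: the ``EpisodeMetrics`` class and its ``update`` method are hypothetical names introduced here, and ``episode_return`` stands in for the ``return`` metric because ``return`` is a reserved word in Python.

   .. code-block:: python

      from dataclasses import dataclass


      @dataclass
      class EpisodeMetrics:
          """Hypothetical per-episode accumulator; not ManiSkill's internal class."""

          episode_return: float = 0.0
          length: int = 0
          success_once: bool = False
          fail_once: bool = False
          success_at_end: bool = False
          fail_at_end: bool = False

          def update(self, reward: float, success: bool = False, fail: bool = False) -> None:
              """Fold one step's reward and success/fail flags into the running metrics."""
              self.episode_return += reward
              self.length += 1
              # "*_once" latches to True the first time the flag is seen.
              self.success_once = self.success_once or success
              self.fail_once = self.fail_once or fail
              # "*_at_end" mirrors the most recent step, so it holds the final
              # value once the episode is over.
              self.success_at_end = success
              self.fail_at_end = fail


      metrics = EpisodeMetrics()
      # Three steps: the task succeeds on step 2 but is no longer successful on step 3.
      for reward, success in [(0.1, False), (1.0, True), (0.2, False)]:
          metrics.update(reward, success=success)

      print(metrics.length, metrics.success_once, metrics.success_at_end)  # 3 True False

   The example shows why ``success_once`` and ``success_at_end`` can disagree when an episode keeps running after a success. The sketch always tracks the ``*_at_end`` values and does not model the wrapper's rule of reporting them only when ``ignore_terminations`` is True.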


   .. py:method:: call(name, *args, **kwargs)


   .. py:method:: close_extras(**kwargs)

      Clean up any extra resources beyond those handled by the base class.


   .. py:method:: get_attr(name)


   .. py:method:: render()

      Returns the rendered frames from the parallel environments.

      :returns: A tuple of rendered frames from the parallel environments


   .. py:method:: reset(*, seed = None, options = None)

      Reset all parallel environments and return a batch of initial observations and info.

      :param seed: The environment reset seed
      :param options: Optional dictionary of reset options

      :returns: A batch of observations and info from the vectorized environment.

      .. rubric:: Example

      >>> import gymnasium as gym
      >>> envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="sync")
      >>> observations, infos = envs.reset(seed=42)
      >>> observations
      array([[ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ],
             [ 0.01522993, -0.04562247, -0.04799704,  0.03392126],
             [-0.03774345, -0.02418869, -0.00942293,  0.0469184 ]],
            dtype=float32)
      >>> infos
      {}


   .. py:method:: step(actions)

      Take an action for each parallel environment.

      :param actions: Batch of actions with the :attr:`action_space` shape.

      :returns: Batch of (observations, rewards, terminations, truncations, infos)

      .. note::

         As the vector environment autoresets terminating and truncating sub-environments, the reset occurs on
         the next step after `terminated or truncated is True`.
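
      How the ``ignore_terminations`` setting changes which sub-environments count as finished can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the wrapper's actual code: ``envs_to_reset`` is a hypothetical helper, and Python lists stand in for the batched arrays the wrapper works with.

      .. code-block:: python

         from typing import List


         def envs_to_reset(
             terminations: List[bool],
             truncations: List[bool],
             ignore_terminations: bool = False,
         ) -> List[int]:
             """Return indices of sub-environments whose episode counts as finished."""
             indices = []
             for i, (terminated, truncated) in enumerate(zip(terminations, truncations)):
                 # With ignore_terminations=True only the time limit ends an episode.
                 done = truncated if ignore_terminations else (terminated or truncated)
                 if done:
                     indices.append(i)
             return indices


         terminations = [True, False, False]
         truncations = [False, False, True]
         print(envs_to_reset(terminations, truncations))  # [0, 2]
         print(envs_to_reset(terminations, truncations, ignore_terminations=True))  # [2]

      With the default setting, sub-environment 0 is flagged because it terminated. With ``ignore_terminations=True``, only sub-environment 2, which hit the time limit, is flagged.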

      .. rubric:: Example

      >>> import gymnasium as gym
      >>> import numpy as np
      >>> envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="sync")
      >>> _ = envs.reset(seed=42)
      >>> actions = np.array([1, 0, 1], dtype=np.int32)
      >>> observations, rewards, terminations, truncations, infos = envs.step(actions)
      >>> observations
      array([[ 0.02727336,  0.18847767,  0.03625453, -0.26141977],
             [ 0.01431748, -0.24002443, -0.04731862,  0.3110827 ],
             [-0.03822722,  0.1710671 , -0.00848456, -0.2487226 ]],
            dtype=float32)
      >>> rewards
      array([1., 1., 1.])
      >>> terminations
      array([False, False, False])
      >>> truncations
      array([False, False, False])
      >>> infos
      {}


   .. py:attribute:: auto_reset
      :value: True


   .. py:property:: base_env
      :type: mani_skill.envs.sapien_env.BaseEnv


   .. py:property:: device


   .. py:attribute:: ignore_terminations
      :value: False


   .. py:attribute:: num_envs
      :value: 1


   .. py:attribute:: record_metrics
      :value: False


   .. py:attribute:: spec


   .. py:property:: unwrapped

      Return the base environment.