mani_skill.envs.sapien_env#
Classes#
BaseEnv | Superclass for ManiSkill environments.
Module Contents#
- class mani_skill.envs.sapien_env.BaseEnv(num_envs=1, obs_mode=None, reward_mode=None, control_mode=None, render_mode=None, shader_dir=None, enable_shadow=False, sensor_configs=dict(), human_render_camera_configs=dict(), viewer_camera_configs=dict(), robot_uids=None, sim_config=dict(), reconfiguration_freq=None, sim_backend='auto', render_backend='gpu', parallel_in_single_scene=False, enhanced_determinism=False)[source]#
Bases:
gymnasium.Env
Superclass for ManiSkill environments.
- Parameters:
num_envs (int) – number of parallel environments to run. By default this is 1, which means a CPU simulation is used. If greater than 1, then we initialize the GPU simulation setup. Note that not all environments are faster when simulated on the GPU due to limitations of GPU simulations. For example, environments with many moving objects are better simulated by parallelizing across CPUs.
obs_mode (Optional[str]) – observation mode to be used. Must be one of (“state”, “state_dict”, “none”, “sensor_data”, “rgb”, “depth”, “segmentation”, “rgbd”, “rgb+depth”, “rgb+depth+segmentation”, “rgb+segmentation”, “depth+segmentation”, “pointcloud”) The obs_mode is mostly for convenience to automatically optimize/setup all sensors/cameras for the given observation mode to render the correct data and try to ignore unnecessary rendering. For the most advanced use cases (e.g. you have 1 RGB only camera and 1 depth only camera)
reward_mode (Optional[str]) – reward mode to use. Must be one of (“normalized_dense”, “dense”, “sparse”, “none”). With “none” the reward returned is always 0
control_mode (Optional[str]) – control mode of the agent. “*” represents all registered controllers, and the action space will be a dict.
render_mode (Optional[str]) – render mode registered in @SUPPORTED_RENDER_MODES.
shader_dir (Optional[str]) –
shader directory. Defaults to None. Setting this will override the shader used for all cameras in the environment. This is legacy behavior kept for backwards compatibility. The proper way to change the shaders used for cameras is to either change the environment code or pass in sensor_configs/human_render_camera_configs with the desired shaders.
Previously the options were “default”, “rt”, “rt-fast”. “rt” means ray-tracing, which results in more photorealistic renders but is slow; “rt-fast” is a lower quality but faster version of “rt”.
enable_shadow (bool) – whether to enable shadow for lights. Defaults to False.
sensor_configs (dict) – configurations of sensors to override any environment defaults. If the key is one of the sensor names (e.g. a camera), the config value will be applied to the corresponding sensor. Otherwise, the value will be applied to all sensors (but overridden by sensor-specific values). For possible configurations see the sensors documentation.
human_render_camera_configs (dict) – configurations of human rendering cameras to override any environment defaults. Similar usage as @sensor_configs.
viewer_camera_configs (dict) – configurations of the viewer camera in the GUI to override any environment defaults. Similar usage as @sensor_configs.
robot_uids (Union[str, BaseAgent, list[Union[str, BaseAgent]]]) – list of robots to instantiate and control in the environment.
sim_config (Union[SimConfig, dict]) – Configurations for simulation, if used, that override the environment defaults. If given a dictionary, it can override just specific attributes, e.g.
sim_config=dict(scene_config=dict(solver_iterations=25)). If a SimConfig object is passed in, it is typed but will override every attribute including the task defaults. Some environments define their own recommended default sim configurations via the self._default_sim_config attribute, which generally should not be completely overridden.
reconfiguration_freq (int) – How frequently to call reconfigure when the environment is reset via self.reset(…). Generally, for most users who are not building tasks, this does not need to be changed. The default is 0, which means the environment reconfigures upon creation, and never again.
sim_backend (str) – By default this is “auto”. If sim_backend is “auto”, then if num_envs == 1, we use the PhysX CPU sim backend, otherwise we use the PhysX GPU sim backend and automatically pick a GPU to use. Can also be “physx_cpu” or “physx_cuda” to force usage of a particular sim backend. To select a particular GPU to run the simulation on, you can pass “physx_cuda:n” where n is the ID of the GPU, similar to the way PyTorch selects GPUs. Note that if this is “physx_cpu”, num_envs can only be equal to 1.
render_backend (str) –
By default this is “gpu”. If render_backend is “gpu” or its alias “sapien_cuda”, then we auto select a GPU to render with. It can be “sapien_cuda:n” where n is the ID of the GPU to render with. If this is “cpu” or “sapien_cpu”, then we try to render on the CPU. If this is “none” or None, then we disable rendering.
Note that some environments may require rendering functionalities to work. Moreover, it is sometimes difficult to determine before running an environment whether your machine can render or not. If you encounter an issue with rendering, first double check that your NVIDIA drivers / Vulkan drivers are set up correctly. If you don’t need rendering, you can simply disable it by setting render_backend to “none” or None.
parallel_in_single_scene (bool) – By default this is False. If True, rendered images and the GUI will show all objects in one view. This is only really useful for generating cool videos showing all environments at once but it is not recommended otherwise as it slows down simulation and rendering.
enhanced_determinism (bool) – By default this is False and env resets will reset the episode RNG only when a seed / seed list is given. If True, the environment will reset the episode RNG upon each reset regardless of whether a seed is provided. Generally enhanced_determinism is not needed and users are recommended to pass seeds into the env reset function instead.
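The sim_backend rules described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not the library's own code: the helper name resolve_sim_backend and its return shape are hypothetical.

```python
from typing import Optional, Tuple


def resolve_sim_backend(num_envs: int, sim_backend: str = "auto") -> Tuple[str, Optional[int]]:
    """Hypothetical helper: return (backend_name, gpu_id) following the rules above.

    gpu_id is None when no explicit GPU was requested.
    """
    if sim_backend == "auto":
        # One environment -> CPU simulation; more than one -> GPU simulation.
        return ("physx_cpu", None) if num_envs == 1 else ("physx_cuda", None)

    # Split an optional ":n" GPU suffix, e.g. "physx_cuda:1".
    name, _, suffix = sim_backend.partition(":")
    if name == "physx_cpu":
        if suffix:
            raise ValueError("physx_cpu does not take a GPU ID")
        if num_envs != 1:
            raise ValueError("physx_cpu only supports num_envs == 1")
        return ("physx_cpu", None)
    if name == "physx_cuda":
        return ("physx_cuda", int(suffix) if suffix else None)
    raise ValueError(f"unknown sim_backend: {sim_backend!r}")
```

For example, resolve_sim_backend(16) selects the GPU backend without pinning a device, while resolve_sim_backend(4, "physx_cuda:1") pins GPU 1.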
- _after_control_step()[source]#
Code that runs after each action has been taken. On GPU simulation this is called right before observations are fetched from the GPU buffers.
- _after_reconfigure(options)[source]#
Add code here that should run immediately after self._reconfigure is called. The torch RNG context is still active so RNG is still seeded here by self._episode_seed. This is useful if you need to run something that only happens after reconfiguration but needs the GPU initialized so that you can check e.g. collisions, poses etc.
- _before_control_step()[source]#
Code that runs before each action has been taken via env.step(action). On GPU simulation this is called before observations are fetched from the GPU buffers.
- _clear()[source]#
Clear the simulation scene instance and other buffers. The function can be called in reset() before a new scene is created. Called by self._reconfigure and when the environment is closed/deleted
- _get_obs_agent()[source]#
Get observations about the agent’s state. By default it is proprioceptive observations which include qpos and qvel. Controller state is also included although most default controllers do not have any state.
- _get_obs_extra(info)[source]#
Get task-relevant extra observations. Usually defined on a task by task basis
- Parameters:
info (dict)
- _get_obs_sensor_data(apply_texture_transforms=True)[source]#
Get data from all registered sensors. Auto hides any objects that are designated to be hidden
- Parameters:
apply_texture_transforms (bool) – Whether to apply texture transforms to the simulated sensor data to map to standard texture formats. Default is True.
- Returns:
A dictionary containing the sensor data mapping sensor name to its respective dictionary of data. The dictionary maps texture names to the data. For example the return could look like
    {
        "sensor_1": {"rgb": torch.Tensor, "depth": torch.Tensor},
        "sensor_2": {"rgb": torch.Tensor, "depth": torch.Tensor},
    }
- Return type:
dict
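As a rough illustration of how the nested return value can be walked, the sketch below lists every (sensor, texture) pair. It uses nested lists as stand-ins for the torch.Tensor leaves so that it runs without any dependencies; the helper name list_textures and the example data are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Stand-in for the returned structure; in the real environment the leaf values
# are torch.Tensor objects, here nested lists are used so the sketch runs anywhere.
SensorData = Dict[str, Dict[str, Any]]


def list_textures(sensor_data: SensorData) -> List[Tuple[str, str]]:
    """Return every (sensor name, texture name) pair in the nested dictionary."""
    pairs = []
    for sensor_name, textures in sensor_data.items():
        for texture_name in textures:
            pairs.append((sensor_name, texture_name))
    return pairs


example = {
    "sensor_1": {"rgb": [[0, 0, 0]], "depth": [[0]]},
    "sensor_2": {"rgb": [[255, 255, 255]], "depth": [[1]]},
}
```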
- _get_obs_state_dict(info)[source]#
Get (ground-truth) state-based observations.
- Parameters:
info (dict)
- _get_obs_with_sensor_data(info, apply_texture_transforms=True)[source]#
Get the observation with sensor data
- Parameters:
info (dict)
apply_texture_transforms (bool)
- Return type:
dict
- _initialize_episode(env_idx, options)[source]#
Initialize the episode, e.g., poses of actors and articulations, as well as task relevant data like randomizing goal positions
- Parameters:
env_idx (torch.Tensor)
options (dict)
- _load_agent(options, initial_agent_poses=None, build_separate=False)[source]#
Loads the agent/controllable articulations into the environment. The default function provides a convenient way to set up the agent/robot by a robot_uid (stored in self.robot_uids) without requiring the user to write the robot building and controller code themselves. For more advanced use cases you can override this function to have more control over the agent/robot building process.
- Parameters:
options (dict) – The options for the environment.
initial_agent_poses (Optional[Union[sapien.Pose, Pose]]) – The initial poses of the agent/robot. Providing these poses and ensuring they are picked such that they do not collide with objects if spawned there is highly recommended to ensure more stable simulation (the agent pose can be changed later during episode initialization).
build_separate (bool) – Whether to build the agent/robot separately. If True, the agent/robot will be built separately for each parallel environment and then merged together to be accessible under one view/object. This is useful for randomizing physical and visual properties of the agent/robot which is only permitted for articulations built separately in each environment.
- _load_lighting(options)[source]#
Loads lighting into the scene. Called by self._reconfigure. If not overridden, sets some simple default lighting.
- Parameters:
options (dict)
- _load_scene(options)[source]#
Loads all objects like actors and articulations into the scene. Called by self._reconfigure. Given options argument is the same options dictionary passed to the self.reset function
- Parameters:
options (dict)
- _reconfigure(options=dict())[source]#
Reconfigure the simulation scene instance. This function clears the previous scene and creates a new one.
Note that this function is not always called when an environment is reset. To save compute time, it should only be used if any agents, assets, sensors, or lighting need to change.
Tasks like PegInsertionSide and TurnFaucet will call this each time as the peg shape changes each time and the faucet model changes each time respectively.
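One way to picture when a reset triggers reconfiguration is the counter below. It is a sketch under stated assumptions, not the library's implementation: the class ReconfigureSchedule is hypothetical, and it assumes that a reconfiguration_freq of k > 0 means "reconfigure on every k-th reset". The documentation above only states the behaviour for the default of 0 and for options["reconfigure"] being True.

```python
class ReconfigureSchedule:
    """Hypothetical counter deciding whether a reset should rebuild the scene."""

    def __init__(self, reconfiguration_freq: int = 0):
        self.reconfiguration_freq = reconfiguration_freq
        self.resets_since_reconfigure = 0

    def should_reconfigure(self, force: bool = False) -> bool:
        """Call once per reset; force mirrors options["reconfigure"] = True."""
        if force:
            self.resets_since_reconfigure = 0
            return True
        if self.reconfiguration_freq == 0:
            # 0 means: reconfigure at creation only, never on later resets.
            return False
        # Assumption: k > 0 means every k-th reset reconfigures.
        self.resets_since_reconfigure += 1
        if self.resets_since_reconfigure >= self.reconfiguration_freq:
            self.resets_since_reconfigure = 0
            return True
        return False
```

Under this reading, a task whose assets change every episode would use a frequency of 1, so each reset rebuilds the scene.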
- _set_episode_rng(seed, env_idx)[source]#
Set the random generator for current episode.
- Parameters:
seed (Union[None, list[int], numpy.ndarray[Any], int])
env_idx (torch.Tensor)
- _set_main_rng(seed)[source]#
Set the main random generator which is only used to set the seed of the episode RNG to improve reproducibility.
Note that while _set_main_rng and _set_episode_rng set a seed and a numpy random state, when using GPU sim parallelization it is highly recommended to use torch random functions as they will make things run faster. Torch random functions used when building tasks in ManiSkill are automatically seeded via torch.random.fork_rng.
- Parameters:
seed (Optional[Union[int, list[int], numpy.ndarray]])
- _setup_scene()[source]#
Setup the simulation scene instance. The function should be called in reset(). Called by self._reconfigure
- _setup_sensors(options)[source]#
Setup sensor configurations and the sensor objects in the scene. Called by self._reconfigure
- Parameters:
options (dict)
- _setup_viewer()[source]#
Setup the interactive viewer.
The function should be called after a new scene is configured. In subclasses, this function can be overridden to set viewer cameras.
Called by self._reconfigure
- _step_action(action)[source]#
- Parameters:
action (Union[None, mani_skill.utils.structs.types.Array, dict[str, Union[numpy.ndarray, torch.Tensor]]])
- Return type:
Union[None, torch.Tensor, dict[str, torch.Tensor]]
- add_to_state_dict_registry(object)[source]#
- Parameters:
object (Union[mani_skill.utils.structs.Actor, mani_skill.utils.structs.Articulation])
- close()[source]#
After the user has finished using the environment, close contains the code necessary to “clean up” the environment.
This is critical for closing rendering windows, database or HTTP connections. Calling close on an already closed environment has no effect and won’t raise an error.
- abstractmethod compute_dense_reward(obs, action, info)[source]#
Compute the dense reward.
- Parameters:
obs (Any) – The observation data. By default the observation data will be in its most raw form, a dictionary (no flattening, wrappers etc.)
action (torch.Tensor) – The most recent action.
info (dict) – The info dictionary.
- abstractmethod compute_normalized_dense_reward(obs, action, info)[source]#
Compute the normalized dense reward.
- Parameters:
obs (Any) – The observation data. By default the observation data will be in its most raw form, a dictionary (no flattening, wrappers etc.)
action (torch.Tensor) – The most recent action.
info (dict) – The info dictionary.
- compute_sparse_reward(obs, action, info)[source]#
Computes the sparse reward. By default this function uses the success/fail information returned by the evaluate function and gives +1 on success, -1 on failure, and 0 otherwise.
- Parameters:
obs (Any) – The observation data. By default the observation data will be in its most raw form, a dictionary (no flattening, wrappers etc.)
action (torch.Tensor) – The most recent action.
info (dict) – The info dictionary.
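The default rule described above (+1 on success, -1 on failure, 0 otherwise) can be sketched per environment as follows. Plain Python lists of booleans stand in for the batched tensors, and the function name sparse_reward is hypothetical. How the real method treats an environment flagged as both success and fail is not stated here, so the sketch simply lets the two cancel.

```python
from typing import Dict, List


def sparse_reward(info: Dict[str, List[bool]], num_envs: int) -> List[float]:
    """Per-environment sparse reward: +1 on success, -1 on failure, 0 otherwise."""
    # Missing keys are treated as "never true" for every environment.
    success = info.get("success", [False] * num_envs)
    fail = info.get("fail", [False] * num_envs)
    # Assumption: if both flags were set for one environment they would cancel out.
    return [float(s) - float(f) for s, f in zip(success, fail)]
```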
- evaluate()[source]#
Evaluate whether the environment is currently in a success state by returning a dictionary with a “success” key or a failure state via a “fail” key
This function may also return additional data that has been computed (e.g. is the robot grasping some object) that may be reused when generating observations and rewards.
By default, if not overridden, this function returns an empty dictionary.
- Return type:
dict
- get_info()[source]#
Get info about the current environment state, including elapsed steps and evaluation information.
- Return type:
dict
- get_obs(info=None, unflattened=False)[source]#
Return the current observation of the environment. User may call this directly to get the current observation as opposed to taking a step with actions in the environment.
Note that some tasks use info of the current environment state to populate the observations to avoid having to compute slow operations twice. For example a state based observation may wish to include a boolean indicating if a robot is grasping an object. Computing this boolean correctly is slow, so it is preferable to generate that data in the info object by overriding the self.evaluate function.
- Parameters:
info (dict) – The info object of the environment. Generally should always be the result of self.get_info(). If this is None (the default), this function will call self.get_info() itself
unflattened (bool) – Whether to return the observation without flattening even if the observation mode (self.obs_mode) asserts to return a flattened observation.
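The pattern of computing a slow quantity once in evaluate and reusing it in the observation might look like the sketch below. The class GraspTaskSketch and its methods are a hypothetical stand-in that mimics the evaluate / get_info / get_obs relationship described above; it does not subclass BaseEnv, and the grasp check is a placeholder.

```python
class GraspTaskSketch:
    """Hypothetical task showing one expensive check shared by info and observations."""

    def __init__(self):
        self.elapsed_steps = 0
        self.grasp_checks = 0  # counts how often the slow check ran

    def _slow_is_grasping(self) -> bool:
        self.grasp_checks += 1
        return True  # placeholder for an expensive contact query

    def evaluate(self) -> dict:
        # Run the slow check once and expose it alongside the success flag.
        is_grasped = self._slow_is_grasping()
        return {"success": is_grasped, "is_grasped": is_grasped}

    def get_info(self) -> dict:
        info = {"elapsed_steps": self.elapsed_steps}
        info.update(self.evaluate())
        return info

    def get_obs(self, info=None) -> dict:
        if info is None:
            info = self.get_info()
        # Reuse the cached flag rather than running the slow check again.
        return {"is_grasped": info["is_grasped"]}
```

Passing an existing info dictionary into get_obs avoids a second evaluation; calling get_obs() with no argument recomputes it.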
- get_reward(obs, action, info)[source]#
Compute the reward for the environment at its current state. The observation data, the most recent action, and the info dictionary (generated by the self.evaluate() function) are provided as inputs. By default the observation data will be in its most raw form, a dictionary (no flattening, wrappers etc.)
- Parameters:
obs (Any) – The observation data.
action (torch.Tensor) – The most recent action.
info (dict) – The info dictionary.
- get_sensor_images()[source]#
Get image (RGB) visualizations of what sensors currently sense. This function calls self._get_obs_sensor_data() internally which automatically hides objects and updates the render
- Return type:
dict[str, dict[str, torch.Tensor]]
- get_sensor_params()[source]#
Get all sensor parameters.
- Return type:
dict[str, dict[str, torch.Tensor]]
- get_state()[source]#
Get environment state as a flat vector, which is just an ordered, flattened version of the state_dict.
Users should not override this function
- get_state_dict()[source]#
Get environment state dictionary. Override to include task information (e.g., goal)
- print_sim_details()[source]#
Debug tool to call to simply print a bunch of details about the running environment, including the task ID, number of environments, sim backend, etc.
- remove_from_state_dict_registry(object)[source]#
- Parameters:
object (Union[mani_skill.utils.structs.Actor, mani_skill.utils.structs.Articulation])
- render()[source]#
Either opens a viewer if self.render_mode is “human”, or returns an array that you can use to save videos.
If self.render_mode is “rgb_array”, usually a higher quality image is rendered for the purpose of viewing only.
If self.render_mode is “sensors”, all visual observations the agent can see are provided.
If self.render_mode is “all”, this is then a combination of “rgb_array” and “sensors”.
- render_human()[source]#
render the environment by opening a GUI viewer. This also returns the viewer object. Any objects registered in the _hidden_objects list will be shown
- render_rgb_array(camera_name=None)[source]#
Returns an RGB array / image of size (num_envs, H, W, 3) of the current state of the environment. This is captured by any of the registered human render cameras. If a camera_name is given, only data from that camera is returned. Otherwise all camera data is captured and returned as a single batched image. Any objects registered in the _hidden_objects list will be shown
- Parameters:
camera_name (Optional[str])
- render_sensors()[source]#
Renders all sensors that the agent can use and see and displays them in a human readable image format. Any objects registered in the _hidden_objects list will not be shown
- reset(seed=None, options=None)[source]#
Reset the ManiSkill environment with given seed(s) and options. Typically seed is either None (for unseeded reset) or an int (seeded reset). For GPU parallelized environments you can also pass a list of seeds for each parallel environment to seed each one separately.
If options[“env_idx”] is given, will only reset the selected parallel environments. If options[“reconfigure”] is True, will call self._reconfigure() which deletes the entire physx scene and reconstructs everything. Users building custom tasks generally do not need to override this function.
If options[“reset_to_env_states”] is given, we expect there to be options[“reset_to_env_states”][“env_states”] and optionally options[“reset_to_env_states”][“obs”], both with batch size equal to the number of environments being reset. “env_states” can be a dictionary or flat tensor and we skip calling the environment’s _initialize_episode function which generates the initial state on a normal reset. If “obs” is given we skip calling the environment’s get_obs function which can save some compute/time.
Returns the observations and an info dictionary. The info dictionary is of type
{ "reconfigure": bool # (True if the env reconfigured. False otherwise) }
Note that ManiSkill always holds two RNG states, a main RNG, and an episode RNG. The main RNG is used purely to sample an episode seed which helps with reproducibility of episodes and is for internal use only. The episode RNG is used by the environment/task itself to e.g. randomize object positions, randomize assets etc. Episode RNG is accessible by using self._batched_episode_rng which is numpy based and torch.rand which can be used to generate random data on the GPU directly and is seeded. Note that it is recommended to use self._batched_episode_rng if you need to ensure during reconfiguration the same objects are loaded. Reproducibility and seeding when there is GPU and CPU simulation can be tricky and we recommend reading the documentation for more recommendations and details on RNG https://maniskill.readthedocs.io/en/latest/user_guide/concepts/rng.html
Upon environment creation via gym.make, the main RNG is set with fixed seeds of 2022 to 2022 + num_envs - 1 (the seed is just 2022 if there is only one environment). During each reset call, if seed is None, the main RNG is unchanged and an episode seed is sampled from the main RNG to create the episode RNG. If seed is not None, the main RNG is set to that seed and the episode seed is also set to that seed. This design means the main RNG determines the episode RNG deterministically.
- Parameters:
seed (Union[None, int, list[int]])
options (Union[None, dict])
- Return type:
tuple[Any, dict]
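The relationship between the main RNG and the episode RNG described above can be modelled for a single environment with the standard library. This is a simplified sketch: the class RngSketch is hypothetical, it uses random.Random rather than the numpy and torch generators the environment actually holds, and it ignores batching across parallel environments.

```python
import random
from typing import Optional


class RngSketch:
    """Hypothetical single-environment model of the main RNG and episode RNG."""

    def __init__(self, default_seed: int = 2022):
        # Fixed seed at creation, as described for gym.make above.
        self.main_rng = random.Random(default_seed)
        self.episode_seed: Optional[int] = None
        self.episode_rng: Optional[random.Random] = None

    def reset(self, seed: Optional[int] = None) -> int:
        if seed is None:
            # Unseeded reset: the main RNG is not reseeded; it supplies the episode seed.
            self.episode_seed = self.main_rng.randrange(2**31)
        else:
            # Seeded reset: main RNG and episode seed both take the given seed.
            self.main_rng = random.Random(seed)
            self.episode_seed = seed
        # The episode RNG is always rebuilt from the episode seed.
        self.episode_rng = random.Random(self.episode_seed)
        return self.episode_seed
```

Two instances created the same way and reset the same way produce the same sequence of episode seeds, which is the reproducibility property the design aims for.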
- set_state(state, env_idx=None)[source]#
Set environment state with a flat state vector. Internally this reconstructs the state dictionary and calls env.set_state_dict
Users should not override this function
- Parameters:
state (torch.Tensor)
env_idx (Optional[torch.Tensor])
- set_state_dict(state, env_idx=None)[source]#
Set environment state with a state dictionary. Override to include task information (e.g., goal)
Note that it is recommended to keep around state dictionaries as opposed to state vectors. With state vectors we assume the order of data in the vector is the same exact order that would be returned by flattening the state dictionary you get from env.get_state_dict() or the result of env.get_state()
- Parameters:
state (dict)
env_idx (Optional[torch.Tensor])
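The ordering assumption between state vectors and state dictionaries can be illustrated with a small round trip. The sketch below uses lists of floats in place of tensors; the helper names and the "actors" / "articulations" layout of the example are illustrative assumptions, not a statement of the exact structure returned by env.get_state_dict().

```python
from typing import Dict, List

StateDict = Dict[str, Dict[str, List[float]]]


def flatten_state(state_dict: StateDict) -> List[float]:
    """Concatenate all leaf vectors in the dictionary's iteration order."""
    flat: List[float] = []
    for group in state_dict.values():
        for values in group.values():
            flat.extend(values)
    return flat


def unflatten_state(flat: List[float], template: StateDict) -> StateDict:
    """Rebuild a dictionary from a flat vector using a template for keys and sizes."""
    rebuilt: StateDict = {}
    start = 0
    for group_name, group in template.items():
        rebuilt[group_name] = {}
        for name, values in group.items():
            # Each leaf takes the next len(values) entries of the flat vector.
            rebuilt[group_name][name] = flat[start:start + len(values)]
            start += len(values)
    if start != len(flat):
        raise ValueError("flat state length does not match the template")
    return rebuilt
```

Because reconstruction depends entirely on the template's key order and leaf sizes, a vector saved from one task layout cannot be safely restored into another, which is why keeping state dictionaries around is the safer choice.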
- step(action)[source]#
Take a step through the environment with an action. Actions are automatically clipped to the action space.
If action is None, the environment will proceed forward in time without sending any actions/control signals to the agent.
- Parameters:
action (Union[None, numpy.ndarray, torch.Tensor, dict])
- update_obs_space(obs)[source]#
A convenient function to auto generate observation spaces if you modify them. Call this function if you modify the observations returned by env.step and env.reset via an observation wrapper.
The recommended way to use this in an observation wrapper is as follows:
    import gymnasium as gym

    from mani_skill.envs.sapien_env import BaseEnv


    class YourObservationWrapper(gym.ObservationWrapper):
        def __init__(self, env):
            super().__init__(env)
            # Regenerate the observation space from the transformed initial observation.
            self.base_env.update_obs_space(self.observation(self.base_env._init_raw_obs))

        @property
        def base_env(self) -> BaseEnv:
            return self.env.unwrapped

        def observation(self, obs):
            # your code for transforming the observation
            return obs  # placeholder: returns the observation unchanged
- Parameters:
obs (torch.Tensor)
- SUPPORTED_OBS_MODES = ('state', 'state_dict', 'none', 'sensor_data', 'any_textures', 'pointcloud')[source]#
The string observation modes the environment supports. Note that “none” and “any_textures” are special keys. “none” indicates that no observation data is generated. “any_textures” indicates that any combination of image textures generated by cameras is supported, e.g. rgb+depth, normal+segmentation, albedo+rgb+depth etc. For a full list of supported textures see
- SUPPORTED_RENDER_MODES = ('human', 'rgb_array', 'sensors', 'all')[source]#
The supported render modes. Human opens up a GUI viewer. rgb_array returns an rgb array showing the current environment state. sensors returns an rgb array but only showing all data collected by sensors as images put together
- SUPPORTED_ROBOTS: list[str | Tuple[str]] | None = None[source]#
Override this to enforce which robots or tuples of robots together are supported in the task. During env creation, setting robot_uids auto loads all desired robots into the scene, but not all tasks are designed to support some robot setups
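A check against a SUPPORTED_ROBOTS-style list could be sketched as below. The function is_supported is hypothetical and is not how the library performs validation; in particular, reading None as "no restriction declared" is an assumption of this sketch.

```python
from typing import List, Optional, Tuple, Union

RobotSpec = Union[str, Tuple[str, ...]]


def is_supported(robot_uids: RobotSpec, supported: Optional[List[RobotSpec]]) -> bool:
    """Hypothetical check of requested robots against a SUPPORTED_ROBOTS-style list."""
    if supported is None:
        # Assumption: None is read here as "no restriction declared".
        return True
    # Normalize a list to a tuple so multi-robot setups compare by value.
    if isinstance(robot_uids, list):
        robot_uids = tuple(robot_uids)
    return robot_uids in supported
```

A task class could then declare, for example, a list containing "panda" and the tuple ("panda", "panda") to allow both a single-arm and a two-arm setup.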
- _agent_sensor_configs: dict[str, mani_skill.sensors.base_sensor.BaseSensorConfig][source]#
all agent sensor configs parsed from agent._sensor_configs
- _batched_episode_rng: mani_skill.envs.utils.randomization.batched_rng.BatchedRNG[source]#
the recommended batched episode RNG to generate random numpy data consistently between single and parallel environments
- _batched_main_rng: mani_skill.envs.utils.randomization.batched_rng.BatchedRNG[source]#
the batched main RNG that generates episode seed sequences. For internal use only
- property _default_human_render_camera_configs: mani_skill.sensors.camera.CameraConfig | Sequence[mani_skill.sensors.camera.CameraConfig] | dict[str, mani_skill.sensors.camera.CameraConfig][source]#
Add default cameras for rendering when using render_mode=’rgb_array’. These can be overridden by the user at env creation time.
- Return type:
Union[mani_skill.sensors.camera.CameraConfig, Sequence[mani_skill.sensors.camera.CameraConfig], dict[str, mani_skill.sensors.camera.CameraConfig]]
- property _default_sensor_configs: mani_skill.sensors.base_sensor.BaseSensorConfig | Sequence[mani_skill.sensors.base_sensor.BaseSensorConfig] | dict[str, mani_skill.sensors.base_sensor.BaseSensorConfig][source]#
Add default (non-agent) sensors to the environment by returning sensor configurations. These can be overridden by the user at env creation time.
- Return type:
Union[mani_skill.sensors.base_sensor.BaseSensorConfig, Sequence[mani_skill.sensors.base_sensor.BaseSensorConfig], dict[str, mani_skill.sensors.base_sensor.BaseSensorConfig]]
- property _default_viewer_camera_configs: mani_skill.sensors.camera.CameraConfig[source]#
Default configuration for the viewer camera, controlling shader, fov, etc. By default if there is a human render camera called “render_camera” then the viewer will use that camera’s pose.
- Return type:
mani_skill.sensors.camera.CameraConfig
- _enhanced_determinism: bool = False[source]#
whether to reset the episode RNG upon each reset regardless of whether a seed is provided
- _episode_rng: numpy.random.RandomState[source]#
the numpy RNG that you can use to generate random numpy data. It is not recommended to use this. Instead use the _batched_episode_rng which helps ensure GPU and CPU simulation generate the same data with the same seeds.
- _episode_seed: numpy.ndarray[source]#
episode seed list for _episode_rng and _batched_episode_rng. _episode_rng uses _episode_seed[0].
- _hidden_objects[source]#
list of objects that are hidden during rendering when generating visual observations / running render_cameras()
- _human_render_camera_configs: dict[str, mani_skill.sensors.camera.CameraConfig][source]#
all camera configurations parsed from self._human_render_camera_configs
- _human_render_cameras: dict[str, mani_skill.sensors.camera.Camera][source]#
cameras used for rendering the current environment retrievable via env.render_rgb_array(). These are not used to generate observations
- _init_raw_obs = None[source]#
the raw observation returned by the env.reset (a cpu torch tensor/dict of tensors). Useful for future observation wrappers to use to auto generate observation spaces
- _init_raw_state = None[source]#
the initial raw state returned by env.get_state. Useful for reconstructing state dictionaries from flattened state vectors
- _main_rng: numpy.random.RandomState[source]#
main rng generator that generates episode seed sequences. For internal use only
- _main_seed: numpy.ndarray[Any][source]#
main seed list for _main_rng and _batched_main_rng. _main_rng uses _main_seed[0]. For internal use only
- _parallel_in_single_scene: bool = False[source]#
whether all objects are placed in one scene for the purpose of rendering all objects together instead of in parallel
- _sample_video_link: str | None = None[source]#
a link to a sample video of the task. This is mostly used for automatic documentation generation
- _sensor_configs: dict[str, mani_skill.sensors.base_sensor.BaseSensorConfig][source]#
all sensor configurations parsed from self._sensor_configs and agent._sensor_configs
- _sensors: dict[str, mani_skill.sensors.base_sensor.BaseSensor][source]#
all sensors configured in this environment
- action_space: gymnasium.Space[source]#
the batched action space of the environment, which is also the action space of the agent
- property elapsed_steps: torch.Tensor[source]#
The number of steps that have elapsed in the environment
- Return type:
torch.Tensor
- property obs_mode: str[source]#
The current observation mode. This affects the observation returned by env.get_obs()
- Return type:
str
- obs_mode_struct[source]#
dataclass describing what observation data is being requested by the user, detailing if state data is requested and what visual data is requested
- property observation_space: gymnasium.Space[source]#
the batched observation space of the environment
- Return type:
gymnasium.Space
- property robot_link_names[source]#
Get link ids for the robot. This is used for segmentation observations.
- scene: mani_skill.envs.scene.ManiSkillScene[source]#
the main scene, which manages all sub scenes. In CPU simulation there is only one sub-scene
- property segmentation_id_map[source]#
Returns a dictionary mapping every ID to the appropriate Actor or Link object