mani_skill.utils.wrappers.record#
Classes#
RecordEpisode: Record trajectories or videos for episodes. You generally should always apply this wrapper last, particularly if you include observation wrappers which modify the returned observations.
Functions#
clean_trajectories: Clean trajectories by renaming and pruning trajectories in place.
Module Contents#
- class mani_skill.utils.wrappers.record.RecordEpisode(env, output_dir, save_trajectory=True, trajectory_name=None, save_video=True, info_on_video=False, save_on_reset=True, save_video_trigger=None, max_steps_per_video=None, clean_on_close=True, record_reward=True, record_env_state=True, video_fps=30, render_substeps=False, avoid_overwriting_video=False, source_type=None, source_desc=None)[source]#
Bases: gymnasium.Wrapper
Record trajectories or videos for episodes. You should generally apply this wrapper last, particularly if you include observation wrappers which modify the returned observations. The only wrappers that may go after this one are the vector env interface wrappers that map the ManiSkill env to, e.g., a gym vector env interface.
Trajectory data is saved as two files: the actual data in an .h5 file written via h5py, and metadata in a JSON file with the same basename.
Each JSON file contains:
env_info (dict): task (also known as environment) information, which can be used to initialize the task. It contains:
  env_id (str): task id
  max_episode_steps (int)
  env_kwargs (dict): keyword arguments to initialize the task. Essential to recreate the environment.
episodes (list[dict]): episode information
source_type (Optional[str]): a simple category string describing what process generated the trajectory data. ManiSkill official datasets will usually write one of “human”, “motionplanning”, or “rl” at the moment.
source_desc (Optional[str]): a longer explanation of how the data was generated.
The episode information (each element of episodes) includes:
episode_id (int): a unique id to index the episode
reset_kwargs (dict): keyword arguments to reset the task. Essential to reproduce the trajectory.
control_mode (str): control mode used for the episode.
elapsed_steps (int): trajectory length
info (dict): information at the end of the episode.
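The fields above can be read with the standard json module. The following is a minimal sketch that assumes env_info, episodes, source_type, and source_desc are top-level keys of the JSON file; the metadata text is hypothetical and every value in it is made up for illustration.

```python
import json

# Hypothetical metadata text; every value below is made up for illustration.
metadata_text = """
{
  "env_info": {
    "env_id": "PickCube-v1",
    "max_episode_steps": 50,
    "env_kwargs": {"obs_mode": "state", "control_mode": "pd_joint_delta_pos"}
  },
  "episodes": [
    {
      "episode_id": 0,
      "reset_kwargs": {"seed": 0},
      "control_mode": "pd_joint_delta_pos",
      "elapsed_steps": 50,
      "info": {"success": true}
    }
  ],
  "source_type": "rl",
  "source_desc": "illustrative placeholder"
}
"""

metadata = json.loads(metadata_text)

# Task-level information lives under "env_info".
env_info = metadata["env_info"]
env_id = env_info["env_id"]
env_kwargs = env_info["env_kwargs"]

# Per-episode information lives under the "episodes" list.
episodes_by_id = {ep["episode_id"]: ep for ep in metadata["episodes"]}
first_reset_kwargs = episodes_by_id[0]["reset_kwargs"]

print(env_id, env_kwargs, first_reset_kwargs)
```

Indexing episodes by episode_id makes it easy to pair each metadata entry with the trajectory stored under the matching traj_{episode_id} key.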
With just the metadata, you can recreate the task the same way it was configured when the trajectories were collected. Here json_data stands for the dict loaded from the JSON file:
```python
import gymnasium as gym
import mani_skill.envs  # registers ManiSkill tasks with gymnasium

env_info = json_data["env_info"]
env = gym.make(env_info["env_id"], **env_info["env_kwargs"])
episode = json_data["episodes"][0]  # picks the first episode
env.reset(**episode["reset_kwargs"])
```
Each HDF5 demonstration dataset consists of multiple trajectories. The key of each trajectory is traj_{episode_id}, e.g., traj_0.
Each trajectory is an h5py.Group, which contains:
actions: [T, A], np.float32. T is the number of transitions.
terminated: [T], np.bool_. It indicates whether the task is terminated or not at each time step.
truncated: [T], np.bool_. It indicates whether the task is truncated or not at each time step.
env_states: [T+1, D], np.float32. Environment states. It can be used to set the environment to a certain state via env.set_state_dict. However, it may not be enough to reproduce the trajectory.
success (optional): [T], np.bool_. It indicates whether the task is successful at each time step. Included if task defines success.
fail (optional): [T], np.bool_. It indicates whether the task is in a failure state at each time step. Included if task defines failure.
obs (optional): [T+1, D] observations.
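The length conventions above ([T] for per-transition arrays, [T+1] for states and observations) can be checked with a small duck-typed helper. This is only a sketch: leaf_lengths and check_trajectory are hypothetical names, not part of ManiSkill, and the stand-in trajectory uses plain lists with made-up values. The helpers rely only on keys(), item access, and len(), so they should also accept a trajectory group opened with h5py, though that is an assumption that is not exercised here.

```python
def leaf_lengths(node):
    """Collect the length of every array-like leaf under a (possibly nested) mapping."""
    if hasattr(node, "keys"):  # nested mapping, e.g. a dict
        lengths = []
        for key in node.keys():
            lengths.extend(leaf_lengths(node[key]))
        return lengths
    return [len(node)]  # array-like leaf, e.g. a list


def check_trajectory(traj):
    """Return T after checking the [T] and [T+1] length conventions."""
    num_transitions = len(traj["actions"])
    for key in ("terminated", "truncated", "success", "fail"):
        if key in traj:
            assert len(traj[key]) == num_transitions, key
    for key in ("env_states", "obs"):
        if key in traj:
            lengths = leaf_lengths(traj[key])
            assert all(n == num_transitions + 1 for n in lengths), key
    return num_transitions


# Stand-in for one trajectory with T = 2; all values are made up.
fake_traj = {
    "actions": [[0.0, 0.1], [0.2, 0.3]],
    "terminated": [False, False],
    "truncated": [False, True],
    "env_states": {
        "actors": {"cube": [[0.0], [0.1], [0.2]]},
        "articulations": {"panda": [[1.0], [1.1], [1.2]]},
    },
}
num_transitions = check_trajectory(fake_traj)
print(num_transitions)
```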
Note that env_states is in a dictionary form (and observations may be as well depending on obs_mode), where it is formatted as a dictionary of lists. For example, a typical environment state looks like this:
```python
env_state = env.get_state_dict()
"""
env_state = {
    "actors": {
        "actor_id": [...numpy_actor_state...],
        ...
    },
    "articulations": {
        "articulation_id": [...numpy_articulation_state...],
        ...
    }
}
"""
```
In the trajectory file env_states will be the same structure but each value/leaf in the dictionary will be a sequence of states representing the state of that particular entity in the simulation over time.
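To make the relationship between a single state dictionary and the stored structure concrete, the sketch below stacks a list of per-step state dicts into one dict whose leaves are sequences over time. stack_state_dicts is a hypothetical helper written for illustration (it is not how the wrapper builds its buffers), and the actor and articulation names and values are made up.

```python
def stack_state_dicts(states):
    """Turn a list of per-step nested state dicts into one nested dict of lists."""
    first = states[0]
    if isinstance(first, dict):
        # Recurse per key, gathering that key's value from every timestep.
        return {key: stack_state_dicts([s[key] for s in states]) for key in first}
    # Leaf: keep the per-step values in time order.
    return list(states)


# Three timesteps of made-up states with the actors/articulations layout.
per_step = [
    {"actors": {"cube": [0.0, 0.0]}, "articulations": {"panda": [1.0]}},
    {"actors": {"cube": [0.1, 0.0]}, "articulations": {"panda": [1.1]}},
    {"actors": {"cube": [0.2, 0.0]}, "articulations": {"panda": [1.2]}},
]
stacked = stack_state_dicts(per_step)
print(stacked["actors"]["cube"])
```

Each leaf of stacked now holds one entry per timestep, which mirrors the "dictionary of lists" layout described above.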
In practice it may be more useful to work with per-timestep slices of the env_states data (or the observations data), which can be done in either of two ways:
```python
import mani_skill.trajectory.utils as trajectory_utils

# Option 1: convert the dict of sequences into a list of per-timestep dicts
env_states_list = trajectory_utils.dict_to_list_of_dicts(env_states)
# now env_states_list[i] is the same as the data env.get_state_dict() returned at timestep i

# Option 2: index a single timestep directly from the dict of sequences
i = 10
env_state_i = trajectory_utils.index_dict(env_states, i)
# now env_state_i is the same as the data env.get_state_dict() returned at timestep i
```
- param env:
the environment to record
- param output_dir:
output directory
- param save_trajectory:
whether to save trajectory
- param trajectory_name:
name of trajectory file (.h5). Use timestamp if not provided.
- param save_video:
whether to save video
- param info_on_video:
whether to write data about reward, action, and data in the info object to the video. The first video frame is generally the result of the first env.reset() (visualizing the first observation). Text is written on frames after that, showing the action taken to get to that environment state and reward.
- param save_on_reset:
whether to save the previous trajectory (and video of it if save_video is True) automatically when resetting. Note that for environments simulated on the GPU (to leverage fast parallel rendering) you must set max_steps_per_video to a fixed number so that a video is saved every max_steps_per_video steps. This is required because partial environment resets can occur, which make it ambiguous how to save/cut videos.
- param save_video_trigger:
a function that takes the current number of elapsed environment steps and outputs a bool. If output is True, will start saving that timestep to the video.
- param max_steps_per_video:
how many steps can be recorded into a single video before flushing the video. If None, this is not used. An internal step counter is maintained to do this. If the video is flushed at any point, the step counter is reset to 0.
- param clean_on_close:
whether to rename and prune trajectories when closed. See clean_trajectories for details.
- param record_reward:
whether to record the reward in the trajectory data
- param record_env_state:
whether to record the environment state in the trajectory data
- param video_fps:
The FPS of the video to generate if save_video is True
- type video_fps:
int
- param render_substeps:
Whether to render substeps for video. This captures an image of the environment after each physics step. It runs slower but generates more image frames per environment step, which, when coupled with a higher video FPS, can yield a smoother video.
- type render_substeps:
bool
- param avoid_overwriting_video:
If true, the wrapper will iterate over possible video names to avoid overwriting existing videos in the output directory. Useful for resuming training runs.
- type avoid_overwriting_video:
bool
- param source_type:
a word to describe the source of the actions used to record episodes (e.g. RL, motionplanning, teleoperation)
- type source_type:
Optional[str]
- param source_desc:
A longer description describing how the demonstrations are collected
- type source_desc:
Optional[str]
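As an illustration of the save_video_trigger parameter, which takes the number of elapsed environment steps and returns a bool, here is a minimal sketch of a trigger that is on for the first record_for steps of every period steps. make_periodic_trigger and both of its arguments are hypothetical names chosen for this example.

```python
from typing import Callable


def make_periodic_trigger(period: int, record_for: int) -> Callable[[int], bool]:
    """Build a trigger that is True for the first `record_for` steps of every `period` steps."""

    def trigger(elapsed_steps: int) -> bool:
        return elapsed_steps % period < record_for

    return trigger


trigger = make_periodic_trigger(period=1000, record_for=100)

# Which of these step counts would be written to video?
recorded = [step for step in (0, 99, 100, 999, 1000, 1050) if trigger(step)]
print(recorded)
```

Such a callable could then be passed as save_video_trigger=trigger when constructing RecordEpisode, for example to record short clips at regular intervals during a long training run.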
- flush_trajectory(verbose=False, ignore_empty_transition=True, env_idxs_to_flush=None, save=True)[source]#
Flushes a trajectory and by default saves it to disk
- Parameters:
verbose (bool) – whether to print out information about the flushed trajectory
ignore_empty_transition (bool) – whether to ignore trajectories that did not have any actions
env_idxs_to_flush – which environments by id to flush. If None, all environments are flushed.
save (bool) – whether to save the trajectory to disk
- flush_video(name=None, suffix='', verbose=False, ignore_empty_transition=True, save=True)[source]#
Flushes a video of the recorded episode(s) and by default saves it to disk
- Parameters:
name (str) – name of the video file. If None, it will be named with the episode id.
suffix (str) – suffix to add to the video file name
verbose (bool) – whether to print out information about the flushed video
ignore_empty_transition (bool) – whether to ignore trajectories that did not have any actions
save (bool) – whether to save the video to disk
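When save_on_reset is False, the two flush methods above can be called explicitly at the end of each episode. The sketch below shows one possible loop under stated assumptions: env is a single (non-vectorized) environment already wrapped in RecordEpisode with save_on_reset=False, and record_episodes_manually is a hypothetical helper. A small stand-in environment (also hypothetical) is included so the sketch runs without a simulator.

```python
def record_episodes_manually(env, num_episodes, max_steps=100):
    """Roll out random actions and flush each episode explicitly."""
    for _ in range(num_episodes):
        env.reset()
        for _ in range(max_steps):
            action = env.action_space.sample()
            _, _, terminated, truncated, _ = env.step(action)
            if terminated or truncated:
                break
        # Write out whatever was recorded for this episode.
        env.flush_trajectory()
        env.flush_video()


class _StubSpace:
    def sample(self):
        return 0


class _StubEnv:
    """Stand-in that mimics only the methods used above; not a real environment."""

    def __init__(self):
        self.action_space = _StubSpace()
        self.calls = []
        self._t = 0

    def reset(self):
        self._t = 0
        self.calls.append("reset")
        return None, {}

    def step(self, action):
        self._t += 1
        return None, 0.0, False, self._t >= 3, {}  # truncate after 3 steps

    def flush_trajectory(self):
        self.calls.append("flush_trajectory")

    def flush_video(self):
        self.calls.append("flush_video")


demo_env = _StubEnv()
record_episodes_manually(demo_env, num_episodes=2)
print(demo_env.calls)
```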
- reset(*args, seed=None, options=None, save=True, **kwargs)[source]#
Uses the reset() of the env that can be overwritten to change the returned data.
- Parameters:
seed (Optional[Union[int, list[int]]]) –
options (Optional[dict]) –
- step(action)[source]#
Uses the step() of the env that can be overwritten to change the returned data.
- property base_env: mani_skill.envs.sapien_env.BaseEnv[source]#
- Return type:
mani_skill.envs.sapien_env.BaseEnv
- Parameters:
output_dir (str) –
save_trajectory (bool) –
trajectory_name (Optional[str]) –
save_video (bool) –
info_on_video (bool) –
save_on_reset (bool) –
save_video_trigger (Optional[Callable[[int], bool]]) –
max_steps_per_video (Optional[int]) –
clean_on_close (bool) –
record_reward (bool) –
record_env_state (bool) –
video_fps (int) –
render_substeps (bool) –
avoid_overwriting_video (bool) –
source_type (Optional[str]) –
source_desc (Optional[str]) –
- class mani_skill.utils.wrappers.record.Step[source]#
- mani_skill.utils.wrappers.record.clean_trajectories(h5_file, json_dict, prune_empty_action=True)[source]#
Clean trajectories by renaming and pruning trajectories in place.
After cleanup, trajectory names are consecutive integers (traj_0, traj_1, …), and trajectories with empty action are pruned.
- Parameters:
h5_file (h5py.File) – raw h5 file
json_dict (dict) – raw JSON dict
prune_empty_action – whether to prune trajectories with empty action
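The effect of the cleanup can be illustrated on plain dictionaries. This is a conceptual sketch only: clean_sketch is a hypothetical function that returns new objects, whereas clean_trajectories operates in place on an h5py.File and the JSON dict. The trajectory contents are made up.

```python
def clean_sketch(trajectories, episodes, prune_empty_action=True):
    """Return renumbered copies of `trajectories` and `episodes`.

    trajectories: dict mapping "traj_{episode_id}" to a dict with an "actions" list
    episodes: list of dicts that each carry an "episode_id"
    """
    new_trajectories, new_episodes = {}, []
    for episode in episodes:
        traj = trajectories[f"traj_{episode['episode_id']}"]
        if prune_empty_action and len(traj["actions"]) == 0:
            continue  # drop trajectories that never took an action
        new_id = len(new_episodes)  # next consecutive id
        new_trajectories[f"traj_{new_id}"] = traj
        new_episodes.append({**episode, "episode_id": new_id})
    return new_trajectories, new_episodes


raw_trajectories = {
    "traj_0": {"actions": [[0.1]]},
    "traj_2": {"actions": []},
    "traj_5": {"actions": [[0.2], [0.3]]},
}
raw_episodes = [{"episode_id": 0}, {"episode_id": 2}, {"episode_id": 5}]

cleaned_trajectories, cleaned_episodes = clean_sketch(raw_trajectories, raw_episodes)
print(sorted(cleaned_trajectories), [ep["episode_id"] for ep in cleaned_episodes])
```

In this example the empty traj_2 is dropped and traj_5 is renamed to traj_1, so the remaining names are consecutive.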
- mani_skill.utils.wrappers.record.parse_env_info(env)[source]#
- Parameters:
env (gymnasium.Env) –