mani_skill.envs.tasks.control#

Classes#

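AntRun

Task Description: Ant moves in x direction at 4 m/s

AntWalk

Task Description: Ant moves in x direction at 0.5 m/s

CartpoleBalanceEnv

Task Description: Use the Cartpole robot to balance a pole on a cart.

CartpoleSwingUpEnv

Task Description: Use the Cartpole robot to swing up a pole on a cart.

HopperHopEnv

Task Description: Hopper robot stays upright and moves in positive x direction with hopping motion

HopperStandEnv

Task Description: Hopper robot stands upright

HumanoidRun

Task Description: Humanoid moves in x direction at running pace

HumanoidStand

Task Description: Humanoid robot stands upright

HumanoidWalk

Task Description: Humanoid moves in x direction at walking pace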

Package Contents#

class mani_skill.envs.tasks.control.AntRun(*args, **kwargs)[source]#

Bases: AntEnv

Task Description: Ant moves in x direction at 4 m/s

Randomizations: - Ant qpos and qvel have added noise from uniform distribution [-1e-2, 1e-2]

Success Conditions: - No specific success conditions.

class mani_skill.envs.tasks.control.AntWalk(*args, **kwargs)[source]#

Bases: AntEnv

Task Description: Ant moves in x direction at 0.5 m/s

Randomizations: - Ant qpos and qvel have added noise from uniform distribution [-1e-2, 1e-2]

Success Conditions: - No specific success conditions.
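The qpos/qvel randomization shared by AntRun and AntWalk can be sketched in plain Python. This is only an illustration under stated assumptions: `add_uniform_noise` is a hypothetical helper, not a mani_skill function, the joint values are made up, and the real environments presumably work on batched torch tensors rather than lists.

```python
import random

NOISE_SCALE = 1e-2  # bound taken from the Randomizations note above


def add_uniform_noise(values, rng, scale=NOISE_SCALE):
    """Return a copy of `values` with independent U[-scale, scale] noise added.

    Hypothetical helper for illustration; not part of mani_skill.
    """
    return [v + rng.uniform(-scale, scale) for v in values]


rng = random.Random(0)      # seeded so the sketch is reproducible
qpos = [0.0, 0.5, -0.5]     # made-up joint positions
qvel = [0.0, 0.0, 0.0]      # made-up joint velocities
noisy_qpos = add_uniform_noise(qpos, rng)
noisy_qvel = add_uniform_noise(qvel, rng)
```

Each entry receives its own noise sample, so the perturbed state stays within 1e-2 of the nominal state in every coordinate.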

class mani_skill.envs.tasks.control.CartpoleBalanceEnv(*args, **kwargs)[source]#

Bases: CartpoleEnv

Task Description: Use the Cartpole robot to balance a pole on a cart.

Randomizations: - Pole direction is randomized around the vertical axis within the range [-0.05, 0.05] radians.

Fail Conditions: - Pole is lower than the horizontal plane

_initialize_episode(env_idx, options)[source]#

Initialize the episode, e.g., poses of actors and articulations, as well as task relevant data like randomizing goal positions

Parameters:
  • env_idx (torch.Tensor) –

  • options (dict) –

evaluate()[source]#

Evaluate whether the environment is currently in a success state by returning a dictionary with a “success” key or a failure state via a “fail” key

This function may also return additional data that has been computed (e.g. is the robot grasping some object) that may be reused when generating observations and rewards.

By default, if not overridden, this function returns an empty dictionary.
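A minimal sketch of an `evaluate()`-style return value for the fail condition listed for this class. It assumes the pole angle is measured from the upright position, so "lower than the horizontal plane" corresponds to a negative cosine. `evaluate_pole` and the `pole_angle_cosine` key are hypothetical names for this illustration, and a single float stands in for the batched tensors an environment would normally use.

```python
import math


def evaluate_pole(pole_angle):
    """Return an evaluate()-style dict for one pole angle in radians.

    Assumption: angle 0 means upright, so the pole is below the horizontal
    plane exactly when cos(angle) < 0.
    """
    pole_angle_cosine = math.cos(pole_angle)
    return {
        "fail": pole_angle_cosine < 0,
        # Extra computed data can be returned for reuse in observations/rewards.
        "pole_angle_cosine": pole_angle_cosine,
    }
```

Returning the cosine alongside the "fail" flag mirrors the note above that `evaluate()` may carry additional computed data.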

A link to a sample video of the task. This is mostly used for automatic documentation generation.

class mani_skill.envs.tasks.control.CartpoleSwingUpEnv(*args, **kwargs)[source]#

Bases: CartpoleEnv

Task Description: Use the Cartpole robot to swing up a pole on a cart.

Randomizations: - Pole direction is randomized around the whole circle within the range [-pi, pi] radians.

Success Conditions: - No specific success conditions. The task is considered successful if the pole is upright for the whole episode.

_initialize_episode(env_idx, options)[source]#

Initialize the episode, e.g., poses of actors and articulations, as well as task relevant data like randomizing goal positions

Parameters:
  • env_idx (torch.Tensor) –

  • options (dict) –
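As a rough illustration of per-environment initialization, the sketch below draws a fresh pole angle only for the environment indices being reset. `sample_pole_angles` is a hypothetical helper, and plain lists and dicts stand in for the `torch.Tensor` of indices that `_initialize_episode` receives; the two ranges are the ones stated for the swing-up and balance tasks.

```python
import math
import random


def sample_pole_angles(env_idx, low, high, rng):
    """Map each environment index being reset to a new angle in [low, high]."""
    return {i: rng.uniform(low, high) for i in env_idx}


rng = random.Random(0)  # seeded so the sketch is reproducible

# Swing-up task: the pole may start anywhere on the circle.
swing_up_angles = sample_pole_angles([0, 2, 5], -math.pi, math.pi, rng)

# Balance task: the pole starts close to upright.
balance_angles = sample_pole_angles([0, 2, 5], -0.05, 0.05, rng)
```

Environments whose indices are not listed keep their current state, which is the point of passing `env_idx` rather than resetting everything.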

SUPPORTED_REWARD_MODES = ('normalized_dense', 'dense', 'none')#

class mani_skill.envs.tasks.control.HopperHopEnv(*args, **kwargs)[source]#

Bases: HopperEnv

Task Description: Hopper robot stays upright and moves in positive x direction with hopping motion

Randomizations:
  • Hopper robot is randomly rotated [-pi, pi] radians about y axis.

  • Hopper qpos are uniformly sampled within their allowed ranges.

Success Conditions: - No specific success conditions. The task is considered successful if the hopper hops for the whole episode.

compute_dense_reward(obs, action, info)[source]#

Compute the dense reward.

Parameters:
  • obs (Any) – The observation data. By default the observation data will be in its most raw form, a dictionary (no flattening, wrappers etc.)

  • action (torch.Tensor) – The most recent action.

  • info (dict) – The info dictionary.

compute_normalized_dense_reward(obs, action, info)[source]#

Compute the normalized dense reward.

Parameters:
  • obs (Any) – The observation data. By default the observation data will be in its most raw form, a dictionary (no flattening, wrappers etc.)

  • action (torch.Tensor) – The most recent action.

  • info (dict) – The info dictionary.
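One common way to relate a dense reward to its normalized variant is to divide by a known maximum. The sketch below shows that convention only; it is not taken from this library's reward code. The reward terms, the `info` keys `upright` and `forward_speed`, and `MAX_DENSE_REWARD` are all hypothetical, and floats stand in for batched tensors.

```python
MAX_DENSE_REWARD = 2.0  # hypothetical upper bound of the toy dense reward


def compute_dense_reward(obs, action, info):
    """Toy dense reward: one term for staying upright, one for moving forward.

    Assumes both info entries are already scaled to [0, 1]; these terms are
    invented for the sketch and are not the library's reward.
    """
    return info["upright"] + info["forward_speed"]


def compute_normalized_dense_reward(obs, action, info):
    """Scale the toy dense reward into [0, 1] by dividing by its maximum."""
    return compute_dense_reward(obs, action, info) / MAX_DENSE_REWARD
```

For example, `compute_normalized_dense_reward(None, None, {"upright": 1.0, "forward_speed": 0.0})` evaluates to 0.5 under these assumptions.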

class mani_skill.envs.tasks.control.HopperStandEnv(*args, **kwargs)[source]#

Bases: HopperEnv

Task Description: Hopper robot stands upright

Randomizations:
  • Hopper robot is randomly rotated [-pi, pi] radians about y axis.

  • Hopper qpos are uniformly sampled within their allowed ranges.

Success Conditions: - No specific success conditions.

compute_dense_reward(obs, action, info)[source]#

Compute the dense reward.

Parameters:
  • obs (Any) – The observation data. By default the observation data will be in its most raw form, a dictionary (no flattening, wrappers etc.)

  • action (torch.Tensor) – The most recent action.

  • info (dict) – The info dictionary.

compute_normalized_dense_reward(obs, action, info)[source]#

Compute the normalized dense reward.

Parameters:
  • obs (Any) – The observation data. By default the observation data will be in its most raw form, a dictionary (no flattening, wrappers etc.)

  • action (torch.Tensor) – The most recent action.

  • info (dict) – The info dictionary.

class mani_skill.envs.tasks.control.HumanoidRun(*args, robot_uids='humanoid', **kwargs)[source]#

Bases: HumanoidEnvStandard

Task Description: Humanoid moves in x direction at running pace

Randomizations: - Humanoid qpos and qvel have added noise from uniform distribution [-1e-2, 1e-2]

Fail Conditions: - Humanoid robot torso link leaves z range [0.7, 1.0]
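The torso-height fail condition used by the Humanoid tasks can be sketched as a simple range check. `torso_out_of_range` and `evaluate_heights` are hypothetical names for this illustration, and a list of floats stands in for the per-environment tensor of torso heights.

```python
TORSO_Z_MIN, TORSO_Z_MAX = 0.7, 1.0  # range from the Fail Conditions above


def torso_out_of_range(torso_z):
    """Return True when the torso height has left [TORSO_Z_MIN, TORSO_Z_MAX]."""
    return not (TORSO_Z_MIN <= torso_z <= TORSO_Z_MAX)


def evaluate_heights(torso_heights):
    """Build an evaluate()-style dict with one fail flag per environment."""
    return {"fail": [torso_out_of_range(z) for z in torso_heights]}
```

The bounds are treated as inclusive here, which is an assumption; the source only says the torso "leaves" the range.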

compute_normalized_dense_reward(obs, action, info)[source]#

Compute the normalized dense reward.

Parameters:
  • obs (Any) – The observation data. By default the observation data will be in its most raw form, a dictionary (no flattening, wrappers etc.)

  • action (torch.Tensor) – The most recent action.

  • info (dict) – The info dictionary.

agent: mani_skill.agents.robots.humanoid.Humanoid#

class mani_skill.envs.tasks.control.HumanoidStand(*args, robot_uids='humanoid', **kwargs)[source]#

Bases: HumanoidEnvStandard

Task Description: Humanoid robot stands upright

Randomizations:
  • Humanoid robot is randomly rotated [-pi, pi] radians about z axis.

  • Humanoid qpos and qvel have added noise from uniform distribution [-1e-2, 1e-2]

Fail Conditions: - Humanoid robot torso link leaves z range [0.7, 1.0]

_get_obs_state_dict(info)[source]#

Get (ground-truth) state-based observations.

Parameters:
  • info (dict) –

_initialize_episode(env_idx, options)[source]#

Initialize the episode, e.g., poses of actors and articulations, as well as task relevant data like randomizing goal positions

Parameters:
  • env_idx (torch.Tensor) –

  • options (dict) –
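The random rotation about the z axis listed for HumanoidStand can be expressed as a unit quaternion. The sketch below assumes (w, x, y, z) component ordering, which may differ from the convention your pose code expects; `yaw_to_quaternion` is a hypothetical helper, not a mani_skill function.

```python
import math
import random


def yaw_to_quaternion(yaw):
    """Unit quaternion (w, x, y, z) for a rotation of `yaw` radians about z."""
    half = yaw / 2.0
    return (math.cos(half), 0.0, 0.0, math.sin(half))


rng = random.Random(0)                  # seeded so the sketch is reproducible
yaw = rng.uniform(-math.pi, math.pi)    # range from the Randomizations note
root_quat = yaw_to_quaternion(yaw)
```

Because only the w and z components are nonzero, the result has unit norm for any yaw, so it needs no further normalization before being used as an orientation.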

compute_normalized_dense_reward(obs, action, info)[source]#

Compute the normalized dense reward.

Parameters:
  • obs (Any) – The observation data. By default the observation data will be in its most raw form, a dictionary (no flattening, wrappers etc.)

  • action (torch.Tensor) – The most recent action.

  • info (dict) – The info dictionary.

agent: mani_skill.agents.robots.humanoid.Humanoid#

class mani_skill.envs.tasks.control.HumanoidWalk(*args, robot_uids='humanoid', **kwargs)[source]#

Bases: HumanoidEnvStandard

Task Description: Humanoid moves in x direction at walking pace

Randomizations: - Humanoid qpos and qvel have added noise from uniform distribution [-1e-2, 1e-2]

Fail Conditions: - Humanoid robot torso link leaves z range [0.7, 1.0]

compute_normalized_dense_reward(obs, action, info)[source]#

Compute the normalized dense reward.

Parameters:
  • obs (Any) – The observation data. By default the observation data will be in its most raw form, a dictionary (no flattening, wrappers etc.)

  • action (torch.Tensor) – The most recent action.

  • info (dict) – The info dictionary.

agent: mani_skill.agents.robots.humanoid.Humanoid#