Control Tasks
These are classic control tasks in which the objective is to control a robot to reach a particular state, similar to the DM Control suite but with GPU-parallelized simulation and rendering. This page gives a high-level overview of all tasks in a table, followed by detailed task cards with video demonstrations.
Task Table
Table of all tasks/environments in this category. The Task column is the environment ID, and Preview is a thumbnail pair showing the first and last frames of an example success demonstration. Max Episode Steps is the task’s default maximum number of episode steps, generally tuned for RL workflows.
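| Task | Preview | Dense Reward | Success/Fail Conditions | Demos | Max Episode Steps |
|---|---|---|---|---|---|
| MS-AntRun-v1 | | ✅ | ❌ | ❌ | 1000 |
| MS-AntWalk-v1 | | ✅ | ❌ | ❌ | 1000 |
| MS-CartpoleBalance-v1 | | ✅ | ✅ | ❌ | 1000 |
| MS-CartpoleSwingUp-v1 | | ✅ | ❌ | ❌ | 1000 |
| MS-HopperHop-v1 | | ✅ | ✅ | ❌ | 600 |
| MS-HopperStand-v1 | | ✅ | ✅ | ❌ | 600 |
| MS-HumanoidRun-v1 | | ✅ | ❌ | ❌ | 1000 |
| MS-HumanoidStand-v1 | | ✅ | ❌ | ❌ | 1000 |
| MS-HumanoidWalk-v1 | | ✅ | ❌ | ❌ | 1000 |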
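The environment IDs used in the task cards below are the strings passed to the environment constructor. The following is a minimal sketch of a helper that validates an ID and creates the environment. It assumes ManiSkill and Gymnasium are installed, that importing `mani_skill.envs` registers the `MS-*` IDs, and that `gym.make` accepts a `num_envs` argument for parallel simulation. The helper name `make_control_env` is hypothetical and not part of any library.

```python
# Environment IDs taken from the task card headings on this page.
CONTROL_TASK_IDS = [
    "MS-AntRun-v1",
    "MS-AntWalk-v1",
    "MS-CartpoleBalance-v1",
    "MS-CartpoleSwingUp-v1",
    "MS-HopperHop-v1",
    "MS-HopperStand-v1",
    "MS-HumanoidRun-v1",
    "MS-HumanoidStand-v1",
    "MS-HumanoidWalk-v1",
]


def make_control_env(env_id: str, num_envs: int = 1):
    """Hypothetical helper: validate the ID, then build the environment."""
    if env_id not in CONTROL_TASK_IDS:
        raise ValueError(f"Unknown control task: {env_id!r}")
    if num_envs < 1:
        raise ValueError("num_envs must be at least 1")
    # Imported lazily so this module loads even without the simulator.
    import gymnasium as gym
    import mani_skill.envs  # noqa: F401  (assumed to register the MS-* IDs)

    return gym.make(env_id, num_envs=num_envs)
```

For example, `make_control_env("MS-CartpoleBalance-v1", num_envs=16)` would request 16 parallel copies of the balance task, provided the simulator is installed.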
MS-AntRun-v1
Task Card
Task Description: Ant moves in x direction at 4 m/s
Randomizations:
Ant qpos and qvel have added noise from uniform distribution [-1e-2, 1e-2]
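The noise randomization above can be sketched as follows. This is an illustration only, not the environment's actual reset code: the nominal `qpos` and `qvel` values are hypothetical placeholders, and independent noise per entry is an assumption.

```python
import random

NOISE_BOUND = 1e-2  # half-width of the uniform range [-1e-2, 1e-2]


def add_uniform_noise(values, rng, bound=NOISE_BOUND):
    """Return a copy of values with independent U[-bound, bound] noise added."""
    return [v + rng.uniform(-bound, bound) for v in values]


rng = random.Random(0)  # seeded so the initial state is reproducible
nominal_qpos = [0.0, 0.0, 0.55, 1.0]  # hypothetical nominal joint positions
nominal_qvel = [0.0, 0.0, 0.0, 0.0]   # hypothetical nominal joint velocities

init_qpos = add_uniform_noise(nominal_qpos, rng)
init_qvel = add_uniform_noise(nominal_qvel, rng)
```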
Success Conditions:
No specific success conditions.
MS-AntWalk-v1
Task Card
Task Description: Ant moves in x direction at 0.5 m/s
Randomizations:
Ant qpos and qvel have added noise from uniform distribution [-1e-2, 1e-2]
Success Conditions:
No specific success conditions.
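The two Ant tasks differ only in target speed (4 m/s for MS-AntRun-v1, 0.5 m/s for MS-AntWalk-v1). This page does not specify how the dense reward is computed, so the sketch below is purely hypothetical: a linear ramp that gives full reward at or above the target x velocity and zero reward when the robot is stationary or moving backward.

```python
def speed_reward(x_velocity: float, target_speed: float) -> float:
    """Hypothetical dense reward in [0, 1] for moving along +x.

    Not the library's reward: a linear ramp from 0 (no forward motion)
    to 1 (at or above the target speed).
    """
    if target_speed <= 0:
        raise ValueError("target_speed must be positive")
    return min(max(x_velocity / target_speed, 0.0), 1.0)


ANT_RUN_TARGET = 4.0   # m/s, from the MS-AntRun-v1 task description
ANT_WALK_TARGET = 0.5  # m/s, from the MS-AntWalk-v1 task description
```

Under this sketch, an Ant moving at 2 m/s would earn half reward on the run task and full reward on the walk task.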
MS-CartpoleBalance-v1
Task Card
Task Description: Use the Cartpole robot to balance a pole on a cart.
Randomizations:
Pole direction is randomized around the vertical axis within the range [-0.05, 0.05] radians.
Fail Conditions:
Pole is lower than the horizontal plane
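One way to read the fail condition above is as a check on the pole angle. The sketch below assumes the angle is measured in radians from the upright position, so the pole tip sits at height `cos(theta)` times the pole length relative to the pivot. The function name is hypothetical.

```python
import math


def pole_below_horizontal(theta: float) -> bool:
    """True when the pole tip is below the horizontal plane through the pivot.

    Assumes theta is the pole angle in radians measured from upright, so the
    tip height relative to the pivot is proportional to cos(theta).
    """
    return math.cos(theta) < 0.0
```

With this convention, any start state in the [-0.05, 0.05] radian range is well inside the non-failing region, and a pole hanging straight down (`theta = pi`) counts as failed.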
MS-CartpoleSwingUp-v1
Task Card
Task Description: Use the Cartpole robot to swing up a pole on a cart.
Randomizations:
Pole direction is randomized around the whole circle, within the range [-pi, pi] radians.
Success Conditions:
No specific success conditions. The task is considered successful if the pole is upright for the whole episode.
MS-HopperHop-v1
Task Card
Task Description: Hopper robot stays upright and moves in positive x direction with hopping motion
Randomizations:
Hopper robot is randomly rotated [-pi, pi] radians about y axis.
Hopper qpos are uniformly sampled within their allowed ranges
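The two Hopper randomizations above can be sketched together. The joint limits below are hypothetical placeholders, since this page does not list the Hopper's allowed ranges, and the function name is illustrative only.

```python
import math
import random

# Hypothetical (lower, upper) limits in radians for each joint.
HYPOTHETICAL_JOINT_LIMITS = [(-0.5, 0.5), (-2.6, 0.0), (-0.8, 0.8)]


def sample_hopper_init(joint_limits, rng):
    """Sample a y-axis rotation in [-pi, pi] and per-joint qpos within limits."""
    rotation_y = rng.uniform(-math.pi, math.pi)
    qpos = [rng.uniform(lo, hi) for lo, hi in joint_limits]
    return rotation_y, qpos


rng = random.Random(0)  # seeded for reproducibility
rotation_y, qpos = sample_hopper_init(HYPOTHETICAL_JOINT_LIMITS, rng)
```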
Success Conditions:
No specific success conditions.
MS-HopperStand-v1
Task Card
Task Description: Hopper robot stands upright
Randomizations:
Hopper robot is randomly rotated [-pi, pi] radians about y axis.
Hopper qpos are uniformly sampled within their allowed ranges
Success Conditions:
No specific success conditions.
MS-HumanoidRun-v1
Task Card
Task Description: Humanoid moves in x direction at running pace
Randomizations:
Humanoid qpos and qvel have added noise from uniform distribution [-1e-2, 1e-2]
Fail Conditions:
Humanoid robot torso link leaves z range [0.7, 1.0]
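The torso height fail condition above, which the other Humanoid tasks share, can be sketched as a per-environment check. Treating the range bounds as inclusive is an assumption, and the function names are hypothetical.

```python
TORSO_Z_RANGE = (0.7, 1.0)  # allowed torso z range from the task card


def torso_out_of_range(torso_z: float, z_range=TORSO_Z_RANGE) -> bool:
    """True when the torso height lies outside the allowed range.

    The bounds are treated as inclusive, which is an assumption.
    """
    lo, hi = z_range
    return not (lo <= torso_z <= hi)


def failed_envs(torso_heights):
    """Apply the check to one torso height per parallel environment."""
    return [torso_out_of_range(z) for z in torso_heights]
```

For three parallel environments with torso heights 0.5, 0.85, and 1.2, only the middle one would still be within range.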
MS-HumanoidStand-v1
Task Card
Task Description: Humanoid robot stands upright
Randomizations:
Humanoid robot is randomly rotated [-pi, pi] radians about z axis.
Humanoid qpos and qvel have added noise from uniform distribution [-1e-2, 1e-2]
Fail Conditions:
Humanoid robot torso link leaves z range [0.7, 1.0]
MS-HumanoidWalk-v1
Task Card
Task Description: Humanoid moves in x direction at walking pace
Randomizations:
Humanoid qpos and qvel have added noise from uniform distribution [-1e-2, 1e-2]
Fail Conditions:
Humanoid robot torso link leaves z range [0.7, 1.0]