Observations
Observations list each measurable quantity the environment can return, the configuration options for specifying which asset or sensor to observe, and a description of what the measurement represents.Observation Function | Configurable Options | Description |
---|---|---|
base_pos_z | asset_cfg.name | Root height in the simulation world frame. |
base_lin_vel | asset_cfg.name | Root linear velocity in the asset’s root frame. |
base_ang_vel | asset_cfg.name | Root angular velocity in the asset’s root frame. |
projected_gravity | asset_cfg.name | Gravity projection on the asset’s root frame. |
root_pos_w | asset_cfg.name | Asset root position in the environment frame. |
root_quat_w | make_quat_unique (bool ), asset_cfg.name | Asset root orientation (w, x, y, z) in the environment frame. |
root_lin_vel_w | asset_cfg.name | Asset root linear velocity in the environment frame. |
root_ang_vel_w | asset_cfg.name | Asset root angular velocity in the environment frame. |
body_pose_w | asset_cfg.name, body_ids (list[int] ) | Flattened body poses of the asset w.r.t. environment origin (x, y, z, qw, qx, qy, qz) for each configured body. |
projected_gravity | asset_cfg.name, body_ids (list[int] ) | Direction of gravity projected onto each body frame (x, y, z) for configured bodies. |
joint_pos | asset_cfg.name | Joint positions of the asset. |
joint_pos_rel | asset_cfg.name | Joint positions relative to default joint positions. |
joint_pos_limit_normalized | asset_cfg.name | Joint positions normalized by the asset’s soft position limits. |
joint_vel | asset_cfg.name | Joint velocities of the asset. |
joint_vel_rel | asset_cfg.name | Joint velocities relative to default joint velocities. |
joint_effort | asset_cfg.name | Joint applied effort (torque) for configured joints. |
height_scan | sensor_cfg.name, offset (float) | Height scan from the sensor’s frame minus the given offset. |
body_incoming_wrench | asset_cfg.name, body_ids (list[int] ) | 6-D spatial wrench (force, torque) applied to each configured body link by joint forces. |
imu_orientation | asset_cfg.name | IMU sensor orientation in world frame as quaternion (w, x, y, z). |
imu_ang_vel | asset_cfg.name | IMU sensor angular velocity (rad/s) in sensor frame. |
imu_lin_acc | asset_cfg.name | IMU sensor linear acceleration (m/s²) in sensor frame. |
image | sensor_cfg.name, data_type (str ), convert_perspective_to_orthogonal (bool ), normalize (bool ) | Images from a camera or ray-caster camera (e.g., “rgb”, “depth”). Optionally normalize and orthogonalize depth. |
image_features | sensor_cfg.name, data_type (str ), model_zoo_cfg (dict ), model_name (str ), model_device (str ) | Extracted image features via a pretrained encoder (e.g., ResNet, Theia). |
last_action | action_name (str or None) | The last input action term (or full action tensor if None). |
velocity_commands | params.command_name (“base_velocity”) | Generated command tensor for base‐velocity commands. |
ee_pose_commands | params.command_name (“ee_pose”) | Generated command tensor for end‐effector pose commands. |
base_pose_commands | params.command_name (“base_pose”) | Generated command tensor for base‐pose commands. |
target_object_position_in_robot_root_frame | params.command_name (“object_pose”) | Generated command tensor for the target object’s pose in the robot’s root frame. |
Robot-Specific Observation Terms
Observation Term | Robot | Configurable Options | Description |
---|---|---|---|
joint_pos_rel_digit | Agility Digit | noise.n_min, noise.n_max, params.asset_cfg.name, params.asset_cfg.joint_names | Digit-group joint positions relative to defaults (regex match). |
joint_vel_rel_digit | Agility Digit | noise.n_min, noise.n_max, params.asset_cfg.name, params.asset_cfg.joint_names | Digit-group joint velocities relative to defaults. |
joint_pos_rel_left_arm | Unitree G1 (27 DOF) | noise.n_min, noise.n_max, params.asset_cfg.name, params.asset_cfg.joint_names | Left-arm joint positions relative to defaults (explicit list). |
joint_vel_rel_left_arm | Unitree G1 (27 DOF) | noise.n_min, noise.n_max, params.asset_cfg.name, params.asset_cfg.joint_names | Left-arm joint velocities relative to defaults. |
Commands
Commands define the functions available to sample or set asset poses or velocities, along with the configurable parameters that control how those commands are generated or applied.Command Function | Configurable Options | Description |
---|---|---|
UniformPose2dCommand | asset_name, cfg.ranges.pos_x (tuple ), cfg.ranges.pos_y (tuple ), cfg.ranges.heading (tuple ), cfg.simple_heading (bool ), cfg.resampling_time_range (tuple ) | Sample a random 2D position around the environment origin and a heading (either toward target or uniform). |
TerrainBasedPose2dCommand | asset_name, cfg.ranges.heading (tuple ), cfg.simple_heading (bool ), cfg.resampling_time_range (tuple ) | Sample a random 2D position from valid terrain patches and a heading (either toward target or uniform). |
UniformPoseCommand | asset_name, body_name, cfg.ranges.pos_x (tuple ), cfg.ranges.pos_y (tuple ), cfg.ranges.pos_z (tuple ), cfg.ranges.roll (tuple ), cfg.ranges.pitch (tuple ), cfg.ranges.yaw (tuple ), cfg.make_quat_unique (bool ), cfg.resampling_time_range (tuple ) | Sample a random 3D pose (position + quaternion) in the robot’s base frame uniformly within specified ranges. |
UniformVelocityCommand | asset_name, cfg.ranges.lin_vel_x (tuple ), cfg.ranges.lin_vel_y (tuple ), cfg.ranges.ang_vel_z (tuple ), cfg.heading_command (bool ), cfg.ranges.heading (tuple ), cfg.rel_heading_envs (float ), cfg.rel_standing_envs (float ), cfg.resampling_time_range (tuple ) | Sample a random base-frame linear (x,y) and yaw velocity (either uniform or based on heading error), with optional standing/no-motion environments. |
NormalVelocityCommand | asset_name, cfg.ranges.mean_vel (tuple of 3), cfg.ranges.std_vel (tuple of 3), cfg.ranges.zero_prob (tuple of 3), cfg.rel_standing_envs (float ), cfg.resampling_time_range (tuple ) | Sample a base-frame velocity command from a normal distribution (with element-wise zeroing and optional standing). |
Actions
Actions enumerate the terms that transform high-level input vectors into specific joint or base control targets — such as positions, velocities, or torques—along with their associated configuration options.Action Term | Configurable Options | One-line Meaning |
---|---|---|
BinaryJointPositionAction | cfg.joint_names (list of joint names), cfg.open_command_expr (string ), cfg.close_command_expr (string ) | Map a binary (open/close) action to joint position targets for specified joints. |
BinaryJointVelocityAction | cfg.joint_names (list of joint names), cfg.open_command_expr (string ), cfg.close_command_expr (string ) | Map a binary (open/close) action to joint velocity targets for specified joints. |
JointPositionToLimitsAction | cfg.joint_names (list ), cfg.scale (float or dict mapping joint→scale), cfg.rescale_to_limits (bool ) | Scale raw inputs and (optionally) rescale to joint position limits before applying as position targets. |
EMAJointPositionToLimitsAction | cfg.joint_names (list ), cfg.scale (float or dict ), cfg.rescale_to_limits (bool ), cfg.alpha (float or dict mapping joint→EMA weight) | Like JointPositionToLimitsAction, but applies exponential moving average (α) over processed position targets. |
JointPositionAction | cfg.joint_names (list ), cfg.scale (float or dict ), cfg.offset (float or dict ), cfg.use_default_offset (bool ) | Affinely transform input actions (scale + offset) and send as joint position targets. |
RelativeJointPositionAction | cfg.joint_names (list ), cfg.scale (float or dict ), cfg.offset (float or dict ), cfg.use_zero_offset (bool ) | Affinely transform input and add to current joint positions before sending as position commands. |
JointVelocityAction | cfg.joint_names (list ), cfg.scale (float or dict ), cfg.offset (float or dict ), cfg.use_default_offset (bool ) | Affinely transform input and send as joint velocity targets. |
JointEffortAction | cfg.joint_names (list ), cfg.scale (float or dict ), cfg.offset (float or dict ) | Affinely transform input and send as joint effort targets. |
NonHolonomicAction | cfg.x_joint_name (str ), cfg.y_joint_name (str ), cfg.yaw_joint_name (str ), cfg.body_name (str ), cfg.scale (tuple[2] ), cfg.offset (tuple[2] ) | Map a 2D action (forward speed, yaw rate) to dummy joint velocities (x, y, yaw) for a skid-steer base. |
DifferentialInverseKinematicsAction | cfg.joint_names (list ), cfg.body_name (str ), cfg.scale (list of floats ), cfg.controller (DifferentialIK config), cfg.body_offset (optional pos/rot tuples) | Apply scaled end-effector pose commands through a differential IK controller to produce joint position targets. |
PretrainedPolicyAction | cfg.asset_name (str ), cfg.policy_path (str ), cfg.low_level_decimation (int ), cfg.low_level_actions (ActionCfg), cfg.low_level_observations (ObservationCfg ) | Use a pretrained low-level policy (e.g. locomotion) to decode actions in a hierarchical setup (e.g. navigation on top). |
Rewards
Rewards describe the available functions for assigning scalar feedback to agent behaviors, detail any options required to configure them, and explain the physical or task-related criterion they encourage or penalize.Reward Function | Configurable Options | Description |
---|---|---|
is_alive | none | +1 if the episode is still running (not terminated). |
is_terminated | none | +1 if the episode just terminated (penalizes any termination). |
is_terminated_term | term_keys (string , list of strings or regex) | Sum of specified non-timeout termination terms. |
lin_vel_z_l2 | asset_cfg.name | Penalize squared z-axis base linear velocity. |
ang_vel_xy_l2 | asset_cfg.name | Penalize squared x/y components of base angular velocity. |
flat_orientation_l2 | asset_cfg.name | Penalize deviation from a flat base orientation (projected gravity). |
base_height_l2 | target_height (float ), asset_cfg.name | Penalize squared deviation of base height from target. |
body_lin_acc_l2 | asset_cfg.name | Penalize summed L2 norm of all body linear accelerations. |
joint_torques_l2 | asset_cfg.name | Penalize squared torques on configured joints. |
joint_vel_l1 | asset_cfg.name | Penalize absolute joint velocities (L1). |
joint_vel_l2 | asset_cfg.name | Penalize squared joint velocities. |
joint_acc_l2 | asset_cfg.name | Penalize squared joint accelerations. |
joint_deviation_l1 | asset_cfg.name | Penalize absolute deviation from default joint positions. |
joint_pos_limits | asset_cfg.name | Penalize violations of soft joint-position limits. |
joint_vel_limits | soft_ratio (float ), asset_cfg.name | Penalize violations of soft joint-velocity limits (clipped). |
applied_torque_limits | asset_cfg.name | Penalize mismatch between applied and computed torques. |
action_rate_l2 | none | Penalize squared change in actions from one step to the next. |
action_l2 | none | Penalize squared magnitude of the actions themselves. |
undesired_contacts | threshold (float ), sensor_cfg.name | Penalize count of contact events whose force exceeds threshold. |
contact_forces | threshold (float ), sensor_cfg.name | Penalize amount by which contact forces exceed threshold. |
track_lin_vel_xy_exp | std (float ), command_name (str ), asset_cfg.name | Reward (exp-kernel) how well base xy-velocity matches command. |
track_ang_vel_z_exp | std (float ), command_name (str ), asset_cfg.name | Reward (exp-kernel) how well yaw velocity matches command. |
air_time | mode_time (float ), velocity_threshold (float ), asset_cfg.name (str ), sensor_cfg.name (str ), sensor_cfg.body_names (regex) | Reward for limbs airborne above the velocity threshold. |
air_time_variance_penalty | sensor_cfg.name (str ), sensor_cfg.body_names (regex) | Penalize uneven airborne time across limbs. |
base_angular_velocity | std (float ), asset_cfg.name (str ) | Reward (Gaussian) on the magnitude of base angular velocity. |
base_linear_velocity | std (float ), ramp_rate (float ), ramp_at_vel (float ), asset_cfg.name (str ) | Reward (Gaussian with ramp) on base linear velocity. |
foot_clearance | std (float ), tanh_mult (float ), target_height (float ), asset_cfg.name (str ) | Reward for foot clearance above a target height. |
base_motion | asset_cfg.name (str ) | Penalize any base movement (encourages stationarity). |
base_orientation | asset_cfg.name (str ) | Penalize deviation from upright base orientation. |
foot_slip | asset_cfg.name (str ), sensor_cfg.name (str ), sensor_cfg.body_names (regex), threshold (float ) | Penalize foot-slip events when force exceeds the threshold. |
joint_pos | asset_cfg.name (str ), stand_still_scale (float ), velocity_threshold (float ) | Penalize deviation from reference joint positions when stationary. |
joint_torques | asset_cfg.name (str ) | Penalize high joint-torque usage on configured joints. |
termination_penalty | none | Large penalty applied when the episode terminates. |
Task-Specific Reward Terms
Reward Function | Configurable Options | Description | Task Name |
---|---|---|---|
position_tracking_error | std (float ), command_name (str ) | Reward (tanh) on base position tracking accuracy. | Navigation |
orientation_tracking_error | command_name (str ) | Penalize heading error relative to the commanded orientation. | Navigation |
object_ee_distance | std (float ) | Reward (Gaussian) on the distance between the end-effector and the object. | Manipulation Lift Object |
object_is_lifted | minimal_height (float ) | Reward (binary) for the object being lifted above the minimal height threshold. | Manipulation Lift Object |
object_goal_distance | std (float ), minimal_height (float ), command_name (str ) | Reward (Gaussian) on the object’s distance to the goal pose after lifting. | Manipulation Lift Object |
track_lin_vel_xy_exp | std (float ), command_name (str ), asset_cfg.name | Reward (exp-kernel) how well base xy-velocity matches command. | Velocity Tracking |
track_ang_vel_z_exp | std (float ), command_name (str ), asset_cfg.name | Reward (exp-kernel) how well yaw velocity matches command. | Velocity Tracking |
track_orientation_inv_l2 | object_cfg (SceneEntityCfg("object") ), rot_eps (float ), command_name (str ) | Reward (inverse-L2) encouraging alignment of object orientation with the commanded pose. | Manipulation In-Hand |
success_bonus_inhand | object_cfg (SceneEntityCfg("object") ), command_name (str ) | Bonus reward upon successful completion of the in-hand manipulation task. | Manipulation In-Hand |
Robot-Specific Reward Terms
Reward Function | Robot | Configurable Options | Description |
---|---|---|---|
left_wrist_end_effector_position_tracking | Unitree G1 (27 DOF) | asset_cfg.name, body_names, command_name, shift | Penalize deviation of the left-wrist end-effector position from the commanded target. |
left_wrist_end_effector_position_tracking_fine_grained | Unitree G1 (27 DOF) | asset_cfg.name, body_names, std, command_name, shift | Reward (tanh) on fine-grained left-wrist position tracking accuracy. |
left_wrist_end_effector_orientation_tracking | Unitree G1 (27 DOF) | asset_cfg.name, body_names, command_name, tol | Penalize orientation error of the left-wrist end-effector relative to the command. |
left_hand_joint_range_violation | Unitree G1 (27 DOF) | asset_cfg.name, joint_names, limits_min, limits_max | Penalize any left-hand joints outside the specified minimum/maximum limits. |
Events
Events list environment-level operations that introduce randomized or deterministic changes — such as randomizing physical properties or applying external forces—along with the parameters needed to specify how those changes are applied.Event Function | Configurable Options | Description |
---|---|---|
randomize_rigid_body_scale | env_ids, scale_range (tuple[float, float] or dict{x, y, z → tuple[float, float]} ), asset_cfg.name, relative_child_path (optional) | Randomize the USD-scale of a rigid‐body asset (or a specified child prim) before simulation starts. |
randomize_robot_body_material | env_ids, static_friction_range (tuple[float, float] ), dynamic_friction_range (tuple[float, float] ), restitution_range (tuple[float, float] ), num_buckets (int), asset_cfg.name, make_consistent (bool ) | Sample a set of friction/restitution “buckets” and assign them to each geometry of the specified robot Articulation. |
randomize_object_body_material | env_ids, static_friction_range (tuple[float, float] ), dynamic_friction_range (tuple[float, float] ), restitution_range (tuple[float, float] ), num_buckets (int), asset_cfg.name, make_consistent (bool ) | Sample a set of friction/restitution “buckets” and assign them to each geometry of the specified RigidObject. |
randomize_rigid_body_mass | env_ids, asset_cfg.name, mass_distribution_params (tuple[float, float] ), operation (“add”/“scale”/“abs”), distribution (“uniform”/“log_uniform”/“gaussian”), recompute_inertia (bool ) | Randomize per‐body mass values (add/scale/overwrite) and optionally recompute inertia tensors from default mass. |
randomize_rigid_body_com | env_ids, com_range (dict{x, y, z → tuple[float, float]} ), asset_cfg.name | Add a random offset to the center of mass of each body in the specified Articulation. |
randomize_rigid_body_collider_offsets | env_ids, rest_offset_distribution_params (tuple[float, float] ), contact_offset_distribution_params (tuple[float, float] ), asset_cfg.name, distribution (“uniform”/“log_uniform”/“gaussian”) | Randomize physics collider “rest” and/or “contact” offsets for each body shape of the specified asset. |
randomize_physics_scene_gravity | env_ids, gravity_distribution_params (tuple[list[float], list[float]] ), operation (“add”/“scale”/“abs”), distribution (“uniform”/“log_uniform”/“gaussian”) | Randomize the global gravity vector (x, y, z) for the entire physics scene (same gravity for all envs). |
randomize_actuator_gains | env_ids, asset_cfg.name, stiffness_distribution_params (tuple[float, float] ), damping_distribution_params (tuple[float, float] ), operation, distribution | Randomize stiffness and/or damping gains of actuators (implicit or explicit) on specified joint IDs. |
randomize_joint_parameters | env_ids, asset_cfg.name, friction_distribution_params (tuple[float, float] ), armature_distribution_params (tuple[float, float] ), lower_limit_distribution_params (tuple[float, float] ), upper_limit_distribution_params (tuple[float, float] ), operation, distribution | Randomize joint physics properties: friction, armature, and position limits (lower/upper) by adding/scaling/overwriting. |
randomize_fixed_tendon_parameters | env_ids, asset_cfg.name, stiffness_distribution_params (tuple[float, float] ), damping_distribution_params (tuple[float, float] ), limit_stiffness_distribution_params (tuple[float, float] ), lower_limit_distribution_params (tuple[float, float] ), upper_limit_distribution_params (tuple[float, float] ), rest_length_distribution_params (tuple[float, float] ), offset_distribution_params (tuple[float, float] ), operation, distribution | Randomize fixed‐tendon stiffness, damping, limit stiffness, lower/upper position limits, rest length, and/or offset. |
apply_external_force_torque | env_ids, force_range (tuple[float, float] ), torque_range (tuple[float, float] ), asset_cfg.name | Sample and buffer random forces and torques to apply to each body of the specified asset (effective on next write). |
push_by_setting_velocity | env_ids, velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]} ), asset_cfg.name | Sample a random root‐velocity (linear + angular) and set it directly on the specified asset to simulate a push. |
reset_base | env_ids, pose_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]} ), velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]} ), asset_cfg.name | Randomize root pose of the robot (translation + rotation) uniformly around default and root velocity, then write to simulation. |
reset_root_state | env_ids, pose_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]} ), velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]} ), asset_cfg.name | Randomize root pose (translation + rotation) of an object uniformly around default and root velocity, then write to simulation. |
reset_root_state_with_random_orientation | env_ids, pose_range (dict{x, y, z → tuple[float, float]} ), velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]} ), asset_cfg.name | Randomize root position uniformly, sample orientation uniformly from SO(3), and set velocity within given ranges. |
reset_root_state_from_terrain | env_ids, pose_range (dict{roll, pitch, yaw → tuple[float, float]} ), velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]} ), asset_cfg.name | Sample a valid terrain patch (“init_pos”), set root pose on terrain + random yaw/pitch/roll, then set velocity. |
reset_joints_by_scale | env_ids, position_range (tuple[float, float] ), velocity_range (tuple[float, float] ), asset_cfg.name | Scale default joint positions and velocities by random factors (within ranges) and clamp to soft limits before writing. |
reset_joints_by_offset | env_ids, position_range (tuple[float, float] ), velocity_range (tuple[float, float] ), asset_cfg.name | Offset default joint positions and velocities by random values (within ranges) and clamp to soft limits before writing. |
reset_nodal_state_uniform | env_ids, position_range (dict{x, y, z → tuple[float, float]} ), velocity_range (dict{x, y, z → tuple[float, float]} ), asset_cfg.name | Randomize default nodal (vertex) positions and velocities of a DeformableObject uniformly within given ranges. |
reset_scene_to_default | env_ids | Restore all rigid objects, articulations (root pose + joint state), and deformables (nodal state) to their default values. |
randomize_visual_texture_material | env_ids, event_name (str ), asset_cfg.name, texture_paths (list[str] ), texture_rotation (tuple[float, float] ) | Trigger a Replicator‐based event that randomizes textures on each body’s “/visuals” prim under the specified asset. |
randomize_visual_color | env_ids, event_name (str ), asset_cfg.name, colors (list[tuple[float, float, float]] or dict{r, g, b → tuple[float, float]} ), mesh_name (str ) | Trigger a Replicator‐based event that randomizes color on the specified mesh prims under the asset. |
Terminations
Terminations specify the conditions under which an episode ends, the configuration options that define threshold values or monitored signals, and a brief explanation of what triggers each termination.Termination Function | Configurable Options | Description |
---|---|---|
time_out | none | Terminate when episode length ≥ max_episode_length. |
command_resample | command_name (str ), num_resamples (int ) | Terminate when a command has been resampled the specified number of times. |
bad_orientation | limit_angle (float), asset_cfg.name | Terminate if the asset’s orientation deviates from vertical by > limit_angle. |
root_height_below_minimum | minimum_height (float ), asset_cfg.name | Terminate if the asset’s root height drops below the minimum height. |
joint_pos_out_of_limit | asset_cfg.name | Terminate if any joint position exceeds the soft position limits. |
joint_pos_out_of_manual_limit | bounds (tuple[min, max] ), asset_cfg.name | Terminate if any joint position falls outside the user-specified bounds. |
joint_vel_out_of_limit | asset_cfg.name | Terminate if any joint velocity exceeds the soft velocity limits. |
joint_vel_out_of_manual_limit | max_velocity (float ), asset_cfg.name | Terminate if any joint velocity exceeds the given max_velocity. |
joint_effort_out_of_limit | asset_cfg.name | Terminate if any joint’s computed torque equals its applied torque (indicating effort clipping). |
base_contact | threshold (float ), sensor_cfg.name | Terminate if any contact force at the sensor exceeds the threshold. |
Robot-Specific Terrmination Terms
Termination Function | Robot | Configurable Options | Description |
---|---|---|---|
base_contact_g1 | Unitree G1 (27 DOF) | sensor_cfg.name (str ), sensor_cfg.body_names (list[str] ), threshold (float ) | Terminate episode if any specified non-foot link (e.g. pelvis, torso, waist, shoulders) registers contact force above the threshold. |
Task-Specific Terrmination Terms
Termination Function | Configurable Options | Description | Task Name |
---|---|---|---|
object_out_of_reach | asset_cfg (SceneEntityCfg("object") ), threshold (float ) | Terminate when the object moves farther than the specified threshold distance from the robot. | Manipulation In-Hand |
max_consecutive_success | num_success (int ), command_name (str ) | Terminate once the in-hand manipulation has succeeded the given number of times in a row. | Manipulation In-Hand |