Observations
Observations list each measurable quantity the environment can return, the configuration options for specifying which asset or sensor to observe, and a description of what the measurement represents.| Observation Function | Configurable Options | Description |
|---|---|---|
| base_pos_z | asset_cfg.name | Root height in the simulation world frame. |
| base_lin_vel | asset_cfg.name | Root linear velocity in the asset’s root frame. |
| base_ang_vel | asset_cfg.name | Root angular velocity in the asset’s root frame. |
| projected_gravity | asset_cfg.name | Gravity projection on the asset’s root frame. |
| root_pos_w | asset_cfg.name | Asset root position in the environment frame. |
| root_quat_w | make_quat_unique (bool), asset_cfg.name | Asset root orientation (w, x, y, z) in the environment frame. |
| root_lin_vel_w | asset_cfg.name | Asset root linear velocity in the environment frame. |
| root_ang_vel_w | asset_cfg.name | Asset root angular velocity in the environment frame. |
| body_pose_w | asset_cfg.name, body_ids (list[int]) | Flattened body poses of the asset w.r.t. environment origin (x, y, z, qw, qx, qy, qz) for each configured body. |
| projected_gravity | asset_cfg.name, body_ids (list[int]) | Direction of gravity projected onto each body frame (x, y, z) for configured bodies. |
| joint_pos | asset_cfg.name | Joint positions of the asset. |
| joint_pos_rel | asset_cfg.name | Joint positions relative to default joint positions. |
| joint_pos_limit_normalized | asset_cfg.name | Joint positions normalized by the asset’s soft position limits. |
| joint_vel | asset_cfg.name | Joint velocities of the asset. |
| joint_vel_rel | asset_cfg.name | Joint velocities relative to default joint velocities. |
| joint_effort | asset_cfg.name | Joint applied effort (torque) for configured joints. |
| height_scan | sensor_cfg.name, offset (float) | Height scan from the sensor’s frame minus the given offset. |
| body_incoming_wrench | asset_cfg.name, body_ids (list[int]) | 6-D spatial wrench (force, torque) applied to each configured body link by joint forces. |
| imu_orientation | asset_cfg.name | IMU sensor orientation in world frame as quaternion (w, x, y, z). |
| imu_ang_vel | asset_cfg.name | IMU sensor angular velocity (rad/s) in sensor frame. |
| imu_lin_acc | asset_cfg.name | IMU sensor linear acceleration (m/s²) in sensor frame. |
| image | sensor_cfg.name, data_type (str), convert_perspective_to_orthogonal (bool), normalize (bool) | Images from a camera or ray-caster camera (e.g., “rgb”, “depth”). Optionally normalize and orthogonalize depth. |
| image_features | sensor_cfg.name, data_type (str), model_zoo_cfg (dict), model_name (str), model_device (str) | Extracted image features via a pretrained encoder (e.g., ResNet, Theia). |
| last_action | action_name (str or None) | The last input action term (or full action tensor if None). |
| velocity_commands | params.command_name (“base_velocity”) | Generated command tensor for base‐velocity commands. |
| ee_pose_commands | params.command_name (“ee_pose”) | Generated command tensor for end‐effector pose commands. |
| base_pose_commands | params.command_name (“base_pose”) | Generated command tensor for base‐pose commands. |
| target_object_position_in_robot_root_frame | params.command_name (“object_pose”) | Generated command tensor for the target object’s pose in the robot’s root frame. |
Robot-Specific Observation Terms
| Observation Term | Robot | Configurable Options | Description |
|---|---|---|---|
| joint_pos_rel_digit | Agility Digit | noise.n_min, noise.n_max, params.asset_cfg.name, params.asset_cfg.joint_names | Digit-group joint positions relative to defaults (regex match). |
| joint_vel_rel_digit | Agility Digit | noise.n_min, noise.n_max, params.asset_cfg.name, params.asset_cfg.joint_names | Digit-group joint velocities relative to defaults. |
| joint_pos_rel_left_arm | Unitree G1 (27 DOF) | noise.n_min, noise.n_max, params.asset_cfg.name, params.asset_cfg.joint_names | Left-arm joint positions relative to defaults (explicit list). |
| joint_vel_rel_left_arm | Unitree G1 (27 DOF) | noise.n_min, noise.n_max, params.asset_cfg.name, params.asset_cfg.joint_names | Left-arm joint velocities relative to defaults. |
Commands
Commands define the functions available to sample or set asset poses or velocities, along with the configurable parameters that control how those commands are generated or applied.| Command Function | Configurable Options | Description |
|---|---|---|
| UniformPose2dCommand | asset_name, cfg.ranges.pos_x (tuple), cfg.ranges.pos_y (tuple), cfg.ranges.heading (tuple), cfg.simple_heading (bool), cfg.resampling_time_range (tuple) | Sample a random 2D position around the environment origin and a heading (either toward target or uniform). |
| TerrainBasedPose2dCommand | asset_name, cfg.ranges.heading (tuple), cfg.simple_heading (bool), cfg.resampling_time_range (tuple) | Sample a random 2D position from valid terrain patches and a heading (either toward target or uniform). |
| UniformPoseCommand | asset_name, body_name, cfg.ranges.pos_x (tuple), cfg.ranges.pos_y (tuple), cfg.ranges.pos_z (tuple), cfg.ranges.roll (tuple), cfg.ranges.pitch (tuple), cfg.ranges.yaw (tuple), cfg.make_quat_unique (bool), cfg.resampling_time_range (tuple) | Sample a random 3D pose (position + quaternion) in the robot’s base frame uniformly within specified ranges. |
| UniformVelocityCommand | asset_name, cfg.ranges.lin_vel_x (tuple), cfg.ranges.lin_vel_y (tuple), cfg.ranges.ang_vel_z (tuple), cfg.heading_command (bool), cfg.ranges.heading (tuple), cfg.rel_heading_envs (float), cfg.rel_standing_envs (float), cfg.resampling_time_range (tuple) | Sample a random base-frame linear (x,y) and yaw velocity (either uniform or based on heading error), with optional standing/no-motion environments. |
| NormalVelocityCommand | asset_name, cfg.ranges.mean_vel (tuple of 3), cfg.ranges.std_vel (tuple of 3), cfg.ranges.zero_prob (tuple of 3), cfg.rel_standing_envs (float), cfg.resampling_time_range (tuple) | Sample a base-frame velocity command from a normal distribution (with element-wise zeroing and optional standing). |
Actions
Actions enumerate the terms that transform high-level input vectors into specific joint or base control targets — such as positions, velocities, or torques—along with their associated configuration options.| Action Term | Configurable Options | One-line Meaning |
|---|---|---|
| BinaryJointPositionAction | cfg.joint_names (list of joint names), cfg.open_command_expr (string), cfg.close_command_expr (string) | Map a binary (open/close) action to joint position targets for specified joints. |
| BinaryJointVelocityAction | cfg.joint_names (list of joint names), cfg.open_command_expr (string), cfg.close_command_expr (string) | Map a binary (open/close) action to joint velocity targets for specified joints. |
| JointPositionToLimitsAction | cfg.joint_names (list), cfg.scale (float or dict mapping joint→scale), cfg.rescale_to_limits (bool) | Scale raw inputs and (optionally) rescale to joint position limits before applying as position targets. |
| EMAJointPositionToLimitsAction | cfg.joint_names (list), cfg.scale (float or dict), cfg.rescale_to_limits (bool), cfg.alpha (float or dict mapping joint→EMA weight) | Like JointPositionToLimitsAction, but applies exponential moving average (α) over processed position targets. |
| JointPositionAction | cfg.joint_names (list), cfg.scale (float or dict), cfg.offset (float or dict), cfg.use_default_offset (bool) | Affinely transform input actions (scale + offset) and send as joint position targets. |
| RelativeJointPositionAction | cfg.joint_names (list), cfg.scale (float or dict), cfg.offset (float or dict), cfg.use_zero_offset (bool) | Affinely transform input and add to current joint positions before sending as position commands. |
| JointVelocityAction | cfg.joint_names (list), cfg.scale (float or dict), cfg.offset (float or dict), cfg.use_default_offset (bool) | Affinely transform input and send as joint velocity targets. |
| JointEffortAction | cfg.joint_names (list), cfg.scale (float or dict), cfg.offset (float or dict) | Affinely transform input and send as joint effort targets. |
| NonHolonomicAction | cfg.x_joint_name (str), cfg.y_joint_name (str), cfg.yaw_joint_name (str), cfg.body_name (str), cfg.scale (tuple[2]), cfg.offset (tuple[2]) | Map a 2D action (forward speed, yaw rate) to dummy joint velocities (x, y, yaw) for a skid-steer base. |
| DifferentialInverseKinematicsAction | cfg.joint_names (list), cfg.body_name (str), cfg.scale (list of floats), cfg.controller (DifferentialIK config), cfg.body_offset (optional pos/rot tuples) | Apply scaled end-effector pose commands through a differential IK controller to produce joint position targets. |
| PretrainedPolicyAction | cfg.asset_name (str), cfg.policy_path (str), cfg.low_level_decimation (int), cfg.low_level_actions (ActionCfg), cfg.low_level_observations (ObservationCfg) | Use a pretrained low-level policy (e.g. locomotion) to decode actions in a hierarchical setup (e.g. navigation on top). |
Rewards
Rewards describe the available functions for assigning scalar feedback to agent behaviors, detail any options required to configure them, and explain the physical or task-related criterion they encourage or penalize.| Reward Function | Configurable Options | Description |
|---|---|---|
| is_alive | none | +1 if the episode is still running (not terminated). |
| is_terminated | none | +1 if the episode just terminated (penalizes any termination). |
| is_terminated_term | term_keys (string, list of strings or regex) | Sum of specified non-timeout termination terms. |
| lin_vel_z_l2 | asset_cfg.name | Penalize squared z-axis base linear velocity. |
| ang_vel_xy_l2 | asset_cfg.name | Penalize squared x/y components of base angular velocity. |
| flat_orientation_l2 | asset_cfg.name | Penalize deviation from a flat base orientation (projected gravity). |
| base_height_l2 | target_height (float), asset_cfg.name | Penalize squared deviation of base height from target. |
| body_lin_acc_l2 | asset_cfg.name | Penalize summed L2 norm of all body linear accelerations. |
| joint_torques_l2 | asset_cfg.name | Penalize squared torques on configured joints. |
| joint_vel_l1 | asset_cfg.name | Penalize absolute joint velocities (L1). |
| joint_vel_l2 | asset_cfg.name | Penalize squared joint velocities. |
| joint_acc_l2 | asset_cfg.name | Penalize squared joint accelerations. |
| joint_deviation_l1 | asset_cfg.name | Penalize absolute deviation from default joint positions. |
| joint_pos_limits | asset_cfg.name | Penalize violations of soft joint-position limits. |
| joint_vel_limits | soft_ratio (float), asset_cfg.name | Penalize violations of soft joint-velocity limits (clipped). |
| applied_torque_limits | asset_cfg.name | Penalize mismatch between applied and computed torques. |
| action_rate_l2 | none | Penalize squared change in actions from one step to the next. |
| action_l2 | none | Penalize squared magnitude of the actions themselves. |
| undesired_contacts | threshold (float), sensor_cfg.name | Penalize count of contact events whose force exceeds threshold. |
| contact_forces | threshold (float), sensor_cfg.name | Penalize amount by which contact forces exceed threshold. |
| track_lin_vel_xy_exp | std (float), command_name (str), asset_cfg.name | Reward (exp-kernel) how well base xy-velocity matches command. |
| track_ang_vel_z_exp | std (float), command_name (str), asset_cfg.name | Reward (exp-kernel) how well yaw velocity matches command. |
| air_time | mode_time (float), velocity_threshold (float), asset_cfg.name (str), sensor_cfg.name (str), sensor_cfg.body_names (regex) | Reward for limbs airborne above the velocity threshold. |
| air_time_variance_penalty | sensor_cfg.name (str), sensor_cfg.body_names (regex) | Penalize uneven airborne time across limbs. |
| base_angular_velocity | std (float), asset_cfg.name (str) | Reward (Gaussian) on the magnitude of base angular velocity. |
| base_linear_velocity | std (float), ramp_rate (float), ramp_at_vel (float), asset_cfg.name (str) | Reward (Gaussian with ramp) on base linear velocity. |
| foot_clearance | std (float), tanh_mult (float), target_height (float), asset_cfg.name (str) | Reward for foot clearance above a target height. |
| base_motion | asset_cfg.name (str) | Penalize any base movement (encourages stationarity). |
| base_orientation | asset_cfg.name (str) | Penalize deviation from upright base orientation. |
| foot_slip | asset_cfg.name (str), sensor_cfg.name (str), sensor_cfg.body_names (regex), threshold (float) | Penalize foot-slip events when force exceeds the threshold. |
| joint_pos | asset_cfg.name (str), stand_still_scale (float), velocity_threshold (float) | Penalize deviation from reference joint positions when stationary. |
| joint_torques | asset_cfg.name (str) | Penalize high joint-torque usage on configured joints. |
| termination_penalty | none | Large penalty applied when the episode terminates. |
Task-Specific Reward Terms
| Reward Function | Configurable Options | Description | Task Name |
|---|---|---|---|
| position_tracking_error | std (float), command_name (str) | Reward (tanh) on base position tracking accuracy. | Navigation |
| orientation_tracking_error | command_name (str) | Penalize heading error relative to the commanded orientation. | Navigation |
| object_ee_distance | std (float) | Reward (Gaussian) on the distance between the end-effector and the object. | Manipulation Lift Object |
| object_is_lifted | minimal_height (float) | Reward (binary) for the object being lifted above the minimal height threshold. | Manipulation Lift Object |
| object_goal_distance | std (float), minimal_height (float), command_name (str) | Reward (Gaussian) on the object’s distance to the goal pose after lifting. | Manipulation Lift Object |
| track_lin_vel_xy_exp | std (float), command_name (str), asset_cfg.name | Reward (exp-kernel) how well base xy-velocity matches command. | Velocity Tracking |
| track_ang_vel_z_exp | std (float), command_name (str), asset_cfg.name | Reward (exp-kernel) how well yaw velocity matches command. | Velocity Tracking |
| track_orientation_inv_l2 | object_cfg (SceneEntityCfg("object")), rot_eps (float), command_name (str) | Reward (inverse-L2) encouraging alignment of object orientation with the commanded pose. | Manipulation In-Hand |
| success_bonus_inhand | object_cfg (SceneEntityCfg("object")), command_name (str) | Bonus reward upon successful completion of the in-hand manipulation task. | Manipulation In-Hand |
Robot-Specific Reward Terms
| Reward Function | Robot | Configurable Options | Description |
|---|---|---|---|
| left_wrist_end_effector_position_tracking | Unitree G1 (27 DOF) | asset_cfg.name, body_names, command_name, shift | Penalize deviation of the left-wrist end-effector position from the commanded target. |
| left_wrist_end_effector_position_tracking_fine_grained | Unitree G1 (27 DOF) | asset_cfg.name, body_names, std, command_name, shift | Reward (tanh) on fine-grained left-wrist position tracking accuracy. |
| left_wrist_end_effector_orientation_tracking | Unitree G1 (27 DOF) | asset_cfg.name, body_names, command_name, tol | Penalize orientation error of the left-wrist end-effector relative to the command. |
| left_hand_joint_range_violation | Unitree G1 (27 DOF) | asset_cfg.name, joint_names, limits_min, limits_max | Penalize any left-hand joints outside the specified minimum/maximum limits. |
Events
Events list environment-level operations that introduce randomized or deterministic changes — such as randomizing physical properties or applying external forces—along with the parameters needed to specify how those changes are applied.| Event Function | Configurable Options | Description |
|---|---|---|
| randomize_rigid_body_scale | env_ids, scale_range (tuple[float, float] or dict{x, y, z → tuple[float, float]}), asset_cfg.name, relative_child_path (optional) | Randomize the USD-scale of a rigid‐body asset (or a specified child prim) before simulation starts. |
| randomize_robot_body_material | env_ids, static_friction_range (tuple[float, float]), dynamic_friction_range (tuple[float, float]), restitution_range (tuple[float, float]), num_buckets (int), asset_cfg.name, make_consistent (bool) | Sample a set of friction/restitution “buckets” and assign them to each geometry of the specified robot Articulation. |
| randomize_object_body_material | env_ids, static_friction_range (tuple[float, float]), dynamic_friction_range (tuple[float, float]), restitution_range (tuple[float, float]), num_buckets (int), asset_cfg.name, make_consistent (bool) | Sample a set of friction/restitution “buckets” and assign them to each geometry of the specified RigidObject. |
| randomize_rigid_body_mass | env_ids, asset_cfg.name, mass_distribution_params (tuple[float, float]), operation (“add”/“scale”/“abs”), distribution (“uniform”/“log_uniform”/“gaussian”), recompute_inertia (bool) | Randomize per‐body mass values (add/scale/overwrite) and optionally recompute inertia tensors from default mass. |
| randomize_rigid_body_com | env_ids, com_range (dict{x, y, z → tuple[float, float]}), asset_cfg.name | Add a random offset to the center of mass of each body in the specified Articulation. |
| randomize_rigid_body_collider_offsets | env_ids, rest_offset_distribution_params (tuple[float, float]), contact_offset_distribution_params (tuple[float, float]), asset_cfg.name, distribution (“uniform”/“log_uniform”/“gaussian”) | Randomize physics collider “rest” and/or “contact” offsets for each body shape of the specified asset. |
| randomize_physics_scene_gravity | env_ids, gravity_distribution_params (tuple[list[float], list[float]]), operation (“add”/“scale”/“abs”), distribution (“uniform”/“log_uniform”/“gaussian”) | Randomize the global gravity vector (x, y, z) for the entire physics scene (same gravity for all envs). |
| randomize_actuator_gains | env_ids, asset_cfg.name, stiffness_distribution_params (tuple[float, float]), damping_distribution_params (tuple[float, float]), operation, distribution | Randomize stiffness and/or damping gains of actuators (implicit or explicit) on specified joint IDs. |
| randomize_joint_parameters | env_ids, asset_cfg.name, friction_distribution_params (tuple[float, float]), armature_distribution_params (tuple[float, float]), lower_limit_distribution_params (tuple[float, float]), upper_limit_distribution_params (tuple[float, float]), operation, distribution | Randomize joint physics properties: friction, armature, and position limits (lower/upper) by adding/scaling/overwriting. |
| randomize_fixed_tendon_parameters | env_ids, asset_cfg.name, stiffness_distribution_params (tuple[float, float]), damping_distribution_params (tuple[float, float]), limit_stiffness_distribution_params (tuple[float, float]), lower_limit_distribution_params (tuple[float, float]), upper_limit_distribution_params (tuple[float, float]), rest_length_distribution_params (tuple[float, float]), offset_distribution_params (tuple[float, float]), operation, distribution | Randomize fixed‐tendon stiffness, damping, limit stiffness, lower/upper position limits, rest length, and/or offset. |
| apply_external_force_torque | env_ids, force_range (tuple[float, float]), torque_range (tuple[float, float]), asset_cfg.name | Sample and buffer random forces and torques to apply to each body of the specified asset (effective on next write). |
| push_by_setting_velocity | env_ids, velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), asset_cfg.name | Sample a random root‐velocity (linear + angular) and set it directly on the specified asset to simulate a push. |
| reset_base | env_ids, pose_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), asset_cfg.name | Randomize root pose of the robot (translation + rotation) uniformly around default and root velocity, then write to simulation. |
| reset_root_state | env_ids, pose_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), asset_cfg.name | Randomize root pose (translation + rotation) of an object uniformly around default and root velocity, then write to simulation. |
| reset_root_state_with_random_orientation | env_ids, pose_range (dict{x, y, z → tuple[float, float]}), velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), asset_cfg.name | Randomize root position uniformly, sample orientation uniformly from SO(3), and set velocity within given ranges. |
| reset_root_state_from_terrain | env_ids, pose_range (dict{roll, pitch, yaw → tuple[float, float]}), velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), asset_cfg.name | Sample a valid terrain patch (“init_pos”), set root pose on terrain + random yaw/pitch/roll, then set velocity. |
| reset_joints_by_scale | env_ids, position_range (tuple[float, float]), velocity_range (tuple[float, float]), asset_cfg.name | Scale default joint positions and velocities by random factors (within ranges) and clamp to soft limits before writing. |
| reset_joints_by_offset | env_ids, position_range (tuple[float, float]), velocity_range (tuple[float, float]), asset_cfg.name | Offset default joint positions and velocities by random values (within ranges) and clamp to soft limits before writing. |
| reset_nodal_state_uniform | env_ids, position_range (dict{x, y, z → tuple[float, float]}), velocity_range (dict{x, y, z → tuple[float, float]}), asset_cfg.name | Randomize default nodal (vertex) positions and velocities of a DeformableObject uniformly within given ranges. |
| reset_scene_to_default | env_ids | Restore all rigid objects, articulations (root pose + joint state), and deformables (nodal state) to their default values. |
| randomize_visual_texture_material | env_ids, event_name (str), asset_cfg.name, texture_paths (list[str]), texture_rotation (tuple[float, float]) | Trigger a Replicator‐based event that randomizes textures on each body’s “/visuals” prim under the specified asset. |
| randomize_visual_color | env_ids, event_name (str), asset_cfg.name, colors (list[tuple[float, float, float]] or dict{r, g, b → tuple[float, float]}), mesh_name (str) | Trigger a Replicator‐based event that randomizes color on the specified mesh prims under the asset. |
Terminations
Terminations specify the conditions under which an episode ends, the configuration options that define threshold values or monitored signals, and a brief explanation of what triggers each termination.| Termination Function | Configurable Options | Description |
|---|---|---|
| time_out | none | Terminate when episode length ≥ max_episode_length. |
| command_resample | command_name (str), num_resamples (int) | Terminate when a command has been resampled the specified number of times. |
| bad_orientation | limit_angle (float), asset_cfg.name | Terminate if the asset’s orientation deviates from vertical by > limit_angle. |
| root_height_below_minimum | minimum_height (float), asset_cfg.name | Terminate if the asset’s root height drops below the minimum height. |
| joint_pos_out_of_limit | asset_cfg.name | Terminate if any joint position exceeds the soft position limits. |
| joint_pos_out_of_manual_limit | bounds (tuple[min, max]), asset_cfg.name | Terminate if any joint position falls outside the user-specified bounds. |
| joint_vel_out_of_limit | asset_cfg.name | Terminate if any joint velocity exceeds the soft velocity limits. |
| joint_vel_out_of_manual_limit | max_velocity (float), asset_cfg.name | Terminate if any joint velocity exceeds the given max_velocity. |
| joint_effort_out_of_limit | asset_cfg.name | Terminate if any joint’s computed torque equals its applied torque (indicating effort clipping). |
| base_contact | threshold (float), sensor_cfg.name | Terminate if any contact force at the sensor exceeds the threshold. |
Robot-Specific Terrmination Terms
| Termination Function | Robot | Configurable Options | Description |
|---|---|---|---|
| base_contact_g1 | Unitree G1 (27 DOF) | sensor_cfg.name (str), sensor_cfg.body_names (list[str]), threshold (float) | Terminate episode if any specified non-foot link (e.g. pelvis, torso, waist, shoulders) registers contact force above the threshold. |
Task-Specific Terrmination Terms
| Termination Function | Configurable Options | Description | Task Name |
|---|---|---|---|
| object_out_of_reach | asset_cfg (SceneEntityCfg("object")), threshold (float) | Terminate when the object moves farther than the specified threshold distance from the robot. | Manipulation In-Hand |
| max_consecutive_success | num_success (int), command_name (str) | Terminate once the in-hand manipulation has succeeded the given number of times in a row. | Manipulation In-Hand |