Users can customize the core MDP components — observations, commands, actions, rewards, events, and terminations — when training RL agents using GRID. By configuring these elements, you can tailor environment feedback and control interfaces to fit specific robotics tasks and training objectives.

Observations

Observations list each measurable quantity the environment can return, the configuration options for specifying which asset or sensor to observe, and a description of what the measurement represents.
Observation FunctionConfigurable OptionsDescription
base_pos_zasset_cfg.nameRoot height in the simulation world frame.
base_lin_velasset_cfg.nameRoot linear velocity in the asset’s root frame.
base_ang_velasset_cfg.nameRoot angular velocity in the asset’s root frame.
projected_gravityasset_cfg.nameGravity projection on the asset’s root frame.
root_pos_wasset_cfg.nameAsset root position in the environment frame.
root_quat_wmake_quat_unique (bool), asset_cfg.nameAsset root orientation (w, x, y, z) in the environment frame.
root_lin_vel_wasset_cfg.nameAsset root linear velocity in the environment frame.
root_ang_vel_wasset_cfg.nameAsset root angular velocity in the environment frame.
body_pose_wasset_cfg.name, body_ids (list[int])Flattened body poses of the asset w.r.t. environment origin (x, y, z, qw, qx, qy, qz) for each configured body.
projected_gravityasset_cfg.name, body_ids (list[int])Direction of gravity projected onto each body frame (x, y, z) for configured bodies.
joint_posasset_cfg.nameJoint positions of the asset.
joint_pos_relasset_cfg.nameJoint positions relative to default joint positions.
joint_pos_limit_normalizedasset_cfg.nameJoint positions normalized by the asset’s soft position limits.
joint_velasset_cfg.nameJoint velocities of the asset.
joint_vel_relasset_cfg.nameJoint velocities relative to default joint velocities.
joint_effortasset_cfg.nameJoint applied effort (torque) for configured joints.
height_scansensor_cfg.name, offset (float)Height scan from the sensor’s frame minus the given offset.
body_incoming_wrenchasset_cfg.name, body_ids (list[int])6-D spatial wrench (force, torque) applied to each configured body link by joint forces.
imu_orientationasset_cfg.nameIMU sensor orientation in world frame as quaternion (w, x, y, z).
imu_ang_velasset_cfg.nameIMU sensor angular velocity (rad/s) in sensor frame.
imu_lin_accasset_cfg.nameIMU sensor linear acceleration (m/s²) in sensor frame.
imagesensor_cfg.name, data_type (str), convert_perspective_to_orthogonal (bool), normalize (bool)Images from a camera or ray-caster camera (e.g., “rgb”, “depth”). Optionally normalize and orthogonalize depth.
image_featuressensor_cfg.name, data_type (str), model_zoo_cfg (dict), model_name (str), model_device (str)Extracted image features via a pretrained encoder (e.g., ResNet, Theia).
last_actionaction_name (str or None)The last input action term (or full action tensor if None).
velocity_commandsparams.command_name (“base_velocity”)Generated command tensor for base‐velocity commands.
ee_pose_commandsparams.command_name (“ee_pose”)Generated command tensor for end‐effector pose commands.
base_pose_commandsparams.command_name (“base_pose”)Generated command tensor for base‐pose commands.
target_object_position_in_robot_root_frameparams.command_name (“object_pose”)Generated command tensor for the target object’s pose in the robot’s root frame.

Robot-Specific Observation Terms

Observation TermRobotConfigurable OptionsDescription
joint_pos_rel_digitAgility Digitnoise.n_min, noise.n_max, params.asset_cfg.name, params.asset_cfg.joint_namesDigit-group joint positions relative to defaults (regex match).
joint_vel_rel_digitAgility Digitnoise.n_min, noise.n_max, params.asset_cfg.name, params.asset_cfg.joint_namesDigit-group joint velocities relative to defaults.
joint_pos_rel_left_armUnitree G1 (27 DOF)noise.n_min, noise.n_max, params.asset_cfg.name, params.asset_cfg.joint_namesLeft-arm joint positions relative to defaults (explicit list).
joint_vel_rel_left_armUnitree G1 (27 DOF)noise.n_min, noise.n_max, params.asset_cfg.name, params.asset_cfg.joint_namesLeft-arm joint velocities relative to defaults.

Commands

Commands define the functions available to sample or set asset poses or velocities, along with the configurable parameters that control how those commands are generated or applied.
Command FunctionConfigurable OptionsDescription
UniformPose2dCommandasset_name, cfg.ranges.pos_x (tuple), cfg.ranges.pos_y (tuple), cfg.ranges.heading (tuple), cfg.simple_heading (bool), cfg.resampling_time_range (tuple)Sample a random 2D position around the environment origin and a heading (either toward target or uniform).
TerrainBasedPose2dCommandasset_name, cfg.ranges.heading (tuple), cfg.simple_heading (bool), cfg.resampling_time_range (tuple)Sample a random 2D position from valid terrain patches and a heading (either toward target or uniform).
UniformPoseCommandasset_name, body_name, cfg.ranges.pos_x (tuple), cfg.ranges.pos_y (tuple), cfg.ranges.pos_z (tuple), cfg.ranges.roll (tuple), cfg.ranges.pitch (tuple), cfg.ranges.yaw (tuple), cfg.make_quat_unique (bool), cfg.resampling_time_range (tuple)Sample a random 3D pose (position + quaternion) in the robot’s base frame uniformly within specified ranges.
UniformVelocityCommandasset_name, cfg.ranges.lin_vel_x (tuple), cfg.ranges.lin_vel_y (tuple), cfg.ranges.ang_vel_z (tuple), cfg.heading_command (bool), cfg.ranges.heading (tuple), cfg.rel_heading_envs (float), cfg.rel_standing_envs (float), cfg.resampling_time_range (tuple)Sample a random base-frame linear (x,y) and yaw velocity (either uniform or based on heading error), with optional standing/no-motion environments.
NormalVelocityCommandasset_name, cfg.ranges.mean_vel (tuple of 3), cfg.ranges.std_vel (tuple of 3), cfg.ranges.zero_prob (tuple of 3), cfg.rel_standing_envs (float), cfg.resampling_time_range (tuple)Sample a base-frame velocity command from a normal distribution (with element-wise zeroing and optional standing).

Actions

Actions enumerate the terms that transform high-level input vectors into specific joint or base control targets — such as positions, velocities, or torques—along with their associated configuration options.
Action TermConfigurable OptionsOne-line Meaning
BinaryJointPositionActioncfg.joint_names (list of joint names), cfg.open_command_expr (string), cfg.close_command_expr (string)Map a binary (open/close) action to joint position targets for specified joints.
BinaryJointVelocityActioncfg.joint_names (list of joint names), cfg.open_command_expr (string), cfg.close_command_expr (string)Map a binary (open/close) action to joint velocity targets for specified joints.
JointPositionToLimitsActioncfg.joint_names (list), cfg.scale (float or dict mapping joint→scale), cfg.rescale_to_limits (bool)Scale raw inputs and (optionally) rescale to joint position limits before applying as position targets.
EMAJointPositionToLimitsActioncfg.joint_names (list), cfg.scale (float or dict), cfg.rescale_to_limits (bool), cfg.alpha (float or dict mapping joint→EMA weight)Like JointPositionToLimitsAction, but applies exponential moving average (α) over processed position targets.
JointPositionActioncfg.joint_names (list), cfg.scale (float or dict), cfg.offset (float or dict), cfg.use_default_offset (bool)Affinely transform input actions (scale + offset) and send as joint position targets.
RelativeJointPositionActioncfg.joint_names (list), cfg.scale (float or dict), cfg.offset (float or dict), cfg.use_zero_offset (bool)Affinely transform input and add to current joint positions before sending as position commands.
JointVelocityActioncfg.joint_names (list), cfg.scale (float or dict), cfg.offset (float or dict), cfg.use_default_offset (bool)Affinely transform input and send as joint velocity targets.
JointEffortActioncfg.joint_names (list), cfg.scale (float or dict), cfg.offset (float or dict)Affinely transform input and send as joint effort targets.
NonHolonomicActioncfg.x_joint_name (str), cfg.y_joint_name (str), cfg.yaw_joint_name (str), cfg.body_name (str), cfg.scale (tuple[2]), cfg.offset (tuple[2])Map a 2D action (forward speed, yaw rate) to dummy joint velocities (x, y, yaw) for a skid-steer base.
DifferentialInverseKinematicsActioncfg.joint_names (list), cfg.body_name (str), cfg.scale (list of floats), cfg.controller (DifferentialIK config), cfg.body_offset (optional pos/rot tuples)Apply scaled end-effector pose commands through a differential IK controller to produce joint position targets.
PretrainedPolicyActioncfg.asset_name (str), cfg.policy_path (str), cfg.low_level_decimation (int), cfg.low_level_actions (ActionCfg), cfg.low_level_observations (ObservationCfg)Use a pretrained low-level policy (e.g. locomotion) to decode actions in a hierarchical setup (e.g. navigation on top).

Rewards

Rewards describe the available functions for assigning scalar feedback to agent behaviors, detail any options required to configure them, and explain the physical or task-related criterion they encourage or penalize.
Reward FunctionConfigurable OptionsDescription
is_alivenone+1 if the episode is still running (not terminated).
is_terminatednone+1 if the episode just terminated (penalizes any termination).
is_terminated_termterm_keys (string, list of strings or regex)Sum of specified non-timeout termination terms.
lin_vel_z_l2asset_cfg.namePenalize squared z-axis base linear velocity.
ang_vel_xy_l2asset_cfg.namePenalize squared x/y components of base angular velocity.
flat_orientation_l2asset_cfg.namePenalize deviation from a flat base orientation (projected gravity).
base_height_l2target_height (float), asset_cfg.namePenalize squared deviation of base height from target.
body_lin_acc_l2asset_cfg.namePenalize summed L2 norm of all body linear accelerations.
joint_torques_l2asset_cfg.namePenalize squared torques on configured joints.
joint_vel_l1asset_cfg.namePenalize absolute joint velocities (L1).
joint_vel_l2asset_cfg.namePenalize squared joint velocities.
joint_acc_l2asset_cfg.namePenalize squared joint accelerations.
joint_deviation_l1asset_cfg.namePenalize absolute deviation from default joint positions.
joint_pos_limitsasset_cfg.namePenalize violations of soft joint-position limits.
joint_vel_limitssoft_ratio (float), asset_cfg.namePenalize violations of soft joint-velocity limits (clipped).
applied_torque_limitsasset_cfg.namePenalize mismatch between applied and computed torques.
action_rate_l2nonePenalize squared change in actions from one step to the next.
action_l2nonePenalize squared magnitude of the actions themselves.
undesired_contactsthreshold (float), sensor_cfg.namePenalize count of contact events whose force exceeds threshold.
contact_forcesthreshold (float), sensor_cfg.namePenalize amount by which contact forces exceed threshold.
track_lin_vel_xy_expstd (float), command_name (str), asset_cfg.nameReward (exp-kernel) how well base xy-velocity matches command.
track_ang_vel_z_expstd (float), command_name (str), asset_cfg.nameReward (exp-kernel) how well yaw velocity matches command.
air_timemode_time (float), velocity_threshold (float), asset_cfg.name (str), sensor_cfg.name (str), sensor_cfg.body_names (regex)Reward for limbs airborne above the velocity threshold.
air_time_variance_penaltysensor_cfg.name (str), sensor_cfg.body_names (regex)Penalize uneven airborne time across limbs.
base_angular_velocitystd (float), asset_cfg.name (str)Reward (Gaussian) on the magnitude of base angular velocity.
base_linear_velocitystd (float), ramp_rate (float), ramp_at_vel (float), asset_cfg.name (str)Reward (Gaussian with ramp) on base linear velocity.
foot_clearancestd (float), tanh_mult (float), target_height (float), asset_cfg.name (str)Reward for foot clearance above a target height.
base_motionasset_cfg.name (str)Penalize any base movement (encourages stationarity).
base_orientationasset_cfg.name (str)Penalize deviation from upright base orientation.
foot_slipasset_cfg.name (str), sensor_cfg.name (str), sensor_cfg.body_names (regex), threshold (float)Penalize foot-slip events when force exceeds the threshold.
joint_posasset_cfg.name (str), stand_still_scale (float), velocity_threshold (float)Penalize deviation from reference joint positions when stationary.
joint_torquesasset_cfg.name (str)Penalize high joint-torque usage on configured joints.
termination_penaltynoneLarge penalty applied when the episode terminates.

Task-Specific Reward Terms

Reward FunctionConfigurable OptionsDescriptionTask Name
position_tracking_errorstd (float), command_name (str)Reward (tanh) on base position tracking accuracy.Navigation
orientation_tracking_errorcommand_name (str)Penalize heading error relative to the commanded orientation.Navigation
object_ee_distancestd (float)Reward (Gaussian) on the distance between the end-effector and the object.Manipulation Lift Object
object_is_liftedminimal_height (float)Reward (binary) for the object being lifted above the minimal height threshold.Manipulation Lift Object
object_goal_distancestd (float), minimal_height (float), command_name (str)Reward (Gaussian) on the object’s distance to the goal pose after lifting.Manipulation Lift Object
track_lin_vel_xy_expstd (float), command_name (str), asset_cfg.nameReward (exp-kernel) how well base xy-velocity matches command.Velocity Tracking
track_ang_vel_z_expstd (float), command_name (str), asset_cfg.nameReward (exp-kernel) how well yaw velocity matches command.Velocity Tracking
track_orientation_inv_l2object_cfg (SceneEntityCfg("object")), rot_eps (float), command_name (str)Reward (inverse-L2) encouraging alignment of object orientation with the commanded pose.Manipulation In-Hand
success_bonus_inhandobject_cfg (SceneEntityCfg("object")), command_name (str)Bonus reward upon successful completion of the in-hand manipulation task.Manipulation In-Hand

Robot-Specific Reward Terms

Reward FunctionRobotConfigurable OptionsDescription
left_wrist_end_effector_position_trackingUnitree G1 (27 DOF)asset_cfg.name, body_names, command_name, shiftPenalize deviation of the left-wrist end-effector position from the commanded target.
left_wrist_end_effector_position_tracking_fine_grainedUnitree G1 (27 DOF)asset_cfg.name, body_names, std, command_name, shiftReward (tanh) on fine-grained left-wrist position tracking accuracy.
left_wrist_end_effector_orientation_trackingUnitree G1 (27 DOF)asset_cfg.name, body_names, command_name, tolPenalize orientation error of the left-wrist end-effector relative to the command.
left_hand_joint_range_violationUnitree G1 (27 DOF)asset_cfg.name, joint_names, limits_min, limits_maxPenalize any left-hand joints outside the specified minimum/maximum limits.

Events

Events list environment-level operations that introduce randomized or deterministic changes — such as randomizing physical properties or applying external forces—along with the parameters needed to specify how those changes are applied.
Event FunctionConfigurable OptionsDescription
randomize_rigid_body_scaleenv_ids, scale_range (tuple[float, float] or dict{x, y, z → tuple[float, float]}), asset_cfg.name, relative_child_path (optional)Randomize the USD-scale of a rigid‐body asset (or a specified child prim) before simulation starts.
randomize_robot_body_materialenv_ids, static_friction_range (tuple[float, float]), dynamic_friction_range (tuple[float, float]), restitution_range (tuple[float, float]), num_buckets (int), asset_cfg.name, make_consistent (bool)Sample a set of friction/restitution “buckets” and assign them to each geometry of the specified robot Articulation.
randomize_object_body_materialenv_ids, static_friction_range (tuple[float, float]), dynamic_friction_range (tuple[float, float]), restitution_range (tuple[float, float]), num_buckets (int), asset_cfg.name, make_consistent (bool)Sample a set of friction/restitution “buckets” and assign them to each geometry of the specified RigidObject.
randomize_rigid_body_massenv_ids, asset_cfg.name, mass_distribution_params (tuple[float, float]), operation (“add”/“scale”/“abs”), distribution (“uniform”/“log_uniform”/“gaussian”), recompute_inertia (bool)Randomize per‐body mass values (add/scale/overwrite) and optionally recompute inertia tensors from default mass.
randomize_rigid_body_comenv_ids, com_range (dict{x, y, z → tuple[float, float]}), asset_cfg.nameAdd a random offset to the center of mass of each body in the specified Articulation.
randomize_rigid_body_collider_offsetsenv_ids, rest_offset_distribution_params (tuple[float, float]), contact_offset_distribution_params (tuple[float, float]), asset_cfg.name, distribution (“uniform”/“log_uniform”/“gaussian”)Randomize physics collider “rest” and/or “contact” offsets for each body shape of the specified asset.
randomize_physics_scene_gravityenv_ids, gravity_distribution_params (tuple[list[float], list[float]]), operation (“add”/“scale”/“abs”), distribution (“uniform”/“log_uniform”/“gaussian”)Randomize the global gravity vector (x, y, z) for the entire physics scene (same gravity for all envs).
randomize_actuator_gainsenv_ids, asset_cfg.name, stiffness_distribution_params (tuple[float, float]), damping_distribution_params (tuple[float, float]), operation, distributionRandomize stiffness and/or damping gains of actuators (implicit or explicit) on specified joint IDs.
randomize_joint_parametersenv_ids, asset_cfg.name, friction_distribution_params (tuple[float, float]), armature_distribution_params (tuple[float, float]), lower_limit_distribution_params (tuple[float, float]), upper_limit_distribution_params (tuple[float, float]), operation, distributionRandomize joint physics properties: friction, armature, and position limits (lower/upper) by adding/scaling/overwriting.
randomize_fixed_tendon_parametersenv_ids, asset_cfg.name, stiffness_distribution_params (tuple[float, float]), damping_distribution_params (tuple[float, float]), limit_stiffness_distribution_params (tuple[float, float]), lower_limit_distribution_params (tuple[float, float]), upper_limit_distribution_params (tuple[float, float]), rest_length_distribution_params (tuple[float, float]), offset_distribution_params (tuple[float, float]), operation, distributionRandomize fixed‐tendon stiffness, damping, limit stiffness, lower/upper position limits, rest length, and/or offset.
apply_external_force_torqueenv_ids, force_range (tuple[float, float]), torque_range (tuple[float, float]), asset_cfg.nameSample and buffer random forces and torques to apply to each body of the specified asset (effective on next write).
push_by_setting_velocityenv_ids, velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), asset_cfg.nameSample a random root‐velocity (linear + angular) and set it directly on the specified asset to simulate a push.
reset_baseenv_ids, pose_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), asset_cfg.nameRandomize root pose of the robot (translation + rotation) uniformly around default and root velocity, then write to simulation.
reset_root_stateenv_ids, pose_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), asset_cfg.nameRandomize root pose (translation + rotation) of an object uniformly around default and root velocity, then write to simulation.
reset_root_state_with_random_orientationenv_ids, pose_range (dict{x, y, z → tuple[float, float]}), velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), asset_cfg.nameRandomize root position uniformly, sample orientation uniformly from SO(3), and set velocity within given ranges.
reset_root_state_from_terrainenv_ids, pose_range (dict{roll, pitch, yaw → tuple[float, float]}), velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), asset_cfg.nameSample a valid terrain patch (“init_pos”), set root pose on terrain + random yaw/pitch/roll, then set velocity.
reset_joints_by_scaleenv_ids, position_range (tuple[float, float]), velocity_range (tuple[float, float]), asset_cfg.nameScale default joint positions and velocities by random factors (within ranges) and clamp to soft limits before writing.
reset_joints_by_offsetenv_ids, position_range (tuple[float, float]), velocity_range (tuple[float, float]), asset_cfg.nameOffset default joint positions and velocities by random values (within ranges) and clamp to soft limits before writing.
reset_nodal_state_uniformenv_ids, position_range (dict{x, y, z → tuple[float, float]}), velocity_range (dict{x, y, z → tuple[float, float]}), asset_cfg.nameRandomize default nodal (vertex) positions and velocities of a DeformableObject uniformly within given ranges.
reset_scene_to_defaultenv_idsRestore all rigid objects, articulations (root pose + joint state), and deformables (nodal state) to their default values.
randomize_visual_texture_materialenv_ids, event_name (str), asset_cfg.name, texture_paths (list[str]), texture_rotation (tuple[float, float])Trigger a Replicator‐based event that randomizes textures on each body’s “/visuals” prim under the specified asset.
randomize_visual_colorenv_ids, event_name (str), asset_cfg.name, colors (list[tuple[float, float, float]] or dict{r, g, b → tuple[float, float]}), mesh_name (str)Trigger a Replicator‐based event that randomizes color on the specified mesh prims under the asset.

Terminations

Terminations specify the conditions under which an episode ends, the configuration options that define threshold values or monitored signals, and a brief explanation of what triggers each termination.
Termination FunctionConfigurable OptionsDescription
time_outnoneTerminate when episode length ≥ max_episode_length.
command_resamplecommand_name (str), num_resamples (int)Terminate when a command has been resampled the specified number of times.
bad_orientationlimit_angle (float), asset_cfg.nameTerminate if the asset’s orientation deviates from vertical by > limit_angle.
root_height_below_minimumminimum_height (float), asset_cfg.nameTerminate if the asset’s root height drops below the minimum height.
joint_pos_out_of_limitasset_cfg.nameTerminate if any joint position exceeds the soft position limits.
joint_pos_out_of_manual_limitbounds (tuple[min, max]), asset_cfg.nameTerminate if any joint position falls outside the user-specified bounds.
joint_vel_out_of_limitasset_cfg.nameTerminate if any joint velocity exceeds the soft velocity limits.
joint_vel_out_of_manual_limitmax_velocity (float), asset_cfg.nameTerminate if any joint velocity exceeds the given max_velocity.
joint_effort_out_of_limitasset_cfg.nameTerminate if any joint’s computed torque equals its applied torque (indicating effort clipping).
base_contactthreshold (float), sensor_cfg.nameTerminate if any contact force at the sensor exceeds the threshold.

Robot-Specific Terrmination Terms

Termination FunctionRobotConfigurable OptionsDescription
base_contact_g1Unitree G1 (27 DOF)sensor_cfg.name (str), sensor_cfg.body_names (list[str]), threshold (float)Terminate episode if any specified non-foot link (e.g. pelvis, torso, waist, shoulders) registers contact force above the threshold.

Task-Specific Terrmination Terms

Termination FunctionConfigurable OptionsDescriptionTask Name
object_out_of_reachasset_cfg (SceneEntityCfg("object")), threshold (float)Terminate when the object moves farther than the specified threshold distance from the robot.Manipulation In-Hand
max_consecutive_successnum_success (int), command_name (str)Terminate once the in-hand manipulation has succeeded the given number of times in a row.Manipulation In-Hand