Users can customize the core MDP components — observations, commands, actions, rewards, events, and terminations — when training RL agents using GRID. By configuring these elements, you can tailor environment feedback and control interfaces to fit specific robotics tasks and training objectives.

Observations

Observations list each measurable quantity the environment can return, the configuration options for specifying which asset or sensor to observe, and a description of what the measurement represents.

Observation FunctionConfigurable OptionsDescription
base_pos_zasset_cfg.nameRoot height in the simulation world frame.
base_lin_velasset_cfg.nameRoot linear velocity in the asset’s root frame.
base_ang_velasset_cfg.nameRoot angular velocity in the asset’s root frame.
projected_gravityasset_cfg.nameGravity projection on the asset’s root frame.
root_pos_wasset_cfg.nameAsset root position in the environment frame.
root_quat_wmake_quat_unique (bool), asset_cfg.nameAsset root orientation (w, x, y, z) in the environment frame.
root_lin_vel_wasset_cfg.nameAsset root linear velocity in the environment frame.
root_ang_vel_wasset_cfg.nameAsset root angular velocity in the environment frame.
body_pose_wasset_cfg.name, body_ids (list[int])Flattened body poses of the asset w.r.t. environment origin (x, y, z, qw, qx, qy, qz) for each configured body.
body_projected_gravity_basset_cfg.name, body_ids (list[int])Direction of gravity projected onto each body frame (x, y, z) for configured bodies.
joint_posasset_cfg.nameJoint positions of the asset.
joint_pos_relasset_cfg.nameJoint positions relative to default joint positions.
joint_pos_limit_normalizedasset_cfg.nameJoint positions normalized by the asset’s soft position limits.
joint_velasset_cfg.nameJoint velocities of the asset.
joint_vel_relasset_cfg.nameJoint velocities relative to default joint velocities.
joint_effortasset_cfg.nameJoint applied effort (torque) for configured joints.
height_scansensor_cfg.name, offset (float)Height scan from the sensor’s frame minus the given offset.
body_incoming_wrenchasset_cfg.name, body_ids (list[int])6-D spatial wrench (force, torque) applied to each configured body link by joint forces.
imu_orientationasset_cfg.nameIMU sensor orientation in world frame as quaternion (w, x, y, z).
imu_ang_velasset_cfg.nameIMU sensor angular velocity (rad/s) in sensor frame.
imu_lin_accasset_cfg.nameIMU sensor linear acceleration (m/s²) in sensor frame.
imagesensor_cfg.name, data_type (str), convert_perspective_to_orthogonal (bool), normalize (bool)Images from a camera or ray-caster camera (e.g., “rgb”, “depth”). Optionally normalize and orthogonalize depth.
image_featuressensor_cfg.name, data_type (str), model_zoo_cfg (dict), model_name (str), model_device (str)Extracted image features via a pretrained encoder (e.g., ResNet, Theia).
last_actionaction_name (str or None)The last input action term (or full action tensor if None).
generated_commandscommand_name (str)The generated command tensor for the given command term.

Commands

Commands define the functions available to sample or set asset poses or velocities, along with the configurable parameters that control how those commands are generated or applied.

Command FunctionConfigurable OptionsDescription
UniformPose2dCommandasset_name, cfg.ranges.pos_x (tuple), cfg.ranges.pos_y (tuple), cfg.ranges.heading (tuple), cfg.simple_heading (bool), cfg.resampling_time_range (tuple)Sample a random 2D position around the environment origin and a heading (either toward target or uniform).
TerrainBasedPose2dCommandasset_name, cfg.ranges.heading (tuple), cfg.simple_heading (bool), cfg.resampling_time_range (tuple)Sample a random 2D position from valid terrain patches and a heading (either toward target or uniform).
UniformPoseCommandasset_name, body_name, cfg.ranges.pos_x (tuple), cfg.ranges.pos_y (tuple), cfg.ranges.pos_z (tuple), cfg.ranges.roll (tuple), cfg.ranges.pitch (tuple), cfg.ranges.yaw (tuple), cfg.make_quat_unique (bool), cfg.resampling_time_range (tuple)Sample a random 3D pose (position + quaternion) in the robot’s base frame uniformly within specified ranges.
UniformVelocityCommandasset_name, cfg.ranges.lin_vel_x (tuple), cfg.ranges.lin_vel_y (tuple), cfg.ranges.ang_vel_z (tuple), cfg.heading_command (bool), cfg.ranges.heading (tuple), cfg.rel_heading_envs (float), cfg.rel_standing_envs (float), cfg.resampling_time_range (tuple)Sample a random base-frame linear (x,y) and yaw velocity (either uniform or based on heading error), with optional standing/no-motion environments.
NormalVelocityCommandasset_name, cfg.ranges.mean_vel (tuple of 3), cfg.ranges.std_vel (tuple of 3), cfg.ranges.zero_prob (tuple of 3), cfg.rel_standing_envs (float), cfg.resampling_time_range (tuple)Sample a base-frame velocity command from a normal distribution (with element-wise zeroing and optional standing).

Actions

Actions enumerate the terms that transform high-level input vectors into specific joint or base control targets — such as positions, velocities, or torques—along with their associated configuration options.

Action TermConfigurable OptionsOne-line Meaning
BinaryJointPositionActioncfg.joint_names (list of joint names), cfg.open_command_expr (string), cfg.close_command_expr (string)Map a binary (open/close) action to joint position targets for specified joints.
BinaryJointVelocityActioncfg.joint_names (list of joint names), cfg.open_command_expr (string), cfg.close_command_expr (string)Map a binary (open/close) action to joint velocity targets for specified joints.
JointPositionToLimitsActioncfg.joint_names (list), cfg.scale (float or dict mapping joint→scale), cfg.rescale_to_limits (bool)Scale raw inputs and (optionally) rescale to joint position limits before applying as position targets.
EMAJointPositionToLimitsActioncfg.joint_names (list), cfg.scale (float or dict), cfg.rescale_to_limits (bool), cfg.alpha (float or dict mapping joint→EMA weight)Like JointPositionToLimitsAction, but applies exponential moving average (α) over processed position targets.
JointPositionActioncfg.joint_names (list), cfg.scale (float or dict), cfg.offset (float or dict), cfg.use_default_offset (bool)Affinely transform input actions (scale + offset) and send as joint position targets.
RelativeJointPositionActioncfg.joint_names (list), cfg.scale (float or dict), cfg.offset (float or dict), cfg.use_zero_offset (bool)Affinely transform input and add to current joint positions before sending as position commands.
JointVelocityActioncfg.joint_names (list), cfg.scale (float or dict), cfg.offset (float or dict), cfg.use_default_offset (bool)Affinely transform input and send as joint velocity targets.
JointEffortActioncfg.joint_names (list), cfg.scale (float or dict), cfg.offset (float or dict)Affinely transform input and send as joint effort targets.
NonHolonomicActioncfg.x_joint_name (str), cfg.y_joint_name (str), cfg.yaw_joint_name (str), cfg.body_name (str), cfg.scale (tuple[2]), cfg.offset (tuple[2])Map a 2D action (forward speed, yaw rate) to dummy joint velocities (x, y, yaw) for a skid-steer base.
DifferentialInverseKinematicsActioncfg.joint_names (list), cfg.body_name (str), cfg.scale (list of floats), cfg.controller (DifferentialIK config), cfg.body_offset (optional pos/rot tuples)Apply scaled end-effector pose commands through a differential IK controller to produce joint position targets.

Rewards

Rewards describe the available functions for assigning scalar feedback to agent behaviors, detail any options required to configure them, and explain the physical or task-related criterion they encourage or penalize.

Reward FunctionConfigurable OptionsDescription
is_alivenone+1 if the episode is still running (not terminated).
is_terminatednone+1 if the episode just terminated (penalizes any termination).
is_terminated_termterm_keys (string, list of strings or regex)Sum of specified non-timeout termination terms.
lin_vel_z_l2asset_cfg.namePenalize squared z-axis base linear velocity.
ang_vel_xy_l2asset_cfg.namePenalize squared x/y components of base angular velocity.
flat_orientation_l2asset_cfg.namePenalize deviation from a flat base orientation (projected gravity).
base_height_l2target_height (float), asset_cfg.namePenalize squared deviation of base height from target.
body_lin_acc_l2asset_cfg.namePenalize summed L2 norm of all body linear accelerations.
joint_torques_l2asset_cfg.namePenalize squared torques on configured joints.
joint_vel_l1asset_cfg.namePenalize absolute joint velocities (L1).
joint_vel_l2asset_cfg.namePenalize squared joint velocities.
joint_acc_l2asset_cfg.namePenalize squared joint accelerations.
joint_deviation_l1asset_cfg.namePenalize absolute deviation from default joint positions.
joint_pos_limitsasset_cfg.namePenalize violations of soft joint-position limits.
joint_vel_limitssoft_ratio (float), asset_cfg.namePenalize violations of soft joint-velocity limits (clipped).
applied_torque_limitsasset_cfg.namePenalize mismatch between applied and computed torques.
action_rate_l2nonePenalize squared change in actions from one step to the next.
action_l2nonePenalize squared magnitude of the actions themselves.
undesired_contactsthreshold (float), sensor_cfg.namePenalize count of contact events whose force exceeds threshold.
contact_forcesthreshold (float), sensor_cfg.namePenalize amount by which contact forces exceed threshold.
track_lin_vel_xy_expstd (float), command_name (str), asset_cfg.nameReward (exp-kernel) how well base xy-velocity matches command.
track_ang_vel_z_expstd (float), command_name (str), asset_cfg.nameReward (exp-kernel) how well yaw velocity matches command.

Events

Events list environment-level operations that introduce randomized or deterministic changes — such as randomizing physical properties or applying external forces—along with the parameters needed to specify how those changes are applied.

Event FunctionConfigurable OptionsDescription
randomize_rigid_body_scaleenv_ids, scale_range (tuple[float, float] or dict{x, y, z → tuple[float, float]}), asset_cfg.name, relative_child_path (optional)Randomize the USD-scale of a rigid‐body asset (or a specified child prim) before simulation starts.
randomize_rigid_body_materialenv_ids, static_friction_range (tuple[float, float]), dynamic_friction_range (tuple[float, float]), restitution_range (tuple[float, float]), num_buckets (int), asset_cfg.name, make_consistent (bool)Sample a set of friction/restitution “buckets” and assign them to each geometry of the specified RigidObject or Articulation.
randomize_rigid_body_massenv_ids, asset_cfg.name, mass_distribution_params (tuple[float, float]), operation (“add”/“scale”/“abs”), distribution (“uniform”/“log_uniform”/“gaussian”), recompute_inertia (bool)Randomize per‐body mass values (add/scale/overwrite) and optionally recompute inertia tensors from default mass.
randomize_rigid_body_comenv_ids, com_range (dict{x, y, z → tuple[float, float]}), asset_cfg.nameAdd a random offset to the center of mass of each body in the specified Articulation.
randomize_rigid_body_collider_offsetsenv_ids, rest_offset_distribution_params (tuple[float, float]), contact_offset_distribution_params (tuple[float, float]), asset_cfg.name, distribution (“uniform”/“log_uniform”/“gaussian”)Randomize physics collider “rest” and/or “contact” offsets for each body shape of the specified asset.
randomize_physics_scene_gravityenv_ids, gravity_distribution_params (tuple[list[float], list[float]]), operation (“add”/“scale”/“abs”), distribution (“uniform”/“log_uniform”/“gaussian”)Randomize the global gravity vector (x, y, z) for the entire physics scene (same gravity for all envs).
randomize_actuator_gainsenv_ids, asset_cfg.name, stiffness_distribution_params (tuple[float, float]), damping_distribution_params (tuple[float, float]), operation, distributionRandomize stiffness and/or damping gains of actuators (implicit or explicit) on specified joint IDs.
randomize_joint_parametersenv_ids, asset_cfg.name, friction_distribution_params (tuple[float, float]), armature_distribution_params (tuple[float, float]), lower_limit_distribution_params (tuple[float, float]), upper_limit_distribution_params (tuple[float, float]), operation, distributionRandomize joint physics properties: friction, armature, and position limits (lower/upper) by adding/scaling/overwriting.
randomize_fixed_tendon_parametersenv_ids, asset_cfg.name, stiffness_distribution_params (tuple[float, float]), damping_distribution_params (tuple[float, float]), limit_stiffness_distribution_params (tuple[float, float]), lower_limit_distribution_params (tuple[float, float]), upper_limit_distribution_params (tuple[float, float]), rest_length_distribution_params (tuple[float, float]), offset_distribution_params (tuple[float, float]), operation, distributionRandomize fixed‐tendon stiffness, damping, limit stiffness, lower/upper position limits, rest length, and/or offset.
apply_external_force_torqueenv_ids, force_range (tuple[float, float]), torque_range (tuple[float, float]), asset_cfg.nameSample and buffer random forces and torques to apply to each body of the specified asset (effective on next write).
push_by_setting_velocityenv_ids, velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), asset_cfg.nameSample a random root‐velocity (linear + angular) and set it directly on the specified asset to simulate a push.
reset_root_state_uniformenv_ids, pose_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), asset_cfg.nameRandomize root pose (translation + rotation) uniformly around default and root velocity, then write to simulation.
reset_root_state_with_random_orientationenv_ids, pose_range (dict{x, y, z → tuple[float, float]}), velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), asset_cfg.nameRandomize root position uniformly, sample orientation uniformly from SO(3), and set velocity within given ranges.
reset_root_state_from_terrainenv_ids, pose_range (dict{roll, pitch, yaw → tuple[float, float]}), velocity_range (dict{x, y, z, roll, pitch, yaw → tuple[float, float]}), asset_cfg.nameSample a valid terrain patch (“init_pos”), set root pose on terrain + random yaw/pitch/roll, then set velocity.
reset_joints_by_scaleenv_ids, position_range (tuple[float, float]), velocity_range (tuple[float, float]), asset_cfg.nameScale default joint positions and velocities by random factors (within ranges) and clamp to soft limits before writing.
reset_joints_by_offsetenv_ids, position_range (tuple[float, float]), velocity_range (tuple[float, float]), asset_cfg.nameOffset default joint positions and velocities by random values (within ranges) and clamp to soft limits before writing.
reset_nodal_state_uniformenv_ids, position_range (dict{x, y, z → tuple[float, float]}), velocity_range (dict{x, y, z → tuple[float, float]}), asset_cfg.nameRandomize default nodal (vertex) positions and velocities of a DeformableObject uniformly within given ranges.
reset_scene_to_defaultenv_idsRestore all rigid objects, articulations (root pose + joint state), and deformables (nodal state) to their default values.
randomize_visual_texture_materialenv_ids, event_name (str), asset_cfg.name, texture_paths (list[str]), texture_rotation (tuple[float, float])Trigger a Replicator‐based event that randomizes textures on each body’s “/visuals” prim under the specified asset.
randomize_visual_colorenv_ids, event_name (str), asset_cfg.name, colors (list[tuple[float, float, float]] or dict{r, g, b → tuple[float, float]}), mesh_name (str)Trigger a Replicator‐based event that randomizes color on the specified mesh prims under the asset.

Terminations

Terminations specify the conditions under which an episode ends, the configuration options that define threshold values or monitored signals, and a brief explanation of what triggers each termination.

Termination FunctionConfigurable OptionsDescription
time_outnoneTerminate when episode length ≥ max_episode_length.
command_resamplecommand_name (str), num_resamples (int)Terminate when a command has been resampled the specified number of times.
bad_orientationlimit_angle (float), asset_cfg.nameTerminate if the asset’s orientation deviates from vertical by > limit_angle.
root_height_below_minimumminimum_height (float), asset_cfg.nameTerminate if the asset’s root height drops below the minimum height.
joint_pos_out_of_limitasset_cfg.nameTerminate if any joint position exceeds the soft position limits.
joint_pos_out_of_manual_limitbounds (tuple[min, max]), asset_cfg.nameTerminate if any joint position falls outside the user-specified bounds.
joint_vel_out_of_limitasset_cfg.nameTerminate if any joint velocity exceeds the soft velocity limits.
joint_vel_out_of_manual_limitmax_velocity (float), asset_cfg.nameTerminate if any joint velocity exceeds the given max_velocity.
joint_effort_out_of_limitasset_cfg.nameTerminate if any joint’s computed torque equals its applied torque (indicating effort clipping).
illegal_contactthreshold (float), sensor_cfg.nameTerminate if any contact force at the sensor exceeds the threshold.