Generate 6-DoF grasp poses from depth images and segmentation masks (or directly from a point cloud).

Parameters

depth_image
str | PIL.Image | np.ndarray
Depth image. Required if point_cloud is not provided.
seg_image
str | PIL.Image | np.ndarray
Segmentation mask. Required if point_cloud is not provided.
camera_intrinsics
str | np.ndarray
3x3 camera intrinsics matrix. Required if point_cloud is not provided.
point_cloud
str | np.ndarray | List
Optional (N, 3) point cloud. When provided, depth_image/seg_image/camera_intrinsics are ignored.
aux_args
Dict[str, Any]
Auxiliary parameters:
  • num_grasps — Number of grasps to generate
  • gripper_config — Gripper type (e.g. "single_suction_cup_30mm")
  • camera_extrinsics — 4x4 camera extrinsics matrix
timeout
float | None
Optional HTTP timeout.
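
The parameters above describe two input routes: depth image, segmentation mask and intrinsics, or an (N, 3) point cloud. How the service converts one into the other is not documented here. As a rough sketch of the relationship, the standard pinhole camera model can back-project masked depth pixels into camera-frame points. The helper name backproject_masked_depth is hypothetical, and the sketch assumes a binary mask (nonzero selects the object) and metric depth values; neither is confirmed by this page.

```python
import numpy as np


def backproject_masked_depth(depth, mask, K):
    """Back-project masked depth pixels into an (N, 3) camera-frame point cloud.

    Hypothetical helper using the standard pinhole model. Assumes `mask` is
    binary (nonzero = object) and `depth` shares its (H, W) shape.
    """
    depth = np.asarray(depth, dtype=np.float64)
    mask = np.asarray(mask).astype(bool)
    K = np.asarray(K, dtype=np.float64)
    if depth.shape != mask.shape:
        raise ValueError("depth and mask must have the same (H, W) shape")
    if K.shape != (3, 3):
        raise ValueError("K must be a 3x3 intrinsics matrix")

    fx, fy = K[0, 0], K[1, 1]  # focal lengths in pixels
    cx, cy = K[0, 2], K[1, 2]  # principal point in pixels

    # Keep only masked pixels with a usable depth reading.
    valid = mask & np.isfinite(depth) & (depth > 0)
    v, u = np.nonzero(valid)  # v = row (y), u = column (x)
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)


# Tiny synthetic example: one masked pixel at row 1, column 2, depth 1.0.
K = np.array([[500.0, 0.0, 2.0],
              [0.0, 500.0, 1.5],
              [0.0, 0.0, 1.0]])
depth = np.full((3, 4), 1.0)
mask = np.zeros((3, 4), dtype=bool)
mask[1, 2] = True
points = backproject_masked_depth(depth, mask, K)
print(points.shape)  # (1, 3)
```

An array built this way has the (N, 3) layout that point_cloud expects, though the server may apply its own filtering or downsampling.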

Returns

dict with keys:
  • grasps — Array of 4x4 grasp poses with shape (N, 4, 4), where N is the number of grasps returned
  • confidence — Array of confidence scores with shape (N,), one per grasp
  • latency_ms — Optional server-reported latency
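
A common next step with the returned keys is to pick one grasp and split its pose into a rotation and a position. The sketch below is one way to do that under stated assumptions: each 4x4 pose is a standard homogeneous transform (rotation in the upper-left 3x3 block, translation in the last column), and a higher confidence value means a better grasp. The helper name best_grasp is hypothetical, and the result dict here is synthetic rather than real server output.

```python
import numpy as np


def best_grasp(result):
    """Return (position, rotation, score) of the highest-confidence grasp.

    Hypothetical helper. Assumes homogeneous 4x4 poses and that a larger
    confidence value indicates a better grasp.
    """
    grasps = np.asarray(result["grasps"], dtype=np.float64)
    confidence = np.asarray(result["confidence"], dtype=np.float64)
    if grasps.ndim != 3 or grasps.shape[1:] != (4, 4):
        raise ValueError("grasps must have shape (N, 4, 4)")
    if confidence.shape != (grasps.shape[0],):
        raise ValueError("confidence must have shape (N,)")
    if grasps.shape[0] == 0:
        raise ValueError("no grasps were returned")

    idx = int(np.argmax(confidence))
    pose = grasps[idx]
    position = pose[:3, 3]   # translation component
    rotation = pose[:3, :3]  # 3x3 rotation component
    return position, rotation, float(confidence[idx])


# Synthetic stand-in for a response with three grasps.
fake_grasps = np.tile(np.eye(4), (3, 1, 1))
fake_grasps[:, :3, 3] = [[0.1, 0.0, 0.5], [0.2, 0.0, 0.5], [0.3, 0.0, 0.5]]
fake_result = {"grasps": fake_grasps, "confidence": np.array([0.2, 0.9, 0.5])}

position, rotation, score = best_grasp(fake_result)
print(position, score)
```

The frame in which the poses are expressed (camera or world, depending on camera_extrinsics) is not stated on this page, so check it before sending a pose to a robot controller.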

Example

from grid_cortex_client import CortexClient
import numpy as np
from PIL import Image

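client = CortexClient()
K = np.eye(3)  # placeholder; replace with your camera's 3x3 intrinsics matrix
aux = {"num_grasps": 128, "gripper_config": "single_suction_cup_30mm", "camera_extrinsics": np.eye(4)}  # identity extrinsics as a placeholder
depth_image = np.load("depth.npy")  # depth image stored as a NumPy array
seg_image = np.array(Image.open("seg.png"))  # segmentation mask loaded as an array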

res = client.run(
    model_id="graspgen",
    depth_image=depth_image,
    seg_image=seg_image,
    camera_intrinsics=K,
    aux_args=aux,
)
print(res["grasps"].shape)  # (N, 4, 4)
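
The example above uses the image route. Since point_cloud replaces depth_image, seg_image and camera_intrinsics when it is provided, a call on the point-cloud route might look like the sketch below. The wrapper grasps_from_point_cloud is hypothetical. It reuses only the client.run keyword arguments shown on this page, and it assumes camera_extrinsics may be omitted from aux_args, which this page does not confirm.

```python
import numpy as np


def grasps_from_point_cloud(client, points, num_grasps=64):
    """Request grasps for an (N, 3) point cloud through client.run.

    Hypothetical wrapper: `client` is expected to be a CortexClient (or any
    object with a compatible run method). The image inputs are left out
    because the documentation says they are ignored when point_cloud is set.
    """
    points = np.asarray(points, dtype=np.float32)
    if points.ndim != 2 or points.shape[1] != 3:
        raise ValueError("points must have shape (N, 3)")
    return client.run(
        model_id="graspgen",
        point_cloud=points,
        aux_args={
            "num_grasps": num_grasps,
            "gripper_config": "single_suction_cup_30mm",
        },
    )


# Usage with a configured client (not executed here):
#   from grid_cortex_client import CortexClient
#   res = grasps_from_point_cloud(CortexClient(), np.load("points.npy"))
#   print(res["grasps"].shape)
```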