Segment objects in an image using a text prompt. Combines Grounding DINO with SAM 2.
Parameters
RGB image as file path, URL, PIL Image, or numpy array.
Text description of objects to segment.
Confidence threshold (0.0–1.0) for filtering detections.
Text confidence threshold (0.0–1.0).
Non-Maximum Suppression threshold (0.0–1.0).
Optional HTTP timeout.
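To make the role of the confidence and Non-Maximum Suppression thresholds concrete, here is a minimal NumPy sketch of how such thresholds are conventionally applied to detected boxes. It is not this library's implementation: the function names, the default values, the `(x1, y1, x2, y2)` box layout, and the reading of the NMS threshold as an IoU cutoff are all assumptions made for illustration.

```python
import numpy as np


def iou(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    # Clip at zero so non-overlapping boxes contribute no intersection.
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def filter_detections(boxes, scores, conf_threshold=0.3, nms_threshold=0.5):
    """Return indices of boxes that survive score filtering and greedy NMS.

    Hypothetical helper: parameter names and defaults are illustrative only.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)

    # Step 1: drop detections whose score is below the confidence threshold.
    candidates = np.flatnonzero(scores >= conf_threshold)

    # Step 2: greedy NMS, visiting the remaining boxes from best to worst.
    order = candidates[np.argsort(-scores[candidates])]
    kept = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        kept.append(int(best))
        if rest.size == 0:
            break
        # Suppress boxes that overlap the current best box too strongly.
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= nms_threshold]
    return kept


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30], [0, 0, 5, 5]]
scores = [0.9, 0.8, 0.7, 0.1]
kept = filter_detections(boxes, scores)
```

In this sketch, raising the confidence threshold removes more low-scoring detections, while lowering the NMS threshold suppresses overlapping boxes more aggressively.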
Returns
np.ndarray — Binary segmentation mask of shape (H, W) with dtype uint8. Foreground pixels are 255, background is 0.
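Because the return value is a plain `uint8` array with 255 for foreground and 0 for background, it can be consumed directly with NumPy. The following is a minimal sketch of common follow-up steps; the mask here is a synthetic stand-in rather than real output, and `mask_stats` and `apply_mask` are hypothetical helper names, not part of the documented API.

```python
import numpy as np

# Synthetic stand-in for a returned mask: shape (H, W), dtype uint8, values 0 or 255.
mask = np.zeros((6, 8), dtype=np.uint8)
mask[2:5, 3:7] = 255


def mask_stats(mask):
    """Summarize a binary mask: pixel area, image coverage, and bounding box."""
    foreground = mask > 0  # boolean view of the 0/255 mask
    area = int(foreground.sum())
    if area == 0:
        return {"area": 0, "coverage": 0.0, "bbox": None}
    ys, xs = np.nonzero(foreground)
    # Bounding box as inclusive pixel coordinates (x_min, y_min, x_max, y_max).
    bbox = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
    return {"area": area, "coverage": area / foreground.size, "bbox": bbox}


def apply_mask(image, mask):
    """Return a copy of an (H, W, 3) image with background pixels set to 0."""
    out = image.copy()
    out[mask == 0] = 0
    return out


stats = mask_stats(mask)

# Dummy RGB image with the same height and width as the mask.
image = np.full((6, 8, 3), 200, dtype=np.uint8)
cutout = apply_mask(image, mask)
```

Comparing against 0 rather than 255 keeps the helpers tolerant of masks that were rescaled or saved with other nonzero foreground values.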
Example Output
