The GSAM2 class provides a wrapper for the GSAM2 model, which combines Grounding DINO for text-based object detection with SAM2 for high-precision segmentation in RGB images.
use_local (bool): If True, the inference call runs on the local VM; otherwise it is offloaded onto GRID-Cortex. Defaults to False.
rgbimage (np.ndarray): The input RGB image of shape (M, N, 3).
prompt (str): The text prompt to use for segmentation.

Returns (np.ndarray): The predicted segmentation mask of shape (M, N).
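Concretely, run accepts any (M, N, 3) RGB array, not only simulator frames. A minimal sketch using an image loaded from disk (the file path and prompt string are illustrative, and PIL and numpy are assumed to be available):

import numpy as np
from PIL import Image

from grid.model.perception.segmentation.gsam2 import GSAM2

# Load any RGB image as an (M, N, 3) uint8 array
img = np.asarray(Image.open("street.jpg").convert("RGB"))  # hypothetical file

model = GSAM2(use_local=False)
mask = model.run(rgbimage=img, prompt="car")  # "car" is an example prompt
assert mask.shape == img.shape[:2]  # (M, N) mask aligned with the input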
from grid.model.perception.segmentation.gsam2 import GSAM2
from grid.robot.wheeled.airgen_car import AirGenCar

car = AirGenCar()

# Capture an image from the AirGen simulator
# and run model inference on it.
img = car.getImage("front_center", "rgb").data

model = GSAM2(use_local=False)
result = model.run(rgbimage=img, prompt="car")  # replace "car" with your own prompt
print(result.shape)
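The returned mask can be consumed directly as a numpy array. A short sketch of typical post-processing, continuing from the example above (it assumes nonzero mask values mark the detected object, which is an assumption about the mask encoding):

# Treat nonzero values as the segmented object (assumed encoding)
binary = result > 0

# Fraction of the frame covered by the detected object
print(f"object covers {binary.mean():.1%} of the frame")

# Dim the background to visualize the segmented region
overlay = img.copy()
overlay[~binary] = (overlay[~binary] * 0.3).astype(img.dtype)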
This code is licensed under the Apache 2.0 and BSD-3-Clause licenses.