from grid.model.perception.detection.gdino import GroundingDINO

# AirGenCar is assumed to be already imported or available in the GRID session.
car = AirGenCar()

# Capture an RGB image from the AirGen simulator
# and run model inference on it.
img = car.getImage("front_center", "rgb").data

# Replace "<prompt>" with your own text prompt.
model = GroundingDINO(use_local=False)
box, scores, labels = model.run(rgbimage=img, prompt="<prompt>")
print(box, scores, labels)

# To run the model locally, set use_local=True.
model = GroundingDINO(use_local=True)
box, scores, labels = model.run(rgbimage=img, prompt="<prompt>")
print(box, scores, labels)
The GroundingDINO class is a wrapper for the GroundingDINO model, which detects objects in RGB images based on text prompts.
class GroundingDINO()
box_threshold (float, default: 0.4): Confidence threshold for bounding box detection.
text_threshold (float, default: 0.25): Confidence threshold for text-based object detection.
use_local (boolean, default: False): If True, inference runs on the local VM; otherwise it is offloaded to GRID-Cortex.
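To illustrate what a confidence threshold such as box_threshold does, the following minimal sketch filters a set of detections by score. It is a standalone illustration, not the wrapper's internal code: the helper name filter_by_score and the sample values are hypothetical, and keeping scores exactly equal to the threshold is an assumption. The source does not specify the coordinate layout of a box, so the four-number boxes here are placeholders.

```python
from typing import List, Sequence, Tuple


def filter_by_score(
    boxes: Sequence[Sequence[float]],
    scores: Sequence[float],
    labels: Sequence[str],
    threshold: float = 0.4,
) -> Tuple[List[Sequence[float]], List[float], List[str]]:
    """Keep detections whose score is at least `threshold` (hypothetical helper)."""
    if not (len(boxes) == len(scores) == len(labels)):
        raise ValueError("boxes, scores, and labels must have the same length")
    kept = [(b, s, l) for b, s, l in zip(boxes, scores, labels) if s >= threshold]
    if not kept:
        return [], [], []
    kept_boxes, kept_scores, kept_labels = zip(*kept)
    return list(kept_boxes), list(kept_scores), list(kept_labels)


# Made-up sample values for illustration only.
sample_boxes = [[10.0, 20.0, 110.0, 220.0], [5.0, 5.0, 50.0, 50.0]]
sample_scores = [0.82, 0.31]
sample_labels = ["car", "tree"]

boxes, scores, labels = filter_by_score(
    sample_boxes, sample_scores, sample_labels, threshold=0.4
)
print(boxes, scores, labels)
```

A helper like this could also be applied to the lists returned by run() when a stricter cutoff is wanted for a particular downstream task, without constructing a new model instance.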
def run()
rgbimage (np.ndarray, required): The input RGB image of shape (M, N, 3).
prompt (str, required): Text prompt for object detection. Multiple prompts can be separated by a ".".
Returns (List[float], List[float], List[str]): Three lists: bounding box coordinates, confidence scores, and label strings.
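Since run() takes a single prompt string and returns three parallel lists, a pair of small helpers can make both ends easier to handle. The sketch below is a minimal illustration under stated assumptions: build_prompt and to_records are hypothetical names that are not part of GRID, and the spaces around the "." separator are a readability choice rather than something the source requires.

```python
from typing import Dict, List, Sequence


def build_prompt(object_names: Sequence[str]) -> str:
    """Join object names into one prompt string separated by " . " (hypothetical helper)."""
    cleaned = [name.strip() for name in object_names if name.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty object name is required")
    return " . ".join(cleaned)


def to_records(
    boxes: Sequence[Sequence[float]],
    scores: Sequence[float],
    labels: Sequence[str],
) -> List[Dict[str, object]]:
    """Pair up the three parallel lists into one record per detection (hypothetical helper)."""
    if not (len(boxes) == len(scores) == len(labels)):
        raise ValueError("boxes, scores, and labels must have the same length")
    return [
        {"label": label, "score": score, "box": list(box)}
        for box, score, label in zip(boxes, scores, labels)
    ]


# Blank entries are dropped and surrounding whitespace is trimmed.
prompt = build_prompt(["car", " traffic light ", ""])
print(prompt)

# Made-up values standing in for the output of run().
records = to_records([[10.0, 20.0, 110.0, 220.0]], [0.82], ["car"])
print(records)
```

The resulting prompt string could then be passed as the prompt argument of run(), and the returned lists passed to to_records for per-detection processing.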
This code is licensed under the Apache 2.0 License.