The OpenVLA class wraps the OpenVLA vision-language-action (VLA) model, which predicts a robot action from an RGB image and a text prompt.
use_local: If True, the inference call runs on the local VM; otherwise it is offloaded to GRID-Cortex. Defaults to True.
image: The input RGB image of shape (M, N, 3).
prompt: The text query describing the task to perform.
Returns: The predicted action based on the query and image, represented as a 7-DoF vector.
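from grid.model.perception.vla.openvla import OpenVLA
# Assumed import path for AirGenCar (not shown in the original snippet);
# adjust it to match your GRID installation.
from grid.robot.wheeled.airgen_car import AirGenCar

car = AirGenCar()

# Capture an image from the AirGen simulator
# and run model inference on it.
img = car.getImage("front_center", "rgb").data

model = OpenVLA(use_local=True)
result = model.run(image=img, prompt="Close the drawer")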
print(result)
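The 7-DoF result is easier to work with once its components are named. The sketch below is not part of the GRID API: `EndEffectorAction` and `unpack_action` are hypothetical helpers, and the component order (three translation deltas, three rotation deltas, then a gripper command) is an assumption based on the convention commonly used with OpenVLA. Verify the ordering and units against your robot setup before acting on the values.

```python
from typing import NamedTuple, Sequence


class EndEffectorAction(NamedTuple):
    """Hypothetical container for a 7-DoF action (not a GRID type)."""
    dx: float       # assumed: translation delta along x
    dy: float       # assumed: translation delta along y
    dz: float       # assumed: translation delta along z
    droll: float    # assumed: rotation delta about x
    dpitch: float   # assumed: rotation delta about y
    dyaw: float     # assumed: rotation delta about z
    gripper: float  # assumed: gripper command


def unpack_action(action: Sequence[float]) -> EndEffectorAction:
    """Convert a flat 7-element sequence into named components."""
    values = [float(v) for v in action]
    if len(values) != 7:
        raise ValueError(f"expected 7 values, got {len(values)}")
    return EndEffectorAction(*values)


# Example with made-up numbers; in practice, pass the model's result.
example = unpack_action([0.01, -0.02, 0.0, 0.0, 0.0, 0.05, 1.0])
print(example.dx, example.gripper)
```

If `model.run` returns a flat array-like with seven entries, `unpack_action(result)` should work unchanged; a batched or nested result would need to be indexed first.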
This code is licensed under the MIT License.