The LLaVANeXT class provides a wrapper for the LLaVA-NeXT model, which answers questions about visual media (images and videos).
use_local (bool): If True, the inference call runs on the local VM; otherwise it is offloaded onto GRID-Cortex. Defaults to True.
This model is currently not available via Cortex.
rgbimage: The input RGB image of shape (M, N, 3). For video inputs, the path to the input video file is given instead.
prompt: The question to answer about the media.
Returns: The response to the prompt.
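As a minimal sketch of the expected image input (assuming the image is a NumPy array, as produced by AirGen's getImage(...).data in the example below), an (M, N, 3) uint8 array can be constructed directly:

```python
import numpy as np

# Build a dummy RGB image of shape (M, N, 3); real frames captured
# from the simulator are expected in this same layout.
M, N = 480, 640
img = np.zeros((M, N, 3), dtype=np.uint8)
img[:, :, 0] = 255  # fill the red channel

print(img.shape)  # (480, 640, 3)
```

Any array in this shape and dtype can be passed as the rgbimage argument.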
from grid.model.perception.vlm.llava_next import LLaVANeXT

# AirGenCar is provided by the GRID AirGen bindings
# (its import is omitted in this snippet).
car = AirGenCar()

# Capture an image from the AirGen simulator and run model inference on it.
img = car.getImage("front_center", "rgb").data

model = LLaVANeXT(use_local=True)
result = model.run(rgbimage=img, prompt="<prompt>")
print(result)
This code is licensed under the Apache 2.0 License.