The DepthAnything_V2 class implements a wrapper around the Depth Anything V2 model, which estimates depth maps from RGB images. The model supports 'metric' and 'relative' modes, each of which loads a different pre-trained checkpoint. This wrapper uses the ViT-Large encoder.
Parameters:
- mode (str): The mode of the model, either 'metric' or 'relative'. Defaults to 'metric'.
- use_local (bool): If True, the inference call runs on the local VM; otherwise it is offloaded onto GRID-Cortex. Defaults to False.

Inputs and outputs:
- rgbimage: The input RGB image of shape (M, N, 3).
- Returns: The predicted depth map of shape (M, N).
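For instance, only the constructor arguments change when switching to metric depth estimation with local inference (a minimal sketch; the argument names and values are exactly those documented above):

model = DepthAnything_V2(use_local=True, mode='metric')  # metric-depth checkpoint, runs on the local VM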
Example:

from grid.model.perception.depth.depth_anything_v2 import DepthAnything_V2
# NOTE: the import path for AirGenCar is assumed; adjust it to your GRID installation.
from grid.robot.wheeled.airgen_car import AirGenCar

car = AirGenCar()

# Capture an image from the AirGen simulator and run model inference on it.
img = car.getImage("front_center", "rgb").data

model = DepthAnything_V2(use_local=False, mode='relative')
result = model.run(rgbimage=img)
print(result.shape)  # (M, N)
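Since 'relative' mode returns depth only up to an unknown scale, a common follow-up step is to normalize the map before visualization. A minimal NumPy sketch (the min-max scaling below is an assumed post-processing step, not part of the wrapper; result is the depth map from the example above):

import numpy as np

# Scale the relative depth map to [0, 255] for display.
# NOTE: min-max normalization is an assumption, not wrapper behavior.
depth = result.astype(np.float32)
depth_vis = ((depth - depth.min()) / (depth.max() - depth.min() + 1e-8) * 255.0).astype(np.uint8)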
This code is licensed under the Apache 2.0 License.