vufinder/vggt-1b

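A feed-forward neural network that directly infers all key 3D attributes of a scene, including camera parameters, depth maps, point maps, and 3D point tracks, from one, a few, or hundreds of views.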

VGGT: Visual Geometry Grounded Transformer

Visual Geometry Group, University of Oxford; Meta AI

Jianyuan Wang, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rupprecht, David Novotny

@inproceedings{wang2025vggt,
  title={VGGT: Visual Geometry Grounded Transformer},
  author={Wang, Jianyuan and Chen, Minghao and Karaev, Nikita and Vedaldi, Andrea and Rupprecht, Christian and Novotny, David},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2025}
}

Note: This model uses VGGT-1B-Commercial weights.