# MoGe-2 — Single-Image Geometry & Normals Estimation

This model wraps MoGe-2 (v2) to estimate dense 3D geometry from a single RGB image. It produces depth, surface normals, point clouds, meshes, and camera intrinsics suitable for downstream computer-vision and 3D reconstruction workflows. This is not an image-generation model.
## What this model does

Given a single RGB image, the model predicts:

- Depth map (raw EXR + visualized PNG)
- Surface normals (visualized PNG)
- 3D point cloud (PLY + EXR)
- Textured 3D mesh (GLB)
- Camera intrinsics matrix
- Horizontal and vertical field of view (FoV)

All outputs are deterministic and geometry-aware.
## Inputs

Required

- `image`: Input RGB image (JPG or PNG).

Optional

- `fov_x`: Horizontal FoV in degrees. If omitted, the model estimates the FoV automatically.
- `fp16`: Enable FP16 inference for faster GPU execution.
- `resize_to`: Resize the image so its long edge matches this value before inference.
- `resolution_level`: Integer from 0 to 9. Higher values preserve finer geometric detail at higher compute cost.
- `num_tokens`: Advanced control over inference resolution. Overrides `resolution_level` if provided.
- `threshold`: Threshold used to remove depth discontinuities before meshing. Use a very large value to effectively disable edge removal.
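To give a feel for what `threshold` controls: the exact edge-removal rule lives inside the wrapper, but a common approach (shown here as a rough sketch, not the model's actual implementation) is to drop pixels whose relative depth difference to a neighbour exceeds the threshold, so that the mesher does not stitch triangles across depth jumps. The function name `mask_depth_discontinuities` is hypothetical.

```python
import numpy as np

def mask_depth_discontinuities(depth: np.ndarray, threshold: float) -> np.ndarray:
    """Return a boolean mask that is False near sharp depth jumps.

    Illustrative only: flags a pixel when the relative depth difference
    to a horizontal or vertical neighbour exceeds `threshold`.
    """
    valid = np.ones_like(depth, dtype=bool)
    # Relative difference with the right-hand neighbour.
    dx = np.abs(np.diff(depth, axis=1)) / np.minimum(depth[:, :-1], depth[:, 1:])
    # Relative difference with the neighbour below.
    dy = np.abs(np.diff(depth, axis=0)) / np.minimum(depth[:-1, :], depth[1:, :])
    valid[:, :-1] &= dx <= threshold
    valid[:, 1:] &= dx <= threshold
    valid[:-1, :] &= dy <= threshold
    valid[1:, :] &= dy <= threshold
    return valid

# Example: a flat region survives, a sharp jump is masked out.
depth = np.array([[1.0, 1.0, 10.0],
                  [1.0, 1.0, 10.0]])
mask = mask_depth_discontinuities(depth, threshold=0.5)
```

With a very large threshold no pixel is ever flagged, which matches the documented way to disable edge removal.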
## Outputs

The model returns a JSON object containing file outputs and metadata.

File outputs

- `normal_png`: Visualized surface normals.
- `depth_vis_png`: Colorized depth preview.
- `depth_exr`: Raw depth values.
- `points_exr`: Raw 3D point data.
- `pointcloud_ply`: 3D point cloud.
- `mesh_glb`: Textured 3D mesh.
- `intrinsics_json`: Camera intrinsics matrix.
- `fov_json`: Horizontal and vertical FoV.

Scalar outputs

- `fov_x_deg`
- `fov_y_deg`

On Replicate, all file outputs are returned as hosted download URLs.
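The FoV scalars and the intrinsics matrix carry the same information under a pinhole camera model: the focal length in pixels and the field of view determine each other given the image size. The helper names below are illustrative, not part of the model's API, but the math is the standard pinhole relation.

```python
import math

def fov_to_focal(fov_deg: float, size_px: int) -> float:
    """Pinhole focal length in pixels from a field of view and image size."""
    return (size_px / 2) / math.tan(math.radians(fov_deg) / 2)

def focal_to_fov(focal_px: float, size_px: int) -> float:
    """Inverse: field of view in degrees from focal length and image size."""
    return math.degrees(2 * math.atan((size_px / 2) / focal_px))

# Example: a 90-degree horizontal FoV on a 1920 px wide image
# corresponds to fx = 960 px.
fx = fov_to_focal(90.0, 1920)
```

This is handy for sanity-checking `fov_x_deg` / `fov_y_deg` against the values in `intrinsics_json`.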
## Example usage (Python)

```python
import replicate

output = replicate.run(
    "USERNAME/moge-2-normals",
    input={
        "image": open("input.jpg", "rb"),
        "resolution_level": 9,
    },
)

print(output["normal_png"])
print(output["mesh_glb"])
print(output["fov_x_deg"], output["fov_y_deg"])
```
## Intended use

Supported

- Single-image 3D reconstruction
- Scene geometry estimation
- Camera calibration
- Geometry-aware editing pipelines
- Research and prototyping

Not intended for

- Image generation
- Artistic image synthesis
- Face recognition
- Biometric identification
## Model details

- Base model: MoGe-2 (v2)
- Task: single-image geometry estimation
- Framework: PyTorch
- Inference: GPU-accelerated
- Wrapper: Replicate Cog

The wrapper exposes MoGe-2 as a clean, production-ready API.
## License and attribution (IMPORTANT)

### Wrapper code (this Replicate model)

The wrapper code (Cog configuration, inference logic, and glue code) is released under the MIT License, unless otherwise stated. You are free to use, modify, distribute, and deploy it, subject to the MIT license terms.
### Upstream model: MoGe-2

This model uses MoGe-2, developed by Microsoft Research and collaborators.

- Repository: https://github.com/microsoft/MoGe
- License: MIT License

The MIT License permits use, modification, distribution, and commercial use, provided that the original copyright and license notice are included. This Replicate model does not claim ownership of the MoGe-2 architecture or weights; all credit for the core model belongs to the original authors.
## Third-party dependencies

This project relies on standard open-source libraries, including but not limited to:

- PyTorch
- OpenCV
- NumPy
- Hugging Face tooling

Each dependency retains its respective license.
## Citation

If you use this model in research or academic work, please cite the original MoGe paper and repository:

> MoGe: Single-Image Geometry Estimation
> Microsoft Research and collaborators

Refer to the official MoGe repository for the most up-to-date citation information.
## Notes

- This is a geometry estimation model, not a generative model.
- Outputs follow OpenGL coordinate conventions.
- Results are deterministic for the same input.
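Since the outputs follow OpenGL coordinate conventions (x right, y up, z toward the viewer), points may need converting before use with tools that expect OpenCV-style camera axes (x right, y down, z forward). The standard conversion flips the Y and Z axes; the helper name below is illustrative, not part of this model's API.

```python
import numpy as np

def opengl_to_opencv(points: np.ndarray) -> np.ndarray:
    """Flip the Y and Z axes to move point coordinates from the OpenGL
    camera convention to the OpenCV camera convention."""
    return points * np.array([1.0, -1.0, -1.0])

# A point 3 units in front of an OpenGL camera has negative z;
# in OpenCV convention the same point has positive z and flipped y.
converted = opengl_to_opencv(np.array([[1.0, 2.0, -3.0]]))
```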