david20321/depth-anything-v3-metric-large

DA3Metric-Large metric depth estimation (Apache 2.0). Returns a 16-bit PNG depth map with metric metadata.

Depth Anything V3 — Metric Large

Estimate real-world depth in meters from a single image. Returns a 16-bit PNG depth map with metric scale metadata, ready for 3D reconstruction, parallax effects, robotics, or any application that needs actual distances (not just relative ordering).

Based on Depth Anything 3 (DA3METRIC-LARGE) by ByteDance. Apache 2.0 license.

Why this model?

Feature               This model                                     Typical DA3 wrappers
Output precision      16-bit PNG (65,535 levels)                     8-bit (256 levels)
Metric depth          Real distances in meters with decode formula   Relative depth only
Inference resolution  Up to 1120px (reduces ViT grid artifacts)      Default 504px (visible grid)
EXIF handling         Auto-normalizes orientation before inference   Often ignored (rotated outputs)
Raw float32 output    Optional NPZ with full-precision metric depth  Not available
Stable API contract   Versioned output schema (da3-metric/v1)        Unversioned

Quick start

Python

import replicate

output = replicate.run(
    "wolfire/depth-anything-v3-metric-large",
    input={"image": "https://example.com/photo.jpg"},
)

# Output includes metric metadata
print(f"Depth range: {output['depth_min_m']:.2f}m — {output['depth_max_m']:.2f}m")
print(f"Depth map URL: {output['depth_png']}")

JavaScript

import Replicate from "replicate";

const replicate = new Replicate();
const output = await replicate.run("wolfire/depth-anything-v3-metric-large", {
  input: { image: "https://example.com/photo.jpg" },
});

console.log(`Depth range: ${output.depth_min_m}m — ${output.depth_max_m}m`);
console.log(`Depth map: ${output.depth_png}`);

cURL

curl -sS -X POST "https://api.replicate.com/v1/models/wolfire/depth-anything-v3-metric-large/predictions" \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"input": {"image": "https://example.com/photo.jpg"}}'

Decoding the 16-bit depth map

The output depth_png is a 16-bit grayscale PNG. Convert pixel values back to meters using the scale_m_per_unit and offset_m fields from the output:

import numpy as np
from PIL import Image

# "depth.png" is the downloaded depth_png file; `output` is the prediction
# result from the quick start above.
arr = np.array(Image.open("depth.png"), dtype=np.uint16)
depth_meters = arr.astype(np.float32) * output["scale_m_per_unit"] + output["offset_m"]

The depth_min_m and depth_max_m fields give the 1st/99th percentile depth range used for encoding, so you know the effective metric bounds without decoding.
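
For a quick visual sanity check, you can map the decoded depth onto those reported bounds and save an 8-bit preview. A minimal sketch, reusing depth_meters and output from above:

# Normalize against the reported 1st/99th percentile bounds for display.
lo, hi = output["depth_min_m"], output["depth_max_m"]
vis = np.clip((depth_meters - lo) / (hi - lo), 0.0, 1.0)
Image.fromarray((vis * 255).astype(np.uint8)).save("depth_preview.png")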

Inputs

Parameter         Type      Default     Description
image             file/URL  (required)  Input image
focal_length_px   float     0           Focal length in pixels. 0 = auto-estimate using 60-degree HFOV heuristic (sketched below)
max_process_res   int       0           Cap inference resolution (0 = server default of 1120px). Higher = sharper depth but more VRAM
return_raw_depth  bool      false       Include depth_npz with float32 metric depth (no quantization loss)
include_base64    bool      true        Include depth_png_base64 for inline transport (no extra download)
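
If you know your camera's horizontal field of view, you can derive focal_length_px yourself rather than rely on the default. A sketch assuming the standard pinhole relation f = (W / 2) / tan(HFOV / 2) (presumably what the 60-degree auto-estimate applies) with a hypothetical camera spec; the key name inside the NPZ archive is not documented here, so the code takes the first array:

import math
import urllib.request
import numpy as np
import replicate

width = 4032     # hypothetical image width in pixels
hfov_deg = 70.0  # hypothetical: from your camera spec or EXIF
focal_px = (width / 2) / math.tan(math.radians(hfov_deg) / 2)

output = replicate.run(
    "david20321/depth-anything-v3-metric-large",
    input={
        "image": "https://example.com/photo.jpg",
        "focal_length_px": focal_px,
        "return_raw_depth": True,  # adds float32 depth_npz to the output
    },
)

# Load the full-precision metric depth (no 16-bit quantization loss).
urllib.request.urlretrieve(str(output["depth_npz"]), "depth.npz")
npz = np.load("depth.npz")
depth = npz[npz.files[0]]  # key name undocumented; take the first array
print(depth.shape, depth.dtype)  # (H, W) float32, in meters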

Outputs

Field              Description
depth_png          16-bit grayscale PNG depth map (hosted URL)
image              Alias of depth_png (for Replicate UI preview cards)
depth_npz          Float32 metric depth in NPZ format (when return_raw_depth=true)
depth_png_base64   Base64-encoded 16-bit PNG (when include_base64=true)
depth_min_m        Near depth bound in meters (1st percentile)
depth_max_m        Far depth bound in meters (99th percentile)
scale_m_per_unit   Multiply u16 pixel value by this to get meters
offset_m           Add this after scaling to get meters
focal_length_used  Focal length used for metric conversion (pixels)
process_res_used   Actual inference resolution used
contract_version   Output schema version (da3-metric/v1); see the check below
model_name         DA3METRIC-LARGE
model_ref          HuggingFace model reference
model_commit       Pinned DA3 source commit for reproducibility
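
Because the schema is versioned, downstream code can fail fast on a breaking change instead of silently mis-decoding. A minimal sketch:

# Refuse to decode outputs from a future, incompatible schema.
if output.get("contract_version") != "da3-metric/v1":
    raise RuntimeError(f"unsupported schema: {output.get('contract_version')}")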

Use cases

  • 3D photo effects / parallax — generate depth-based parallax from a single photo
  • Robotics & SLAM — metric depth for obstacle avoidance and mapping
  • AR/VR content — depth-aware compositing and occlusion
  • Visual effects — depth-of-field, fog, and atmospheric perspective
  • Point cloud generation — combine with camera intrinsics for 3D reconstruction (sketched below)
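
The point cloud case is mostly back-projection through the pinhole model. A minimal sketch, reusing depth_meters and output from the decoding section, and assuming focal_length_used matches the depth map's resolution with the principal point at the image center:

import numpy as np

h, w = depth_meters.shape
f = output["focal_length_used"]  # pixels; assumed valid at this resolution
cx, cy = w / 2.0, h / 2.0        # assumed principal point: image center

# Back-project each pixel (u, v) with depth z:
#   X = (u - cx) * z / f,  Y = (v - cy) * z / f,  Z = z
u, v = np.meshgrid(np.arange(w), np.arange(h))
z = depth_meters
points = np.stack([(u - cx) * z / f, (v - cy) * z / f, z], axis=-1)
print(points.reshape(-1, 3).shape)  # (H*W, 3) points in meters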

License

Apache 2.0 — the underlying DA3 model and this wrapper are both Apache-licensed.

Citation

@article{depth_anything_3,
  title={Depth Anything 3: Recovering the Visual Space from Any Views},
  author={Lin, Haotong and Chen, Yilun and Liew, Jun Hao and Luo, Jiashi and others},
  journal={arXiv preprint arXiv:2511.10647},
  year={2025}
}