adirik / marigold

Monocular depth estimation

Run time and cost

This model runs on Nvidia A40 GPU hardware. Predictions typically complete within 56 seconds. The predict time for this model varies significantly based on the inputs.

Readme

Marigold

Marigold is a diffusion model and associated fine-tuning protocol for monocular depth estimation. See the original repository and paper for details.

API Usage

To use the model, upload the image (ideally RGB or grayscale) for which you would like to estimate depth. The API returns two depth map images: one grayscale and one spectral.

Input parameters are as follows:
- image: RGB or grayscale input image for the model. Use an RGB image for best results.
- resize_input: whether to resize the input image to a maximum resolution of 768. Defaults to True.
- num_infer: number of inference passes to perform. If >1, the depth predictions are ensembled. A higher number yields better results but runs slower.
- denoise_steps: number of denoising steps per inference pass. More steps yield higher accuracy but slower inference.
- regularizer_strength: ensembling parameter, weight of the optimization regularizer.
- reduction_method: ensembling parameter, method used to merge the aligned depth maps. Choose between ["mean", "median"].
- max_iter: ensembling parameter, maximum number of optimization iterations.
- seed: (optional) seed for reproducibility. A random seed is used if left as None.
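
The parameters above can be assembled into a request payload as in the following minimal sketch. It assumes the `replicate` Python client is installed and that `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_marigold_input` and `run_marigold` are hypothetical, the validation rules are illustrative rather than taken from the model's code, and the sketch assumes `reduction_method` accepts "mean" or "median". The model version ID is left as a placeholder because it is not given here.

```python
from typing import Any, Dict, Optional

# Assumed set of accepted values for reduction_method.
ALLOWED_REDUCTIONS = ("mean", "median")


def build_marigold_input(
    image: Any,
    *,
    resize_input: Optional[bool] = None,
    num_infer: Optional[int] = None,
    denoise_steps: Optional[int] = None,
    regularizer_strength: Optional[float] = None,
    reduction_method: Optional[str] = None,
    max_iter: Optional[int] = None,
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Build an input dict, omitting unset options so server defaults apply."""
    # Illustrative sanity checks (hypothetical, not the model's own validation).
    for name, value in (("num_infer", num_infer),
                        ("denoise_steps", denoise_steps),
                        ("max_iter", max_iter)):
        if value is not None and value < 1:
            raise ValueError(f"{name} must be at least 1, got {value}")
    if reduction_method is not None and reduction_method not in ALLOWED_REDUCTIONS:
        raise ValueError(f"reduction_method must be one of {ALLOWED_REDUCTIONS}")

    options = {
        "resize_input": resize_input,
        "num_infer": num_infer,
        "denoise_steps": denoise_steps,
        "regularizer_strength": regularizer_strength,
        "reduction_method": reduction_method,
        "max_iter": max_iter,
        "seed": seed,
    }
    payload: Dict[str, Any] = {"image": image}
    # Keep only the options the caller set explicitly.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


def run_marigold(model_ref: str, payload: Dict[str, Any]) -> Any:
    """Send the payload to the hosted model.

    model_ref is expected to look like "adirik/marigold:<version-id>";
    the version ID must be copied from the model page.
    """
    import replicate  # third-party client, imported lazily

    return replicate.run(model_ref, input=payload)
```

For example, `build_marigold_input(open("room.png", "rb"), num_infer=5, seed=0)` produces a payload that sets only the ensemble size and seed and leaves every other parameter at the model's default. Fixing `seed` is what makes repeated runs comparable.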

References

@misc{ke2023repurposing,
      title={Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation}, 
      author={Bingxin Ke and Anton Obukhov and Shengyu Huang and Nando Metzger and Rodrigo Caye Daudt and Konrad Schindler},
      year={2023},
      eprint={2312.02145},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}