jd7h / zero123plusplus

Turn an image into a set of images from different 3D angles

  • Public
  • 9.1K runs

Run time and cost

This model costs approximately $0.10 per run on Replicate (about 10 runs per $1), though the cost varies with your inputs. It is also open source, and you can run it on your own computer with Docker.

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 106 seconds. The predict time for this model varies significantly based on the inputs.

Readme

Zero123++

Zero123++ is a base diffusion model that turns a single image into a set of consistent multi-view images. The input image must be square, and the recommended resolution is at least 320x320.
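
One way to meet the square-input requirement is to pad the shorter side before upload. The sketch below only computes the padding and upscaling plan from the image dimensions. It is not part of Zero123++: `plan_square_input` and `SquarePlan` are hypothetical names, and centering the image on a square canvas (rather than cropping) is an assumption made for illustration.

```python
from dataclasses import dataclass

# Recommended minimum side length, taken from the README above.
MIN_SIDE = 320


@dataclass(frozen=True)
class SquarePlan:
    side: int      # side length of the square canvas, before any scaling
    offset_x: int  # x position at which to paste the original image
    offset_y: int  # y position at which to paste the original image
    scale: float   # upscale factor needed to reach the minimum side length


def plan_square_input(width: int, height: int, min_side: int = MIN_SIDE) -> SquarePlan:
    """Hypothetical helper: plan how to center an image on a square canvas."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    side = max(width, height)
    # Center the original image on the canvas.
    offset_x = (side - width) // 2
    offset_y = (side - height) // 2
    # Only upscale when the canvas is smaller than the recommended minimum.
    scale = max(1.0, min_side / side)
    return SquarePlan(side, offset_x, offset_y, scale)


if __name__ == "__main__":
    print(plan_square_input(400, 300))
    print(plan_square_input(100, 160))
```

The resulting offsets and scale can then be applied with any image library before sending the image to the model.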

Output views are a fixed set of camera poses relative to the input view:

  • Azimuth (degrees): 30, 90, 150, 210, 270, 330.
  • Elevation (degrees): 30, -20, 30, -20, 30, -20.
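
The two lists above read most naturally as position-wise pairs, one (azimuth, elevation) pair per output view. The short sketch below makes that pairing explicit. The assumption that the lists are index-aligned and given in degrees, and the names `AZIMUTHS_DEG`, `ELEVATIONS_DEG`, and `output_views`, are illustrative and not part of the model's API.

```python
# Values copied from the list above; assumed to be degrees.
AZIMUTHS_DEG = (30, 90, 150, 210, 270, 330)
ELEVATIONS_DEG = (30, -20, 30, -20, 30, -20)


def output_views():
    """Pair each azimuth with the elevation at the same position."""
    return list(zip(AZIMUTHS_DEG, ELEVATIONS_DEG))


def azimuth_steps():
    """Angular gap between consecutive views, wrapping around 360 degrees."""
    n = len(AZIMUTHS_DEG)
    return [(AZIMUTHS_DEG[(i + 1) % n] - AZIMUTHS_DEG[i]) % 360 for i in range(n)]


if __name__ == "__main__":
    for index, (azimuth, elevation) in enumerate(output_views()):
        print(f"view {index}: azimuth={azimuth} deg, elevation={elevation} deg")
```

Under this reading, the six views are spaced evenly around the object and alternate between a higher and a lower camera position.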

Paper

If you found Zero123++ helpful, please cite the paper:

@misc{shi2023zero123plus,
      title={Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model}, 
      author={Ruoxi Shi and Hansheng Chen and Zhuoyang Zhang and Minghua Liu and Chao Xu and Xinyue Wei and Linghao Chen and Chong Zeng and Hao Su},
      year={2023},
      eprint={2310.15110},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}