adirik / gaussiandreamer

Fast text-to-3D Gaussian generation by bridging 2D and 3D diffusion models

Run time and cost

This model runs on Nvidia A40 (Large) GPU hardware.

Readme

GaussianDreamer

GaussianDreamer bridges 2D and 3D diffusion models for fast text-to-3D generation with Gaussian splatting. It initializes the asset from 3D diffusion model priors, then iteratively refines the 3D Gaussians with a 2D diffusion model (Stable Diffusion 2.1) to improve geometry and texture.

See the original paper, project page, and repository for more details.

How to use the API

To use GaussianDreamer, simply enter a text description of the 3D asset you would like to generate. Generation takes about 15 minutes, and the output is a .ply file containing the 3D Gaussians. The API input arguments are listed below, followed by a minimal example call:

  • prompt: Text prompt describing the 3D asset to generate.
  • negative_prompt: Text prompt describing attributes or features you don't want in your 3D asset.
  • guidance_scale: Adjusts the influence of classifier-free guidance during generation. Higher values make the model follow the prompt more closely.
  • max_steps: Number of training steps. Keeping the default value is strongly advised for optimal results.
  • avatar: Set to True to generate a 3D human avatar. Defaults to False.
  • seed: Random seed for reproducibility. Defaults to None; set a fixed integer for deterministic generation.
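
Below is a minimal sketch of calling the model through the Replicate Python client. The prompt, guidance_scale value, and seed are illustrative, not recommended settings; you may need to pin a specific model version hash (see the model page), and a valid REPLICATE_API_TOKEN must be set in your environment.

# Minimal sketch: generating a 3D asset with GaussianDreamer via the
# Replicate Python client (pip install replicate). Input names mirror the
# argument list above; the values are illustrative.
import replicate
import urllib.request

output = replicate.run(
    "adirik/gaussiandreamer",  # optionally pin a version: "adirik/gaussiandreamer:<version-hash>"
    input={
        "prompt": "a ripe strawberry",
        "negative_prompt": "blurry, low quality, deformed",
        "guidance_scale": 7.5,  # illustrative; higher values follow the prompt more closely
        "avatar": False,
        "seed": 42,             # fix the seed for reproducible generations
        # max_steps is omitted to keep the recommended default
    },
)

# The run returns a reference to the generated .ply file; str() yields its URL
# whether the client returns a plain URL string or a FileOutput object.
urllib.request.urlretrieve(str(output), "asset.ply")

The downloaded .ply stores the optimized 3D Gaussians and can be inspected with any Gaussian splatting viewer.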

References

@inproceedings{yi2023gaussiandreamer,
  title={GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models},
  author={Yi, Taoran and Fang, Jiemin and Wang, Junjie and Wu, Guanjun and Xie, Lingxi and Zhang, Xiaopeng and Liu, Wenyu and Tian, Qi and Wang, Xinggang},
  year={2024},
  booktitle={CVPR}
}