adirik / imagedream

Image-Prompt Multi-view Diffusion for 3D Generation

  • Public
  • 1.5K runs
  • L40S
  • GitHub
  • Paper
  • License

Input

image
file (required)
Image to generate a 3D object from.

prompt
string (required)
Prompt to generate a 3D object.

negative_prompt
string
Prompt for the negative class. If not specified, the default prompt below is used.
Default: "ugly, bad anatomy, blurry, pixelated obscure, unnatural colors, poor lighting, dull, and unclear, cropped, lowres, low quality, artifacts, duplicate, morbid, mutilated, poorly drawn face, deformed, dehydrated, bad proportions"

guidance_scale
number (minimum: 1, maximum: 50)
The scale of the guidance loss. Higher values produce meshes that follow the inputs more closely but may also introduce artifacts.
Default: 5

shading
boolean
Whether to use shading when generating the 3D object. Roughly 40% slower, but produces higher-quality results.
Default: false

num_steps
integer (minimum: 5000, maximum: 15000)
Number of iterations to run the model for.
Default: 12500

seed
integer
The seed to use for the generation. If not specified, a random value will be used.
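For reference, a complete input payload matching this schema might look like the following sketch. The image URL, prompt, and seed are placeholder values, and only image and prompt are required:

    example_input = {
        "image": "https://example.com/astronaut.png",   # placeholder URL
        "prompt": "an astronaut riding a horse",
        "negative_prompt": "ugly, bad anatomy, blurry, lowres, low quality",
        "guidance_scale": 5,
        "shading": False,
        "num_steps": 12500,
        "seed": 42,
    }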

Output

The output is an interactive 3D model preview; use your mouse to zoom and rotate the model.

Run time and cost

This model costs approximately $2.77 per run on Replicate, though the exact cost varies depending on your inputs. It is also open source, and you can run it on your own computer with Docker.

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 48 minutes.

Readme

ImageDream

ImageDream is a text- and image-to-3D model from ByteDance that leverages a multi-view diffusion model with canonical camera coordination for enhanced geometric and textural accuracy. It excels at creating accurate and detailed 3D objects by using a multi-level image-prompt controller for precise control over the modeling process. ImageDream outperforms existing state-of-the-art single-image 3D generators in geometry and texture quality, as demonstrated through extensive user studies and quantitative evaluations. See the paper and original repository.

How to use the API

To use ImageDream, enter a text description and a corresponding image of the 3D asset you want to generate. Depending on the parameters you set, the 3D model will be generated in 1-2 hours. The input arguments are as follows; an example API call is shown after the list:

  • image: Image of the object to generate a 3D object from. The object should be centered and should not be too small or too large in the image.
  • prompt: Short text description of the 3D object to generate.
  • negative_prompt: Short text description of attributes to avoid in the generated 3D object.
  • guidance_scale: The higher the value, the more closely the generated 3D object follows the inputs.
  • shading: If set to true, the texture of the generated 3D object is higher quality, but generation takes ~2h. If set to false, the texture is lower quality, but generation takes ~1h.
  • num_steps: Number of training steps. It is strongly advised to keep the default value for optimal results.
  • seed: Seed for reproducibility. Defaults to None, which uses a random seed; set it to a fixed value for deterministic generation.
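The sketch below shows one way to call the model with the Replicate Python client. It is a minimal example under stated assumptions: the version hash is the truncated one shown on this page (look up the full, current version identifier before running), the image URL and prompt are placeholders, and the exact output format should be checked against the model's output schema.

    import replicate  # pip install replicate; requires REPLICATE_API_TOKEN in the environment

    # The version hash below is truncated/illustrative; use the full hash of
    # the version you want to run, as listed on the Replicate model page.
    output = replicate.run(
        "adirik/imagedream:c94d52fa",
        input={
            "image": "https://example.com/astronaut.png",  # placeholder image URL
            "prompt": "an astronaut riding a horse",
            "guidance_scale": 5,
            "shading": False,   # False: ~1h, lower-quality texture; True: ~2h, higher quality
            "num_steps": 12500,
            "seed": 42,
        },
    )

    # The output format is not documented above; printing it reveals the URL(s)
    # of the generated 3D asset, which can then be downloaded.
    print(output)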

References

@article{wang2023imagedream,
  title={ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation},
  author={Wang, Peng and Shi, Yichun},
  journal={arXiv preprint arXiv:2312.02201},
  year={2023}
}