Official

stability-ai / stable-diffusion-3

A text-to-image model with greatly improved performance in image quality, typography, complex prompt understanding, and resource-efficiency

  • Public
  • 1.6M runs
  • $0.035 per image
  • Commercial use
  • GitHub
  • Paper
  • License

Input

string

Default: ""

string

The aspect ratio of your output image. This value is ignored if you are using an input image.

Default: "1:1"

number
(minimum: 0, maximum: 20)

The guidance scale tells the model how similar the output should be to the prompt.

Default: 3.5

file

Input image for image to image mode. The aspect ratio of your output will match this image.

number
(minimum: 0, maximum: 1)

Prompt strength (or denoising strength) when using image-to-image mode. A value of 1.0 corresponds to full destruction of the information in the input image.

Default: 0.85

integer
(minimum: 1, maximum: 28)

Number of steps to run the sampler for.

Default: 28

string

Format of the output images

Default: "webp"

integer
(minimum: 0, maximum: 100)

Quality of the output images, from 0 to 100. 100 is best quality, 0 is lowest quality.

Default: 90

integer

Set a seed for reproducibility. Random by default.

string

Negative prompts do not really work in SD3. Using a negative prompt will change your output in unpredictable ways.

Default: ""
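
The ranges and defaults listed above can be collected into a single input payload. The following is a minimal sketch, not the model's official schema: the page as captured here shows only types, ranges, and defaults, so every field name below (`prompt`, `aspect_ratio`, `guidance_scale`, `image`, `prompt_strength`, `steps`, `output_format`, `output_quality`, `seed`, `negative_prompt`) and the `build_input` helper itself are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


def _check_range(name: str, value: float, low: float, high: float) -> None:
    """Raise ValueError if value falls outside the inclusive range [low, high]."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def build_input(
    prompt: str,
    aspect_ratio: str = "1:1",
    guidance_scale: float = 3.5,
    image: Optional[str] = None,
    prompt_strength: float = 0.85,
    steps: int = 28,
    output_format: str = "webp",
    output_quality: int = 90,
    seed: Optional[int] = None,
    negative_prompt: str = "",
) -> Dict[str, Any]:
    """Build an input dict using the documented ranges and defaults.

    All key names are hypothetical; check the model's real schema before use.
    """
    _check_range("guidance_scale", guidance_scale, 0, 20)
    _check_range("prompt_strength", prompt_strength, 0, 1)
    _check_range("steps", steps, 1, 28)
    _check_range("output_quality", output_quality, 0, 100)

    payload: Dict[str, Any] = {
        "prompt": prompt,
        "guidance_scale": guidance_scale,
        "steps": steps,
        "output_format": output_format,
        "output_quality": output_quality,
    }
    if image is not None:
        # The aspect ratio is ignored when an input image is supplied,
        # so it is left out and the image-to-image strength is sent instead.
        payload["image"] = image
        payload["prompt_strength"] = prompt_strength
    else:
        payload["aspect_ratio"] = aspect_ratio
    if seed is not None:
        # Omitting the seed keeps the documented random-by-default behaviour.
        payload["seed"] = seed
    if negative_prompt:
        # Negative prompts change SD3 output unpredictably, so send one
        # only when it was explicitly requested.
        payload["negative_prompt"] = negative_prompt
    return payload
```

A payload built this way could then be passed to a client call such as `replicate.run("stability-ai/stable-diffusion-3", input=payload)` from the Replicate Python client, assuming the field names match the model's actual schema. Validating locally catches out-of-range values such as `steps=29` before a request is sent.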

Pricing

Official model
Pricing for official models works differently from other models. Instead of being billed by time, you’re billed by input and output, making pricing more predictable.

This model is priced by how many images are generated.

Type      Per unit          Per $1
Output    $0.035 / image    28 images / $1

For example, generating 100 images should cost around $3.50.
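
The per-image arithmetic can be checked with a short calculation. This is a sketch only: the `estimate_cost` and `images_per_budget` helpers are hypothetical names, and the price constant is simply the $0.035 figure quoted on this page, which may change.

```python
from decimal import Decimal

# Price per generated image in USD, as quoted on this page.
PRICE_PER_IMAGE = Decimal("0.035")


def estimate_cost(num_images: int) -> Decimal:
    """Return the estimated cost in USD for generating num_images images."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return PRICE_PER_IMAGE * num_images


def images_per_budget(budget_usd: str) -> int:
    """Return how many whole images fit in a budget given as a decimal string."""
    return int(Decimal(budget_usd) // PRICE_PER_IMAGE)


if __name__ == "__main__":
    print(estimate_cost(100))        # cost of 100 images in USD
    print(images_per_budget("1"))    # whole images per dollar
```

`Decimal` is used instead of `float` so that the currency arithmetic stays exact; one dollar buys 28 whole images because 1 / 0.035 is about 28.57 and partial images are not billed.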

Check out our docs for more information about how per-image pricing works on Replicate.

Readme

Stable Diffusion 3 Medium is a 2-billion-parameter text-to-image model developed by Stability AI. It excels at photorealism, typography, and prompt following.

Stable Diffusion 3 on Replicate can be used for commercial work.

Model

Architecture diagram

Stable Diffusion 3 Medium is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features greatly improved performance in image quality, typography, complex prompt understanding, and resource-efficiency.

For more technical details, please refer to the Research paper.

Safety

As part of our safety-by-design and responsible AI deployment approach, Stability AI implements safety measures throughout the development of our models, from the time we begin pre-training a model to the ongoing development, fine-tuning, and deployment of each model. We have implemented a number of safety mitigations that are intended to reduce the risk of severe harms. However, we recommend that developers conduct their own testing and apply additional mitigations based on their specific use cases.

For more about our approach to Safety, please visit our Safety page.