Input

Run this model in Node.js with one line of code:

npx create-replicate --model=arielreplicate/paella_fast_text2image

Or set up a project from scratch:

Install Replicate’s Node.js client library:

npm install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import and set up the client:

import Replicate from "replicate";
import fs from "node:fs";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run arielreplicate/paella_fast_text2image using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "arielreplicate/paella_fast_text2image:8523142deec1317a3ae64b3115d1a0a8b1b9bba912ee63ea17675e9cfeda0ed9",
  {
    input: {
      prompt: "Highly detailed photograph of darth vader. artstation",
      num_outputs: 6
    }
  }
);

// To access the file URL:
console.log(output[0].url()); //=> "http://example.com"

// To write the file to disk:
await fs.promises.writeFile("my-image.png", output[0]);

To learn more, take a look at the guide on getting started with Node.js.

Install Replicate’s Python client library:

pip install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import the client:

import replicate

Run arielreplicate/paella_fast_text2image using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "arielreplicate/paella_fast_text2image:8523142deec1317a3ae64b3115d1a0a8b1b9bba912ee63ea17675e9cfeda0ed9",
    input={
        "prompt": "Highly detailed photograph of darth vader. artstation",
        "num_outputs": 6
    }
)
print(output)

To learn more, take a look at the guide on getting started with Python.
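With `num_outputs` set to 6, the call above returns a list of results. Depending on the client version, each item may be a file-like object or a plain URL string, so the helper below is a minimal sketch that handles both cases. The function name `save_outputs`, the `outputs` directory, and the `.png` naming scheme are assumptions made for illustration and are not part of the Replicate client.

```python
from pathlib import Path
from urllib.request import urlopen


def save_outputs(outputs, directory="outputs", prefix="image", suffix=".png"):
    """Write each item of a prediction's output list to its own file.

    Assumption: each item is either a file-like object exposing read()
    or a URL string. Returns the list of paths written.
    """
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, item in enumerate(outputs):
        if hasattr(item, "read"):
            # File-like output: read the bytes directly.
            data = item.read()
        else:
            # Otherwise treat the item as a URL and download it.
            with urlopen(str(item)) as response:
                data = response.read()
        path = target / f"{prefix}-{index}{suffix}"
        path.write_bytes(data)
        paths.append(path)
    return paths
```

Calling `save_outputs(output)` after `replicate.run(...)` would then write one file per generated image.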

To call the HTTP API directly, set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Run arielreplicate/paella_fast_text2image using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "arielreplicate/paella_fast_text2image:8523142deec1317a3ae64b3115d1a0a8b1b9bba912ee63ea17675e9cfeda0ed9",
    "input": {
      "prompt": "Highly detailed photograph of darth vader. artstation",
      "num_outputs": 6
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.
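The same request can be assembled without curl. The sketch below uses only the Python standard library and mirrors the URL, headers, and JSON body of the curl command; it builds the request but does not send it. The helper name `build_prediction_request` is hypothetical.

```python
import json
from urllib.request import Request

API_URL = "https://api.replicate.com/v1/predictions"
VERSION = (
    "arielreplicate/paella_fast_text2image:"
    "8523142deec1317a3ae64b3115d1a0a8b1b9bba912ee63ea17675e9cfeda0ed9"
)


def build_prediction_request(prompt, num_outputs, token):
    """Build (but do not send) the POST request shown in the curl command."""
    payload = {
        "version": VERSION,
        "input": {"prompt": prompt, "num_outputs": num_outputs},
    }
    return Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Prefer": "wait",  # same header as the curl example
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and parse the response body with `json.load`.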

You can run this model locally using Cog. First, install Cog:

brew install cog

If you don’t have Homebrew, there are other installation options available.

Run this to download the model and run it in your local environment:

cog predict r8.im/arielreplicate/paella_fast_text2image@sha256:8523142deec1317a3ae64b3115d1a0a8b1b9bba912ee63ea17675e9cfeda0ed9 \
  -i 'prompt="Highly detailed photograph of darth vader. artstation"' \
  -i 'num_outputs=6'

To learn more, take a look at the Cog documentation.

Alternatively, start the model's Docker image as a local server and send it a prediction request:

docker run -d -p 5000:5000 --gpus=all r8.im/arielreplicate/paella_fast_text2image@sha256:8523142deec1317a3ae64b3115d1a0a8b1b9bba912ee63ea17675e9cfeda0ed9
curl -s -X POST \
  -H "Content-Type: application/json" \
  -d $'{
    "input": {
      "prompt": "Highly detailed photograph of darth vader. artstation",
      "num_outputs": 6
    }
  }' \
  http://localhost:5000/predictions

To learn more, take a look at the Cog documentation.

Output

Generated in 5.3 seconds

This output was created using a different version of the model, arielreplicate/paella_fast_text2image:481ea32f.

Run time and cost

This model costs approximately $0.0011 to run on Replicate, or 909 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
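The relationship between the per-run price and the runs-per-dollar figure is simple arithmetic. The sketch below reproduces it and estimates the cost of a batch; the function names are illustrative, and the price is the approximate figure quoted above, so actual costs will vary with inputs.

```python
import math

COST_PER_RUN_USD = 0.0011  # approximate figure quoted above


def runs_per_dollar(cost_per_run=COST_PER_RUN_USD):
    """Number of whole runs that fit in one dollar."""
    return math.floor(1 / cost_per_run)


def estimated_cost(runs, cost_per_run=COST_PER_RUN_USD):
    """Approximate total cost in USD for a number of runs."""
    return runs * cost_per_run


print(runs_per_dollar())                   # 909
print(f"${estimated_cost(1000):.2f}")      # $1.10
```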

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 5 seconds, though prediction time varies significantly with the inputs.

Readme

Paella

Conditional text-to-image generation has seen countless recent improvements in quality, diversity, and fidelity. Nevertheless, most state-of-the-art models require numerous inference steps to produce faithful generations, resulting in performance bottlenecks for end-user applications. In this paper we introduce Paella, a novel text-to-image model that requires fewer than 10 steps to sample high-fidelity images. Its speed-optimized architecture can sample a single image in less than 500 ms while having 573M parameters. The model operates on a compressed and quantized latent space, is conditioned on CLIP embeddings, and uses an improved sampling function over previous works. Aside from text-conditional image generation, our model is able to do latent space interpolation and image manipulations such as inpainting, outpainting, and structural editing.