import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

// Model and prompt mirror the Python example below
const model = "black-forest-labs/flux-dev";
const input = {
  prompt: "An astronaut riding a rainbow unicorn, cinematic, dramatic",
};
const [output] = await replicate.run(model, { input });
console.log(output);
All the latest models are on Replicate. They’re not just demos — they all actually work and have production-ready APIs.
AI shouldn’t be locked up inside academic papers and demos. Make it real by pushing it to Replicate.
runwayml/gen4-image-turbo
Gen-4 Image Turbo is cheaper and 2.5x faster than Gen-4 Image. An image model with references: use up to 3 reference images to create the exact image you need. Capture every angle.
451 runs
wan-video/wan-2.2-t2v-fast
A very fast and cheap PrunaAI-optimized version of Wan 2.2 A14B text-to-video.
16.7K runs
openai/gpt-5
OpenAI's new model excelling at coding, writing, and reasoning.
4.5K runs
bytedance/dreamina-3.1
4MP text-to-image model with enhanced cinematic-quality generation, precise style control, improved text rendering, and commercial design optimization.
6.7K runs
ideogram-ai/ideogram-character
Generate consistent characters from a single reference image. Outputs can be in many styles. You can also use inpainting to add your character to an existing image.
11.6K runs
runwayml/gen4-aleph
A new way to edit, transform and generate video
1.5K runs
openai/gpt-oss-120b
120b open-weight language model from OpenAI
12.5K runs
qwen/qwen-image
An image generation foundation model in the Qwen series that achieves significant advances in complex text rendering.
27.7K runs
minimax/hailuo-02-fast
A low-cost, fast version of Hailuo 02. Generates 6s and 10s videos at 512p.
3.1K runs
bytedance/omni-human
Turns your audio/video/images into professional-quality animated videos
1.8K runs
black-forest-labs/flux-krea-dev
An opinionated text-to-image model from Black Forest Labs in collaboration with Krea that excels in photorealism. Creates images that avoid the oversaturated "AI look".
41.6K runs
prunaai/wan-2.2-image
This model generates beautiful cinematic 2-megapixel images in 3-4 seconds and is derived from the Wan 2.2 model through optimization techniques from the pruna package.
30.4K runs
You can get started with any model with just one line of code. But as you do more complex things, you can fine-tune models or deploy your own custom code.
Our community has already published thousands of models that are ready to use in production. You can run these with one line of code.
import replicate

output = replicate.run(
    "black-forest-labs/flux-dev",
    input={
        "aspect_ratio": "1:1",
        "num_outputs": 1,
        "output_format": "jpg",
        "output_quality": 80,
        "prompt": "An astronaut riding a rainbow unicorn, cinematic, dramatic",
    },
)
print(output)
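In recent versions of the Python client, model outputs that are files come back as objects you can read directly. A minimal sketch, assuming replicate-python 1.0 or later and the output list from the call above, that saves the first image to disk:

# Save the first generated image locally.
# Assumes outputs are FileOutput objects (replicate-python >= 1.0).
with open("output.jpg", "wb") as f:
    f.write(output[0].read())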
You can improve models with your own data to create new models that are better suited to specific tasks.
Image models like SDXL can generate images of a particular person, object, or style.
Train a model:
training = replicate.trainings.create(
    destination="mattrothenberg/drone-art",
    version="ostris/flux-dev-lora-trainer:e440909d3512c31646ee2e0c7d6f6f4923224863a6a10c494606e79fb5844497",
    input={
        "steps": 1000,
        "input_images": "https://example.com/drone-images.zip",  # placeholder: a zip of your training images
        "trigger_word": "TOK",
    },
)
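Training runs asynchronously, so the call above returns right away. A minimal sketch, reusing the training object from above, that polls until the job reaches a terminal state:

import time

# Poll the training until it succeeds, fails, or is canceled.
while training.status not in ("succeeded", "failed", "canceled"):
    time.sleep(30)
    training = replicate.trainings.get(training.id)

print(training.status)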
This will result in a new model:
mattrothenberg/drone-art
Fantastical images of drones on land and in the sky
0 runs
Then, you can run it with one line of code:
output = replicate.run(
    "mattrothenberg/drone-art:abcde1234...",
    input={"prompt": "a photo of TOK forming a rainbow in the sky"},
)
You aren’t limited to the models on Replicate: you can deploy your own custom models using Cog, our open-source tool for packaging machine learning models.
Cog takes care of generating an API server and deploying it on a big cluster in the cloud. We scale up and down to handle demand, and you only pay for the compute that you use.
First, define the environment your model runs in with cog.yaml:
build:
  gpu: true
  system_packages:
    - "libgl1-mesa-glx"
    - "libglib2.0-0"
  python_version: "3.10"
  python_packages:
    - "torch==1.13.1"
predict: "predict.py:Predictor"
Next, define how predictions are run on your model with predict.py:
from cog import BasePredictor, Input, Path
import torch

class Predictor(BasePredictor):
    def setup(self):
        """Load the model into memory to make running multiple predictions efficient"""
        self.model = torch.load("./weights.pth")

    # The arguments and types the model takes as input
    def predict(
        self,
        image: Path = Input(description="Grayscale input image"),
    ) -> Path:
        """Run a single prediction on the model"""
        # preprocess and postprocess stand in for your own image helpers
        processed_image = preprocess(image)
        output = self.model(processed_image)
        return postprocess(output)
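With both files in place, you can test the model locally and push it to Replicate with the Cog CLI. A sketch of the workflow; the input file and model name are placeholders:

cog predict -i image=@input.jpg
cog push r8.im/your-username/your-model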
Thousands of businesses are building their AI products on Replicate. Your team can deploy an AI feature in a day and scale to millions of users, without having to be machine learning experts.
Learn more about our enterprise plans.
If you get a ton of traffic, Replicate scales up automatically to handle the demand. If you don't get any traffic, we scale down to zero and don't charge you a thing.
Replicate only bills you for how long your code is running. You don't pay for expensive GPUs when you're not using them.
Deploying machine learning models at scale is hard. If you've tried, you know. API servers, weird dependencies, enormous model weights, CUDA, GPUs, batching.
Prediction throughput (requests per second)
Metrics let you keep an eye on how your models are performing, and logs let you zoom in on particular predictions to debug how your model is behaving.
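You can also pull those details programmatically. A minimal sketch with the Python client; the prediction ID is a placeholder for one from your dashboard or API responses:

import replicate

# Fetch a prediction by ID and inspect its status and logs.
prediction = replicate.predictions.get("abc123")
print(prediction.status)
print(prediction.logs)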
With Replicate and tools like Next.js and Vercel, you can wake up with an idea and watch it hit the front page of Hacker News by the time you go to bed.