Restore images

Models that improve or restore images by deblurring, colorizing, and removing noise

Upscale images

Upscaling models that create high-quality images from low-quality images

Train a language model

Language models that you can fine-tune using Replicate's training API.

Make 3D stuff

Models that generate 3D objects, scenes, radiance fields, textures and multi-views.

Latest models

A 14B-parameter, lightweight, state-of-the-art open model trained with the Phi-3 datasets, which include both synthetic data and filtered publicly available website data, with a focus on high-quality and reasoning dense pro

Ultra high resolution images (up to 4096x4096) based on Stable Cascade

SDXL Canny controlnet with LoRA support.

Seamlessly create stunning product shots by blending with inspirational references for a fresh, modern look

Detect hate speech or toxic comments in tweets/texts

Kolors with style transfer, composition transfer and other IPAdapter techniques

The largest fully open-source flow-based generation model capable of text-to-image generation

A large-scale text-to-image generation model based on latent diffusion, developed by the Kuaishou Kolors team

From Sketch to Reality: Transforming Outlines into Lifelike Images

Visual instruction tuning towards large language and vision models with GPT-4 level capabilities

Run any ComfyUI workflow. Guide: https://github.com/fofr/cog-comfyui

Generate seamless 360 photos using SDXL

Real-ESRGAN for image upscaling on an A100

Face Restoration

A text-to-image model with greatly improved performance in image quality, typography, complex prompt understanding, and resource-efficiency

MimicMotion: High-quality human motion video generation with pose-guided control

Pinga marvada. Fine-tuned on modão tracks with the text token "modao"

Remove backgrounds from retail product images

Make realistic images of real people instantly (w/ ip-adapter-plus-face_sdxl_vit-h)

Qwen 2: A 72-billion-parameter language model fine-tuned for chat completions

PixArt Sigma 900M is a text-to-image generation model based on the PixArt Sigma architecture

araby.ai one-shot face swap

Detects whether a picture contains an anime face

NuminaMath is a series of language models that are trained to solve math problems using tool-integrated reasoning (TIR)

MARS5, a fully open-source (commercially usable) voice-cloning/TTS model with breakthrough prosody and realism

GPU accelerated replay renderer / video data clipper for comma.ai connect's openpilot route data. SEE README.

The Mistral-7B-Instruct-v0.3 large language model is an instruction-tuned version of Mistral-7B-v0.3

For background sound

Convert speech in audio to text

Generate high-resolution images

Cog wrapper for Ollama deepseek-coder-v2:236b

Audio to SRT

My Cat Xiaobai

Cog wrapper for Ollama llama3:70b

Cog wrapper for Ollama llama3:8b

Input a video. Ask anything about it

YOLOv10: Real-Time End-to-End Object Detection

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Take audio from one video and add it to a second video. Good for adding audio back to LivePortrait output.

Change the fps of a video without changing its length or speed

Portrait animation using a driving video source

⚡️ Fast audio transcription | whisper large-v3 | speaker diarization | word & sentence level timestamps | prompt | hotwords

Efficient Portrait Animation with Stitching and Retargeting Control

Kolors is a SOTA base image model for high-quality image generation

