Explore
Featured models
![](https://tjzk.replicate.delivery/models_models_featured_image/db4434b4-7b0f-49f7-b78a-774fe9e630a7/batou.jpeg)
batouresearch/high-resolution-controlnet-tile
UPDATE: new upscaling algorithm for much improved image quality. Fermat.app open-source implementation of an efficient ControlNet 1.1 tile for high-quality upscales. Increase the creativity to encourage hallucination.
![](https://tjzk.replicate.delivery/models_models_featured_image/81ca001f-6a0a-4bef-b2f1-32466887df20/meta-logo.png)
meta/meta-llama-3.1-405b-instruct
Meta's flagship 405 billion parameter language model, fine-tuned for chat completions
![](https://tjzk.replicate.delivery/models_models_featured_image/75b63af2-c04b-477a-b39e-c4734c10f81e/kolors_ipadapter_two.webp)
fofr/kolors-with-ipadapter
Kolors with style transfer, composition transfer and other IPAdapter techniques
![](https://tjzk.replicate.delivery/models_models_featured_image/0411f758-80e6-4794-bd5d-d04198d891a5/image-90.png)
stability-ai/stable-diffusion-3
A text-to-image model with greatly improved performance in image quality, typography, complex prompt understanding, and resource-efficiency
![](https://tjzk.replicate.delivery/models_models_featured_image/79dabb3c-b8b2-4952-aca4-558b0c8848a0/live-portrait.gif)
fofr/live-portrait
Portrait animation using a driving video source
![](https://tjzk.replicate.delivery/models_models_featured_image/8c0e2917-501a-41ed-aadb-65886a34dcf9/ic-light-featured.png)
zsxkib/ic-light
✍️✨ Prompts to auto-magically relight your images
I want to…
Generate images
Models that generate images from text prompts
Use a language model
Models that can understand and generate text
Caption images
Models that generate text from images
Edit images
Tools for manipulating images.
Restore images
Models that improve or restore images by deblurring, colorization, and removing noise
Upscale images
Upscaling models that create high-quality images from low-quality images
Get embeddings
Models that generate embeddings from inputs
Extract text from images
Optical character recognition (OCR) and text extraction
Chat with images
Ask language models about images
Train a language model
Language models that you can fine-tune using Replicate's training API.
Transcribe speech
Models that convert speech to text
Use a face to make images
Make realistic images of people instantly
Use handy tools
Toolbelt-type models for videos and images.
Generate music
Models to generate and modify music
Generate videos
Models that create and edit videos
Generate speech
Convert text to speech
Make 3D stuff
Models that generate 3D objects, scenes, radiance fields, textures and multi-views.
Get structured data
Language models that support grammar-based decoding as well as JSON Schema constraints.
Popular models
SDXL-Lightning by ByteDance: a fast text-to-image model that makes high-quality images in 4 steps
multilingual-e5-large: A multi-language text embedding model
A text-to-image generative AI model that creates beautiful images
Practical face restoration algorithm for *old photos* or *AI-generated faces*
A latent text-to-image diffusion model capable of generating photo-realistic images given any text input
Proteus v0.2 shows subtle yet significant improvements over Version 0.1. It demonstrates enhanced prompt understanding that surpasses MJ6, while also approaching its stylistic capabilities.
Latest models
NuminaMath is a series of language models that are trained to solve math problems using tool-integrated reasoning (TIR)
MARS5, a fully open-source (commercially usable) voice-cloning/TTS model with breakthrough prosody and realism.
GPU accelerated replay renderer / video data clipper for comma.ai connect's openpilot route data. SEE README.
The Mistral-7B-Instruct-v0.3 Large Language Model is an instruct fine-tuned version of the Mistral-7B-v0.3
Cog wrapper for Ollama deepseek-coder-v2:236b
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
Take audio from one video and add it to a second video. Good for adding back audio to liveportrait.
Change the fps of a video without changing its length or speed
⚡️ Fast audio transcription | whisper large-v3 | speaker diarization | word & sentence level timestamps | prompt | hotwords
Efficient Portrait Animation with Stitching and Retargeting Control
Kolors is a SOTA base image model for high quality image generation
The API automatically detects objects in an input image and returns their positional and mask information.
Bilateral Reference for High-Resolution Dichotomous Image Segmentation (arXiv 2024)
The InternLM2.5 release open-sources a 7 billion parameter base model and a chat model tailored for practical scenarios.
Phi-3-Mini-4K-Instruct is a 3.8B-parameter, lightweight, state-of-the-art open model trained with the Phi-3 datasets
Qwen2 57 billion parameter language model from Alibaba Cloud, fine-tuned for chat completions
GLM-4V is a multimodal model released by Tsinghua University that is competitive with GPT-4o and establishes a new SOTA on several benchmarks, including OCR.
Convert speech in audio to text w/ `tiny`, `small`, `base`, and `large-v3` models
Dolphin-2.9 has a variety of instruction, conversational, and coding skills. It also has initial agentic abilities and supports function calling
Image generation with inpaint strength, LoRAs via custom_urls, and an enhancer.
Depth estimation with faster inference speed, fewer parameters, and higher depth accuracy.
Hermes-2 Θ (Theta) 70B is the continuation of the experimental merged model released by Nous Research
Hermes 2 Pro is trained on an updated and cleaned version of the OpenHermes 2.5 Dataset, as well as a newly introduced Function Calling and JSON Mode dataset developed in-house