Explore
Featured models
black-forest-labs / flux-canny-pro
Professional edge-guided image generation. Control structure and composition using Canny edge detection.
black-forest-labs / flux-fill-pro
Professional inpainting and outpainting model with state-of-the-art performance. Edit or extend images with natural, seamless results.
black-forest-labs / flux-1.1-pro-ultra
FLUX1.1 [pro] in ultra and raw modes. Images are up to 4 megapixels. Use raw mode for realism.
black-forest-labs / flux-redux-dev
Open-weight image variation model. Create new versions while preserving key elements of your original.
recraft-ai / recraft-v3
Recraft V3 (code-named red_panda) is a text-to-image model that can render long passages of text and generate images in a wide range of styles. As of today, it is SOTA in image generation according to the Text-to-Image Benchmark by Artificial Analysis.
ibm-granite / granite-3.0-8b-instruct
Granite-3.0-8B-Instruct is a lightweight, open-source 8B parameter model designed to excel in instruction-following tasks such as summarization, problem-solving, text translation, reasoning, code tasks, function-calling, and more.
I want to…
Generate images
Models that generate images from text prompts
Use a language model
Models that can understand and generate text
Upscale images
Upscaling models that create high-quality images from low-quality images
Caption images
Models that generate text from images
The FLUX family of models
The FLUX family of text-to-image models from Black Forest Labs
Restore images
Models that improve or restore images by deblurring, colorization, and removing noise
Get embeddings
Models that generate embeddings from inputs
Extract text from images
Optical character recognition (OCR) and text extraction
Transcribe speech
Models that convert speech to text
Use handy tools
Toolbelt-type models for videos and images.
Chat with images
Ask language models about images
Edit images
Tools for manipulating images.
Use a face to make images
Make realistic images of people instantly
Flux fine-tunes
Browse the diverse range of fine-tunes the community has custom-trained on Replicate
Generate music
Models to generate and modify music
Generate videos
Models that create and edit videos
Generate speech
Convert text to speech
Make 3D stuff
Models that generate 3D objects, scenes, radiance fields, textures and multi-views.
Get structured data
Language models that support grammar-based decoding as well as JSON Schema constraints.
Popular models
A simple OCR Model that can easily extract text from an image.
SDXL-Lightning by ByteDance: a fast text-to-image model that makes high-quality images in 4 steps
Fine-Tuned Vision Transformer (ViT) for NSFW Image Classification
A text-to-image generative AI model that creates beautiful images
Real-ESRGAN with optional face correction and adjustable upscale
Latest models
Multi-controlnet, lora loading, img2img, inpainting
Try out akx/Poro-34B-gguf, Q5_K. This is the 1000B checkpoint model.
Amphion Singing Voice Conversion: DiffWaveNetSVC
Amazing photorealism with RealVisXL_V3.0, based on SDXL, trainable
Cog implementation of mir-aidj (Taejun Kim)'s 'All-In-One Music Structure Analyzer'
Ugly Sweaters: The only garment that screams "Fashion? Never heard of it."
(Research only) IP-Adapter-FaceID can generate various style images conditioned on a face with only text prompts
DreamShaper is a general purpose SD model that aims at doing everything well, photos, art, anime, manga. It's designed to go against other general purpose models and pipelines like Midjourney and DALL-E.
DPO-SDXL Canny controlnet with LoRA support.
DreamShaper is a general purpose SD model that aims at doing everything well, photos, art, anime, manga. It's designed to match Midjourney and DALL-E.
Direct Preference Optimization (DPO) is a method to align diffusion models to human preferences by directly optimizing on human comparison data
FacebookResearch/SeamlessM4T v2 - Massively Multilingual & Multimodal Machine Translation
The "Overall Best Performing Open Source 7B Model" for Coding + Generalization or Mathematical Reasoning
A model trained on images of United Therapeutics CEO Dr. Martine Rothblatt
Source: kaist-ai/prometheus-13b-v1.0 ✦ Quant: TheBloke/prometheus-13B-v1.0-AWQ ✦ An alternative to GPT-4 when evaluating LLMs & Reward models for RLHF
Source: OpenBuddy/openbuddy-zephyr-7b-v14.1 ✦ Quant: TheBloke/openbuddy-zephyr-7B-v14.1-AWQ ✦ Open Multilingual Chatbot
AnimateDiff v3 + SparseCtrl: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning. Created with Shimmer.
Blue Pencil XL v2 Model (Text2Img, Img2Img and Inpainting)
SDXL image generation using ComfyUI with a LoRA trained using the DreamBooth method.
Source: upstage/SOLAR-10.7B-Instruct-v1.0 ✦ Quant: TheBloke/SOLAR-10.7B-Instruct-v1.0-AWQ ✦ Elevating Performance with Upstage Depth UP Scaling!
MusicGen Small fine-tuned on Bollywood-style flute covers
SDXL fine-tune to generate images of people in Germain's drawing style
AI-driven audio enhancement for your audio files, powered by Resemble AI
Zero-shot speech synthesizer for text-to-speech and voice conversion
A quantized 34B parameter language model from Phind for code completion
LLMs with open-source code snippets for generating low-bias and high-quality instruction data for code.
Creates Camille5. As Camille, a speculative fabulation engine within the GPT-Plus framework, my functionality is deeply intertwined with the visionary work of Donna Haraway, particularly her "Camille Stories" from "Staying with the Trouble."