

prunaai / z-image-turbo
Z-Image Turbo is a very fast 6B-parameter text-to-image model developed by Tongyi-MAI.
41.7M runs


krthr / clip-embeddings
Generate CLIP (clip-vit-large-patch14) text & image embeddings
53.8M runs


jaaari / kokoro-82m
Kokoro v1.0 - text-to-speech (82M params, based on StyleTTS2)
91.4M runs


prunaai / p-image-edit
A sub-1-second, $0.01 multi-image editing model built for production use cases. For image generation, check out p-image here: https://replicate.com/prunaai/p-image
30.1M runs
Alibaba's Happy Horse 1.0 generates videos from text prompts or animates a single image into video. Supports 720p and 1080p, 3-15 second durations, and five aspect ratios.
6.9K runs

openai / gpt-image-2
OpenAI's state-of-the-art image generation model. Create and edit images from text with strong instruction following, sharp text rendering, and detailed editing.
2M runs

Anthropic's most capable model with a step-change improvement in agentic coding, better vision, and stronger multi-step reasoning
23.7K runs

Google's fast, expressive text-to-speech model with 30 voices and support for 70+ languages
59.4K runs

Generate full-length songs or instrumentals from a text prompt, with optional auto-generated lyrics
4.9K runs

bytedance / seedance-2.0
ByteDance's multimodal video generation model with native audio, multimodal reference inputs, and intelligent duration control.
229.3K runs

Google's cost-efficient video generation model with native audio, optimized for high-volume applications
25.7K runs
prunaai / p-video-avatar
p-video-avatar is the fastest and cheapest avatar/lipsync video model on the market.
28.2K runs

bytedance / seedream-5-lite
Seedream 5.0 lite: image generation with built-in reasoning, example-based editing, and deep domain knowledge
1.9M runs
Generate videos using xAI's Grok Imagine Video model
804.3K runs

The highest fidelity image model from Black Forest Labs
2.1M runs

Google's fast image generation model with conversational editing, multi-image fusion, and character consistency
8.9M runs
Official models are always on, maintained, and have predictable pricing.


xAI's higher-quality image model with sharper details, better text rendering, and 2k output

Transcribe speech with ElevenLabs Scribe v2. 90+ languages, word-level timestamps, speaker diarization for up to 32 speakers, audio event tagging, and keyterm biasing. Files up to 3 GB and 10 hours.

Most expressive text-to-speech model from Inworld, with natural-language steering, real-time latency, and multilingual support across 100+ languages.

The first creative upscaler that preserves identity. Stunning photorealistic results, realistic skin, and full creative control.

Convert text to natural-sounding speech with xAI's Grok TTS. 5 voices, 20 languages, expressive speech tags, and high-fidelity MP3 / WAV / telephony audio output.

Transcribe audio to text with xAI's Grok. Handles 25 languages, word-level timestamps, speaker diarization, multichannel audio, and files up to 500 MB.

Granite Speech 4.1 2B is a compact and efficient speech-language model, specifically designed for multilingual automatic speech recognition (ASR) and bidirectional automatic speech translation (AST) for English, French, German, Spanish, Portuguese and Jap

Granite-embedding-small-english-r2 is a 47M parameter dense biencoder embedding model from the Granite Embeddings collection that can be used to generate high quality text embeddings.

Granite-4.1-8B is an 8B-parameter long-context instruct model finetuned from Granite-4.1-8B-Base using a combination of permissively licensed open source instruction datasets and internally collected synthetic datasets.
PixVerse's flagship video generation model. Generate cinematic videos with synchronized audio, multi-shot sequences, and precise camera control.

Moonshot AI's frontier open model, built for long-horizon coding, agent swarms, and autonomous software engineering. 1 trillion parameters, 262k context window, vision and tool use.


Rig any 3D bipedal character mesh
Use AI to generate images & photos with an API
Use AI to understand, describe, and caption videos with an API
Use AI for text-to-speech or to clone your voice via API
Use AI to generate images from a face with an API
Use AI to generate videos with an API
Use AI to upscale and enhance images with an API
Use AI to generate music with an API
Use AI to edit any image via API
Use AI to transcribe speech to text with an API
Use AI for optical character recognition (OCR) to extract text from images via API
Use AI to remove backgrounds from images and videos with an API
FLUX AI models by Black Forest Labs: image generation & editing via API
Use AI to restore images via API
Use AI to upscale, restore, extend, and enhance videos with an API
Detect NSFW content in images and text
Classify text by sentiment, topic, intent, or safety
Identify speakers from audio and video inputs
Replace faces across images with natural-looking results.
Transform rough sketches into polished visuals
Generate custom emojis from text or images
Create anime-style characters, scenes, and animations
Use AI to generate videos from images with an API
Chat with images — visual Q&A, analysis, and reasoning via API
Use AI to generate captions and descriptions from images with an API
Use AI to edit, restyle, extend, and remix videos with an API
WAN family of models: open-source video, image, and audio generation
Generate 3D objects, meshes, and textures from text or images with an API
Official models are always on, predictably priced, and have a stable API.
Explore Large Language Models (LLMs) for chat, generation & NLP tasks via API
Try AI Models for free: video generation, image generation, upscaling, and photo restoration
Use AI to generate lipsync videos with an API
Use AI to control image generation with an API
Embedding models for AI search and analysis
Use AI object detection and segmentation models to distinguish objects in images & videos
Flux fine-tunes: build and run custom AI image models via API
Kontext fine-tunes: Build custom AI image models with an API
Create songs with voice cloning models via API
AI media utilities: auto-caption, watermark, frame extraction & more via API
Browse the diverse range of qwen-image fine-tunes the community has custom-trained on Replicate.


alpercsv / aegean-villa
Photorealistic Aegean and Mediterranean architecture — villas, interiors, and details with consistent stylistic DNA. Trigger: AEGEANVILLA
18 runs


amxstudio-stack / mein-portrait-stil
7 runs


recraft-ai / recraft-v4.1-utility-pro
A faster, lighter Recraft image generation model at ~2048px resolution, optimized for high-volume production. Design taste and prompt accuracy at high resolution with better throughput.
6 runs


recraft-ai / recraft-v4.1-utility
A faster, lighter Recraft image generation model optimized for high-volume and production pipelines. Same design taste as V4.1, built for speed and throughput.
38 runs

recraft-ai / recraft-v4.1-pro-svg
Generate detailed SVG vector graphics from text prompts. Recraft V4.1 Pro's design taste with more geometric detail and finer paths — clean layers, editable output, and scalable to any size.
10 runs

recraft-ai / recraft-v4.1-svg
Generate production-ready SVG vector images from text prompts. Recraft V4.1's design taste applied to vector output — clean geometry, structured layers, and editable paths.
21 runs


recraft-ai / recraft-v4.1-pro
Recraft's latest image generation model at ~2048px resolution. Same design taste and prompt accuracy as V4.1, with higher resolution for print-ready and large-scale work.
95 runs


recraft-ai / recraft-v4.1
Recraft's latest image generation model, built around design taste. Strong prompt accuracy, art-directed composition, and integrated text rendering. Fast and cost-efficient at standard resolution.
545 runs


aj718 / perkins-alexis
28 runs


spuuntries / dfk3-queeree
12 runs


syncodeofficial / noah-biblical-craft
26 runs

lucataco / motif-video
Motif-Video-2B: a 2B-parameter text-to-video diffusion transformer
38 runs