openai/whisper
Convert speech in audio to text
131.3M runs
andreasjansson/clip-features
Return CLIP features for the clip-vit-large-patch14 model
110.3M runs
jaaari/kokoro-82m
Kokoro v1.0 - text-to-speech (82M params, based on StyleTTS2)
48.5M runs
turian/insanely-fast-whisper-with-video
whisper-large-v3, incredibly fast, with video transcription
6.2M runs
tencent/hunyuan-image-3
A powerful native multimodal model for image generation (PrunaAI squeezed)
669 runs
ibm-granite/granite-4.0-h-small
Granite-4.0-H-Small is a 32B parameter long-context instruct model finetuned from Granite-4.0-H-Small-Base using a combination of open source instruction datasets with permissive licenses and internally collected synthetic datasets.
1.4K runs
openai/gpt-5-pro
The smartest, fastest, most useful model yet, with built-in thinking that puts expert-level intelligence in everyone’s hands
28 runs
openai/sora-2
OpenAI's flagship video generation with synced audio
4.5K runs
character-ai/ovi-i2v
Ovi: generate videos with audio from image and text inputs
924 runs
google/nano-banana
Google's latest image editing model in Gemini 2.5
16.2M runs
leonardoai/lucid-origin
Artistic and high-quality visuals with improved prompt adherence, diversity, and definition
54K runs
bytedance/seedream-4
Unified text-to-image generation and precise single-sentence editing at up to 4K resolution
2.1M runs
anthropic/claude-4.5-sonnet
Claude Sonnet 4.5 is the best coding model to date, with significant improvements across the entire development lifecycle
3K runs
wan-video/wan-2.5-t2v
Alibaba Wan 2.5 text to video generation model
5.9K runs
pixverse/pixverse-v5
Create 5s-8s videos with enhanced character movement, visual effects, and exclusive 1080p-8s support. Optimized for anime characters and complex actions
109.1K runs
qwen/qwen-image-edit-plus
The latest iteration of Qwen-Image, with improved multi-image editing, single-image consistency, and native support for ControlNet
656.2K runs
Official models are always on, maintained, and have predictable pricing.
A low cost and fast version of Hailuo 02. Generate 6s and 10s videos in 512p
OpenAI's most advanced synced-audio video generation
A cost-efficient version of GPT Image 1
Create a series of portrait photos from a single image
A text-to-image model that generates high-resolution images with fine details. It supports various artistic styles and produces diverse outputs from the same prompt, thanks to Query-Key Normalization.
Fast, high quality text-to-video and image-to-video (Also known as Dream Machine)
Affordable and fast vector images
Creative Upscale focuses on enhancing details and refining complex elements in the image. It doesn’t just increase resolution but adds depth by improving textures, fine details, and facial features.
Affordable and fast images
Recraft V3 (code-named red_panda) is a text-to-image model that can render long passages of text and generate images in a wide range of styles. As of today, it is state of the art in image generation, according to the Text-to-Image Benchmark by Artificial Analysis
Designed to make images sharper and cleaner, Crisp Upscale increases overall quality, making visuals suitable for web use or print-ready materials.
Automated background removal for images. Tuned for AI-generated content, product photos, portraits, and design workflows
Convert raster images to high-quality SVG format with precision and clean vector paths, perfect for logos, icons, and scalable graphics.
Recraft V3 SVG (code-named red_panda) is a text-to-image model that can generate high-quality SVG images, including logotypes and icons. The model supports a wide range of styles.
A faster and cheaper Imagen 3 model, for when price or speed matters more than final image quality
Use AI to generate images and photos with an API
Use AI to caption videos with an API
Convert text to speech
Make realistic images of people instantly
Use AI to generate videos with an API
Upscaling models that create high-quality images from low-quality images
Use AI to generate music with an API
Use AI to edit any image with an API
Models that convert speech to text
Optical character recognition (OCR) and text extraction
Models that remove backgrounds from images and videos
The FLUX family of text-to-image models from Black Forest Labs
Models that improve or restore images by deblurring, colorization, and removing noise
Upscaling models that create high-quality video from low-quality videos
Use AI to generate videos from images with an API
Use AI to lipsync videos with an API
Browse the diverse range of fine-tunes the community has custom-trained on Replicate
Browse the diverse range of qwen-image fine-tunes the community has custom-trained on Replicate
Models that can understand and generate text
Toolbelt-type models for videos and images.
Use AI to caption images with an API
Generate videos with Wan, the fastest and highest quality open-source video generation model.
Ask language models about images
Models that generate 3D objects, scenes, radiance fields, textures and multi-views.
Guide image generation with more than just text. Use edge detection, depth maps, and sketches to get the results you want.
Voice-to-voice cloning and musical prosody
Models that generate embeddings from inputs
Get started with these models without adding a credit card. Whether you're making videos, generating images, or upscaling photos, these are great starting points.
Models that detect or segment objects in images and videos.
lucataco/neutts-air
Super-realistic TTS speech language model with instant voice cloning
13 runs
ahmedaldakheel/html2pic
A simple, fast API that converts any HTML/CSS string into a high-quality PNG or JPEG image with customizable dimensions.
9 runs
vufinder/vggt-1b-point
Feed-forward neural network that directly infers all key 3D attributes of a scene.
3 runs
vufinder/vggt-1b-depth
Feed-forward neural network that directly infers all key 3D attributes of a scene.
4 runs
vufinder/vggt-1b
Feed-forward neural network that directly infers all key 3D attributes of a scene.
31 runs
enhance-replicate/flix_qween_2.0.0
36 runs
aihilums/sehatsanjha
36.5K runs
enterval/humanizer-no-thinking-2
Converts AI-generated text to sound more natural and human-like. Based on Surfer/humanizer-no-thinking-2 GGUF model (Q8_0 quantization).
5 runs
regapruts/somhi-flux
A model trained on authentic 1980s photos to generate vintage-style portraits with film grain, soft lighting, and nostalgic aesthetics. Use SPMRR and "1980 style" for better results.
78 runs