Use official models
Official models are always on, maintained, and have predictable pricing.
Recommended models

minimax / image-01
Minimax's first image model, with character reference support

topazlabs / image-upscale
Professional-grade image upscaling, from Topaz Labs

topazlabs / video-upscale
Video Upscaling from Topaz Labs

fofr / color-matcher
Color match and white balance fixes for images

ibm-granite / granite-3.3-8b-instruct
Granite-3.3-8B-Instruct is an 8-billion-parameter language model with a 128K context length, fine-tuned for improved reasoning and instruction-following capabilities.

meta / llama-4-maverick-instruct
A 17 billion parameter model with 128 experts

meta / llama-4-scout-instruct
A 17 billion parameter model with 16 experts

easel / ai-avatars
Use one or two face images to create AI avatars

black-forest-labs / flux-dev-lora
A version of flux-dev, a text-to-image model, that supports fast inference with fine-tuned LoRAs

black-forest-labs / flux-schnell-lora
The fastest image generation model tailored for fine-tuned use

black-forest-labs / flux-fill-dev
Open-weight inpainting model for editing and extending images. Guidance-distilled from FLUX.1 Fill [pro].

black-forest-labs / flux-1.1-pro-ultra
FLUX1.1 [pro] in ultra and raw modes. Images are up to 4 megapixels. Use raw mode for realism.

black-forest-labs / flux-1.1-pro
Faster, better FLUX Pro. Text-to-image model with excellent image quality, prompt adherence, and output diversity.

black-forest-labs / flux-pro
State-of-the-art image generation with top of the line prompt following, visual quality, image detail and output diversity.

black-forest-labs / flux-fill-pro
Professional inpainting and outpainting model with state-of-the-art performance. Edit or extend images with natural, seamless results.

black-forest-labs / flux-canny-pro
Professional edge-guided image generation. Control structure and composition using Canny edge detection

black-forest-labs / flux-depth-pro
Professional depth-aware image generation. Edit images while preserving spatial relationships.

wavespeedai / wan-2.1-t2v-480p
Accelerated inference for Wan 2.1 14B text to video, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.

wavespeedai / wan-2.1-t2v-720p
Accelerated inference for Wan 2.1 14B text to video with high resolution, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.

wavespeedai / wan-2.1-i2v-480p
Accelerated inference for Wan 2.1 14B image to video, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.

wavespeedai / wan-2.1-i2v-720p
Accelerated inference for Wan 2.1 14B image to video with high resolution, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.

google / veo-2
State of the art video generation model. Veo 2 can faithfully follow simple and complex instructions, and convincingly simulates real-world physics as well as a wide range of visual styles.

recraft-ai / recraft-v3
Recraft V3 (code-named red_panda) is a text-to-image model that can generate long texts and images in a wide range of styles. As of today, it is SOTA in image generation, as shown by the Text-to-Image Benchmark by Artificial Analysis.

recraft-ai / recraft-v3-svg
Recraft V3 SVG (code-named red_panda) is a text-to-image model that can generate high-quality SVG images, including logotypes and icons. The model supports a wide range of styles.

recraft-ai / recraft-20b-svg
Affordable and fast vector images

recraft-ai / recraft-20b
Affordable and fast images

kwaivgi / kling-v1.6-pro
Generate 5s and 10s videos in 1080p resolution

black-forest-labs / flux-redux-schnell
Fast, efficient image variation model for rapid iteration and experimentation.

black-forest-labs / flux-redux-dev
Open-weight image variation model. Create new versions while preserving key elements of your original.

black-forest-labs / flux-schnell
The fastest image generation model tailored for local development and personal use

black-forest-labs / flux-depth-dev
Open-weight depth-aware image generation. Edit images while preserving spatial relationships.

black-forest-labs / flux-canny-dev
Open-weight edge-guided image generation. Control structure and composition using Canny edge detection.

black-forest-labs / flux-dev
A 12 billion parameter rectified flow transformer capable of generating images from text descriptions

luma / ray-flash-2-720p
Generate 5s and 9s 720p videos, faster and cheaper than Ray 2

luma / ray-flash-2-540p
Generate 5s and 9s 540p videos, faster and cheaper than Ray 2

easel / advanced-face-swap
Face swap one or two people into a target image

ibm-granite / granite-3.2-8b-instruct

ideogram-ai / ideogram-v2a-turbo
Like Ideogram v2 turbo, but now faster and cheaper

ideogram-ai / ideogram-v2a
Like Ideogram v2, but faster and cheaper

anthropic / claude-3.7-sonnet
The most intelligent Claude model and the first hybrid reasoning model on the market (claude-3-7-sonnet-20250219)

minimax / video-01-director
Generate videos with specific camera movements

luma / ray-2-720p
Generate 5s and 9s 720p videos

luma / ray-2-540p
Generate 5s and 9s 540p videos

anthropic / claude-3.5-haiku
Anthropic's fastest, most cost-effective model, with a 200K token context window (claude-3-5-haiku-20241022)

anthropic / claude-3.5-sonnet
Anthropic's most intelligent language model to date, with a 200K token context window and image understanding (claude-3-5-sonnet-20241022)

google / imagen-3-fast
A faster and cheaper Imagen 3 model, for when price or speed are more important than final image quality

google / imagen-3
Google's highest quality text-to-image model, capable of generating images with detail, rich lighting and beauty

deepseek-ai / deepseek-r1
A reasoning model trained with reinforcement learning, on par with OpenAI o1

minimax / video-01
Generate 6s videos with prompts or images (also known as Hailuo). Use a subject reference to make a video with a character and the S2V-01 model.

recraft-ai / recraft-creative-upscale
Creative Upscale focuses on enhancing details and refining complex elements in the image. It doesn’t just increase resolution but adds depth by improving textures, fine details, and facial features.

recraft-ai / recraft-crisp-upscale
Designed to make images sharper and cleaner, Crisp Upscale increases overall quality, making visuals suitable for web use or print-ready materials.

playht / play-dialog
End-to-end AI speech model designed for natural-sounding conversational speech synthesis, with support for context-aware prosody, intonation, and emotional expression.

kwaivgi / kling-v1.6-standard
Generate 5s and 10s videos in 720p resolution

ibm-granite / granite-3.1-8b-instruct
Granite-3.1-8B-Instruct is a lightweight and open-source 8B parameter model designed to excel in instruction following tasks such as summarization, problem-solving, text translation, reasoning, code tasks, function-calling, and more.

ibm-granite / granite-3.1-2b-instruct
Granite-3.1-2B-Instruct is a lightweight and open-source 2B parameter model designed to excel in instruction following tasks such as summarization, problem-solving, text translation, reasoning, code tasks, function-calling, and more.

minimax / music-01
Quickly generate up to 1 minute of music with lyrics and vocals in the style of a reference track

minimax / video-01-live
An image-to-video (I2V) model specifically trained for Live2D and general animation use cases

luma / ray
Fast, high quality text-to-video and image-to-video (Also known as Dream Machine)

luma / photon-flash
Accelerated variant of Photon prioritizing speed while maintaining quality

luma / photon
High-quality image generation model optimized for creative professional workflows and ultra-high fidelity outputs

haiper-ai / haiper-video-2
Generate 4s and 6s videos from a prompt or image

stability-ai / stable-diffusion-3.5-medium
2.5 billion parameter image model with improved MMDiT-X architecture

stability-ai / stable-diffusion-3.5-large-turbo
A text-to-image model that generates high-resolution images with fine details. It supports various artistic styles and produces diverse outputs from the same prompt, with a focus on fewer inference steps

stability-ai / stable-diffusion-3.5-large
A text-to-image model that generates high-resolution images with fine details. It supports various artistic styles and produces diverse outputs from the same prompt, thanks to Query-Key Normalization.

ideogram-ai / ideogram-v2-turbo
A fast image model with state of the art inpainting, prompt comprehension and text rendering.

ideogram-ai / ideogram-v2
An excellent image model with state of the art inpainting, prompt comprehension and text rendering

ibm-granite / granite-3.0-8b-instruct
Granite-3.0-8B-Instruct is a lightweight and open-source 8B parameter model designed to excel in instruction following tasks such as summarization, problem-solving, text translation, reasoning, code tasks, function-calling, and more.

ibm-granite / granite-3.0-2b-instruct
Granite-3.0-2B-Instruct is a lightweight and open-source 2B parameter model designed to excel in instruction following tasks such as summarization, problem-solving, text translation, reasoning, code tasks, function-calling, and more.

ibm-granite / granite-8b-code-instruct-128k
Join the Granite community where you can find numerous recipe workbooks to help you get started with a wide variety of use cases using this model. https://github.com/ibm-granite-community

ibm-granite / granite-20b-code-instruct-8k
Join the Granite community where you can find numerous recipe workbooks to help you get started with a wide variety of use cases using this model. https://github.com/ibm-granite-community

meta / meta-llama-3.1-405b-instruct
Meta's flagship 405 billion parameter language model, fine-tuned for chat completions

stability-ai / stable-diffusion-3
A text-to-image model with greatly improved performance in image quality, typography, complex prompt understanding, and resource-efficiency

meta / meta-llama-3-70b
Base version of Llama 3, a 70 billion parameter language model from Meta.

meta / meta-llama-3-70b-instruct
A 70 billion parameter language model from Meta, fine-tuned for chat completions

meta / meta-llama-3-8b-instruct
An 8 billion parameter language model from Meta, fine-tuned for chat completions

meta / meta-llama-3-8b
Base version of Llama 3, an 8 billion parameter language model from Meta.

meta / llama-2-7b-chat
A 7 billion parameter language model from Meta, fine-tuned for chat completions

mistralai / mistral-7b-v0.1
A 7 billion parameter language model from Mistral.

meta / llama-2-70b-chat
A 70 billion parameter language model from Meta, fine-tuned for chat completions

meta / llama-2-70b
Base version of Llama 2, a 70 billion parameter language model from Meta.

meta / llama-2-13b-chat
A 13 billion parameter language model from Meta, fine-tuned for chat completions

meta / llama-2-13b
Base version of Llama 2 13B, a 13 billion parameter language model

meta / llama-2-7b
Base version of Llama 2 7B, a 7 billion parameter language model
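
Each entry above is displayed as "owner / name" with spaces around the slash. When a model is referenced in code, the identifier is usually written without the spaces, as "owner/name". The following is a minimal sketch of that normalization, assuming the listing format shown on this page; to_model_id is a hypothetical helper introduced for illustration and is not part of any official client library.

```python
def to_model_id(listing: str) -> str:
    """Turn a listing such as 'meta / llama-2-7b' into 'meta/llama-2-7b'.

    Hypothetical helper: assumes exactly one slash separating a
    non-empty owner from a non-empty model name.
    """
    owner, sep, name = listing.partition("/")
    owner, name = owner.strip(), name.strip()
    if not sep or not owner or not name or "/" in name:
        raise ValueError(f"expected 'owner / name', got {listing!r}")
    return f"{owner}/{name}"


if __name__ == "__main__":
    # Entries copied from the list above.
    for entry in ("black-forest-labs / flux-schnell", "meta / llama-2-7b"):
        print(to_model_id(entry))
```

The helper rejects entries without a slash or with an empty owner or name, so a malformed line fails loudly instead of producing a partial identifier.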