These large language models understand and generate natural language. They power chatbots, search engines, writing aids, and more.
Language models keep getting bigger and better at these tasks. The largest models today exhibit impressive reasoning skills. But you can get great results from smaller, faster, cheaper models too.
Featured models

google/gemini-2.5-flash
Google’s hybrid “thinking” AI model optimized for speed and cost-efficiency
Updated 1 week, 1 day ago
48.3K runs

anthropic/claude-4.5-sonnet
Claude Sonnet 4.5 is the best coding model to date, with significant improvements across the entire development lifecycle
Updated 1 month, 2 weeks ago
164.7K runs


openai/gpt-5
OpenAI's new model excelling at coding, writing, and reasoning.
Updated 1 month, 4 weeks ago
546K runs
Recommended Models
If you want speed and low latency, google/gemini-2.5-flash and openai/gpt-5-nano are strong choices. Both are designed for fast responses and lower compute use while keeping good reasoning quality.
For conversational tasks at scale, anthropic/claude-4.5-haiku also offers quick turnarounds with solid performance.
For a balance of quality and cost, openai/gpt-5-mini and anthropic/claude-4.5-sonnet both deliver high-quality writing, summarization, and reasoning at a manageable price.
If you want strong reasoning without high overhead, deepseek-ai/deepseek-r1 and meta/meta-llama-3.1-405b-instruct offer impressive results for their size.
For natural dialogue and chatbots, openai/gpt-5, anthropic/claude-4.5-sonnet, and google/gemini-2.5-flash are all reliable.
They handle multi-turn conversations, context retention, and friendly tone well. Smaller variants like openai/gpt-5-nano or anthropic/claude-4.5-haiku are ideal for lighter-weight chat assistants.
anthropic/claude-4.5-sonnet and deepseek-ai/deepseek-r1 are tuned for structured reasoning, code generation, and debugging support.
openai/gpt-5 also performs well for both natural language and code reasoning tasks, especially in multi-step logic or problem-solving scenarios.
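The recommendations above can be condensed into a small lookup table. This is only a sketch: the use-case labels (such as "low_latency") and the recommend helper are hypothetical names chosen for illustration, while the model identifiers are the ones named in the text.

```python
# Hypothetical lookup table. The keys are illustrative labels, not part of
# any Replicate API; the model names come from the recommendations above.
RECOMMENDATIONS = {
    "low_latency": ["google/gemini-2.5-flash", "openai/gpt-5-nano"],
    "chat_at_scale": ["anthropic/claude-4.5-haiku"],
    "balanced": ["openai/gpt-5-mini", "anthropic/claude-4.5-sonnet"],
    "reasoning": [
        "deepseek-ai/deepseek-r1",
        "meta/meta-llama-3.1-405b-instruct",
    ],
    "chatbot": [
        "openai/gpt-5",
        "anthropic/claude-4.5-sonnet",
        "google/gemini-2.5-flash",
    ],
    "lightweight_chat": ["openai/gpt-5-nano", "anthropic/claude-4.5-haiku"],
    "code": [
        "anthropic/claude-4.5-sonnet",
        "deepseek-ai/deepseek-r1",
        "openai/gpt-5",
    ],
}


def recommend(use_case: str) -> list[str]:
    """Return the models suggested for a use case, in the order listed above."""
    try:
        # Return a copy so callers cannot mutate the shared table.
        return list(RECOMMENDATIONS[use_case])
    except KeyError:
        known = ", ".join(sorted(RECOMMENDATIONS))
        raise ValueError(
            f"unknown use case {use_case!r}; expected one of: {known}"
        ) from None


if __name__ == "__main__":
    for model in recommend("low_latency"):
        print(model)
```

A table like this keeps the choice of model in one place, so swapping in a newer model later means changing a single entry.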
Large language models differ mainly by scale, tuning, and purpose.
These models output natural language text, often in conversational or structured formats.
They can generate, summarize, translate, or explain information, and some also handle light reasoning, analysis, or code generation.
Several models in this collection are open-weight and can be self-hosted, such as meta/meta-llama-3.1-405b-instruct or openai/gpt-oss-120b.
To publish your own model on Replicate, package it with Cog: a cog.yaml file describes the environment, and a predictor class defines the input and output fields. Then push it to your account to run on managed GPUs.
Many of these models are available for commercial use, depending on their license. Always review the License section of each model page before deployment, as some require attribution or restrict redistribution.
You can run them directly on Replicate by providing a text prompt in the model’s playground or via API.
For example, type a question or instruction and receive a natural language response. Some models, like google/gemini-2.5-flash or openai/gpt-4o, may also accept image or multimodal inputs depending on version.
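As a rough sketch of the API route, the snippet below builds a prediction request with Python's standard library. It assumes Replicate's HTTP endpoint for official models (POST to /v1/models/{owner}/{name}/predictions with a bearer token) and assumes the model takes its text in an input field named prompt. Field names vary by model, so check the model's API tab before relying on this. build_prediction_request is a hypothetical helper, not part of any client library.

```python
import json
import os
import urllib.request

# Assumed API root for Replicate's HTTP API.
API_ROOT = "https://api.replicate.com/v1"


def build_prediction_request(
    model: str, prompt: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request asking `model` to answer `prompt`."""
    # Assumption: the model exposes a text input called "prompt".
    body = json.dumps({"input": {"prompt": prompt}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_ROOT}/models/{model}/predictions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    token = os.environ.get("REPLICATE_API_TOKEN")
    if token:
        request = build_prediction_request(
            "openai/gpt-5-nano",
            "Summarize what a large language model is in one sentence.",
            token,
        )
        with urllib.request.urlopen(request) as response:
            prediction = json.load(response)
        # The response is a prediction object; its output may still be
        # pending, in which case the prediction has to be polled.
        print(prediction.get("status"), prediction.get("output"))
    else:
        print("Set REPLICATE_API_TOKEN to send the request.")
```

Keeping request construction separate from sending makes the payload easy to inspect before any credits are spent. Replicate's official client libraries wrap the same call and handle polling for you.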
Recommended Models


openai/gpt-oss-120b
120b open-weight language model from OpenAI
Updated 1 day, 22 hours ago
120.8K runs


deepseek-ai/deepseek-v3.1
Latest hybrid thinking model from DeepSeek
Updated 2 days, 23 hours ago
74.5K runs

xai/grok-4
Grok 4 is xAI’s most advanced reasoning model. Excels at logical thinking and in-depth analysis. Ideal for insightful discussions and complex problem-solving.
Updated 2 days, 23 hours ago
3.8K runs


deepseek-ai/deepseek-r1
A reasoning model trained with reinforcement learning, on par with OpenAI o1
Updated 2 days, 23 hours ago
2.1M runs


meta/meta-llama-3.1-405b-instruct
Meta's flagship 405 billion parameter language model, fine-tuned for chat completions
Updated 2 days, 23 hours ago
6.8M runs


openai/gpt-oss-20b
20b open-weight language model from OpenAI
Updated 3 days, 17 hours ago
63.8K runs

anthropic/claude-4.5-haiku
Claude Haiku 4.5 gives you similar levels of coding performance but at one-third the cost and more than twice the speed
Updated 1 month ago
15K runs


openai/gpt-5-nano
Fastest, most cost-effective GPT-5 model from OpenAI
Updated 1 month, 3 weeks ago
1.8M runs


ibm-granite/granite-3.3-8b-instruct
Granite-3.3-8B-Instruct is an 8-billion-parameter language model with a 128K context length, fine-tuned for improved reasoning and instruction-following capabilities.
Updated 1 month, 3 weeks ago
1.5M runs


openai/gpt-5-mini
Faster version of OpenAI's flagship GPT-5 model
Updated 1 month, 4 weeks ago
394.6K runs


openai/gpt-4.1
OpenAI's flagship GPT model for complex tasks.
Updated 1 month, 4 weeks ago
226.3K runs


openai/gpt-4.1-nano
Fastest, most cost-effective GPT-4.1 model from OpenAI
Updated 1 month, 4 weeks ago
513.8K runs


openai/gpt-4.1-mini
Fast, affordable version of GPT-4.1
Updated 1 month, 4 weeks ago
1.3M runs


openai/gpt-4o
OpenAI's high-intelligence chat model
Updated 2 months ago
269.9K runs


openai/o4-mini
OpenAI's fast, lightweight reasoning model
Updated 3 months ago
349.3K runs


openai/o1-mini
A small model alternative to o1
Updated 3 months ago
3.2K runs


openai/o1
OpenAI's first o-series reasoning model
Updated 3 months ago
15.9K runs


openai/gpt-4o-mini
Low latency, low cost version of OpenAI's GPT-4o model
Updated 3 months ago
4M runs

qwen/qwen3-235b-a22b-instruct-2507
Updated Qwen3 model for instruction following
Updated 3 months, 1 week ago
125.8K runs

moonshotai/kimi-k2-instruct
Kimi K2 achieves exceptional performance across frontier knowledge, reasoning, and coding tasks while being meticulously optimized for agentic capabilities
Updated 3 months, 1 week ago
34.7K runs

anthropic/claude-4-sonnet
Claude Sonnet 4 is a significant upgrade to 3.7, delivering superior coding and reasoning while responding more precisely to your instructions
Updated 5 months ago
1.1M runs


deepseek-ai/deepseek-v3
DeepSeek-V3-0324 is the leading non-reasoning model, a milestone for open source
Updated 7 months, 3 weeks ago
4M runs

anthropic/claude-3.7-sonnet
The most intelligent Claude model and the first hybrid reasoning model on the market (claude-3-7-sonnet-20250219)
Updated 8 months, 3 weeks ago
3.1M runs

anthropic/claude-3.5-haiku
Anthropic's fastest, most cost-effective model, with a 200K token context window (claude-3-5-haiku-20241022)
Updated 9 months ago
2.7M runs

anthropic/claude-3.5-sonnet
Anthropic's most intelligent language model to date, with a 200K token context window and image understanding (claude-3-5-sonnet-20241022)
Updated 9 months ago
588.4K runs


yorickvp/llava-13b
Visual instruction tuning towards large language and vision models with GPT-4 level capabilities
Updated 1 year, 3 months ago
32.3M runs


meta/meta-llama-3-70b
Base version of Llama 3, a 70 billion parameter language model from Meta.
Updated 1 year, 6 months ago
844.6K runs


meta/meta-llama-3-70b-instruct
A 70 billion parameter language model from Meta, fine-tuned for chat completions
Updated 1 year, 6 months ago
163.4M runs


meta/meta-llama-3-8b-instruct
An 8 billion parameter language model from Meta, fine-tuned for chat completions
Updated 1 year, 6 months ago
389.7M runs


meta/meta-llama-3-8b
Base version of Llama 3, an 8 billion parameter language model from Meta.
Updated 1 year, 6 months ago
51.1M runs


google-deepmind/gemma-2b-it
2B instruct version of Google’s Gemma model
Updated 1 year, 8 months ago
134K runs


yorickvp/llava-v1.6-vicuna-13b
LLaVA v1.6: Large Language and Vision Assistant (Vicuna-13B)
Updated 1 year, 9 months ago
3.7M runs


yorickvp/llava-v1.6-mistral-7b
LLaVA v1.6: Large Language and Vision Assistant (Mistral-7B)
Updated 1 year, 9 months ago
4.9M runs


stability-ai/stablelm-tuned-alpha-7b
7 billion parameter version of Stability AI's language model
Updated 2 years, 6 months ago
140.6K runs


replicate/flan-t5-xl
A language model by Google for tasks like classification, summarization, and more
Updated 2 years, 6 months ago
151.2K runs


replicate/llama-7b
Transformers implementation of the LLaMA language model
Updated 2 years, 8 months ago
99.4K runs