These large language models understand and generate natural language. They power chatbots, search engines, writing aids, and more.
Use these for:
Language models keep getting bigger and better at these tasks. The largest models today exhibit impressive reasoning skills. But you can get great results from smaller, faster, cheaper models too.
Featured models

Google’s hybrid “thinking” AI model optimized for speed and cost-efficiency
Updated 1 week, 4 days ago
162.9K runs

Claude Sonnet 4.5 is the best coding model to date, with significant improvements across the entire development lifecycle
Updated 2 months, 1 week ago
282.1K runs

openai/gpt-5
OpenAI's new model excelling at coding, writing, and reasoning.
Updated 2 months, 2 weeks ago
704.4K runs
Recommended Models
If you want speed and low latency, google/gemini-2.5-flash and openai/gpt-5-nano are strong choices. Both are designed for fast responses and lower compute use while keeping good reasoning quality.
For conversational tasks at scale, anthropic/claude-4.5-haiku also offers quick turnarounds with solid performance.
openai/gpt-5-mini and anthropic/claude-4.5-sonnet both deliver high-quality writing, summarization, and reasoning at a manageable cost.
If you want strong reasoning without high overhead, deepseek-ai/deepseek-r1 and meta/meta-llama-3.1-405b-instruct offer impressive results for their size.
For natural dialogue and chatbots, openai/gpt-5, anthropic/claude-4.5-sonnet, and google/gemini-2.5-flash are all reliable.
They handle multi-turn conversations, context retention, and friendly tone well. Smaller variants like openai/gpt-5-nano or anthropic/claude-4.5-haiku are ideal for lighter-weight chat assistants.
anthropic/claude-4.5-sonnet and deepseek-ai/deepseek-r1 are tuned for structured reasoning, code generation, and debugging support.
openai/gpt-5 also performs well for both natural language and code reasoning tasks, especially in multi-step logic or problem-solving scenarios.
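The recommendations above can be condensed into a small lookup table. This is only a sketch: the category keys and the `pick_models` helper are hypothetical names introduced for illustration, while the model identifiers are the ones named in the text.

```python
# Hypothetical use-case keys mapped to the models recommended in the text above.
RECOMMENDATIONS = {
    "low_latency": [
        "google/gemini-2.5-flash",
        "openai/gpt-5-nano",
        "anthropic/claude-4.5-haiku",
    ],
    "balanced": ["openai/gpt-5-mini", "anthropic/claude-4.5-sonnet"],
    "reasoning": [
        "deepseek-ai/deepseek-r1",
        "meta/meta-llama-3.1-405b-instruct",
    ],
    "chat": [
        "openai/gpt-5",
        "anthropic/claude-4.5-sonnet",
        "google/gemini-2.5-flash",
    ],
    "light_chat": ["openai/gpt-5-nano", "anthropic/claude-4.5-haiku"],
    "code": [
        "anthropic/claude-4.5-sonnet",
        "deepseek-ai/deepseek-r1",
        "openai/gpt-5",
    ],
}


def pick_models(use_case: str) -> list[str]:
    """Return the recommended models for a use case, in the order listed."""
    try:
        # Return a copy so callers cannot mutate the shared table.
        return list(RECOMMENDATIONS[use_case])
    except KeyError:
        known = ", ".join(sorted(RECOMMENDATIONS))
        raise ValueError(f"unknown use case {use_case!r}; expected one of: {known}") from None
```

A call such as `pick_models("low_latency")` returns the three fast models in the order given; an unrecognized key raises `ValueError` rather than silently falling back to a default.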
Large language models differ mainly by scale, tuning, and purpose.
These models output natural language text, often in conversational or structured formats.
They can generate, summarize, translate, or explain information, and some also handle light reasoning, analysis, or code generation.
Several models in this collection are open-weight and can be self-hosted, such as meta/meta-llama-3.1-405b-instruct or openai/gpt-oss-120b.
To publish your own model on Replicate, package it with Cog: a cog.yaml file declares the environment and a predictor class defines the input and output fields. Then push it to your account to run on managed GPUs.
Many of these models are available for commercial use, depending on their license. Always review the License section of each model page before deployment, as some require attribution or restrict redistribution.
You can run them directly on Replicate by providing a text prompt in the model’s playground or via API.
For example, type a question or instruction and receive a natural language response. Some models, like google/gemini-2.5-flash or openai/gpt-4o, may also accept image or multimodal inputs depending on version.
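As a rough illustration of the API route, the sketch below builds a prediction request using only the Python standard library. It assumes the Replicate HTTP endpoint `POST /v1/models/{owner}/{name}/predictions`, bearer-token authentication via a `REPLICATE_API_TOKEN` environment variable, and an input field called `prompt`. Input names vary by model, so check the model's API schema before relying on this. The helper name `build_prediction_request` is hypothetical.

```python
import json
import os
import urllib.request

API_ROOT = "https://api.replicate.com/v1"


def build_prediction_request(model: str, prompt: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request that runs `model` on a text prompt."""
    owner, name = model.split("/", 1)
    # Assumption: the model accepts a single text input named "prompt".
    body = json.dumps({"input": {"prompt": prompt}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_ROOT}/models/{owner}/{name}/predictions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            # Assumption: asks the API to hold the connection until output is ready.
            "Prefer": "wait",
        },
    )


# Only contact the network when run as a script with a token configured.
if __name__ == "__main__" and os.environ.get("REPLICATE_API_TOKEN"):
    request = build_prediction_request(
        "openai/gpt-5-nano",
        "Explain what a language model is in one sentence.",
        os.environ["REPLICATE_API_TOKEN"],
    )
    with urllib.request.urlopen(request) as response:
        prediction = json.load(response)
    output = prediction.get("output")
    # Language models often return a list of text chunks; join them if so.
    print("".join(output) if isinstance(output, list) else output)
```

Separating request construction from sending keeps the payload easy to inspect and test without an API token. Replicate's official client libraries wrap the same call if you prefer not to handle HTTP directly.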
Recommended Models

openai/gpt-oss-120b
120b open-weight language model from OpenAI
Updated 3 weeks, 1 day ago
130.2K runs

deepseek-ai/deepseek-v3.1
Latest hybrid thinking model from DeepSeek
Updated 3 weeks, 2 days ago
136.6K runs

Grok 4 is xAI’s most advanced reasoning model. Excels at logical thinking and in-depth analysis. Ideal for insightful discussions and complex problem-solving.
Updated 3 weeks, 2 days ago
12.7K runs

deepseek-ai/deepseek-r1
A reasoning model trained with reinforcement learning, on par with OpenAI o1
Updated 3 weeks, 2 days ago
2.1M runs

meta/meta-llama-3.1-405b-instruct
Meta's flagship 405 billion parameter language model, fine-tuned for chat completions
Updated 3 weeks, 2 days ago
6.8M runs

openai/gpt-oss-20b
20b open-weight language model from OpenAI
Updated 3 weeks, 3 days ago
78K runs

Claude Haiku 4.5 gives you similar levels of coding performance but at one-third the cost and more than twice the speed
Updated 1 month, 2 weeks ago
25.3K runs

openai/gpt-5-nano
Fastest, most cost-effective GPT-5 model from OpenAI
Updated 2 months, 1 week ago
2.5M runs

ibm-granite/granite-3.3-8b-instruct
Granite-3.3-8B-Instruct is an 8-billion-parameter language model with a 128K context length, fine-tuned for improved reasoning and instruction-following capabilities.
Updated 2 months, 2 weeks ago
1.6M runs

openai/gpt-5-mini
Faster version of OpenAI's flagship GPT-5 model
Updated 2 months, 2 weeks ago
455.1K runs

openai/gpt-4.1
OpenAI's flagship GPT model for complex tasks.
Updated 2 months, 2 weeks ago
255.9K runs

openai/gpt-4.1-nano
Fastest, most cost-effective GPT-4.1 model from OpenAI
Updated 2 months, 2 weeks ago
597.6K runs

openai/gpt-4.1-mini
Fast, affordable version of GPT-4.1
Updated 2 months, 2 weeks ago
1.3M runs

openai/gpt-4o
OpenAI's high-intelligence chat model
Updated 2 months, 3 weeks ago
318K runs

openai/o4-mini
OpenAI's fast, lightweight reasoning model
Updated 3 months, 3 weeks ago
365.6K runs

openai/o1-mini
A small model alternative to o1
Updated 3 months, 3 weeks ago
3.2K runs

openai/o1
OpenAI's first o-series reasoning model
Updated 3 months, 3 weeks ago
16.1K runs

openai/gpt-4o-mini
Low latency, low cost version of OpenAI's GPT-4o model
Updated 3 months, 3 weeks ago
6.3M runs

Updated Qwen3 model for instruction following
Updated 4 months ago
130K runs

Claude Sonnet 4 is a significant upgrade to 3.7, delivering superior coding and reasoning while responding more precisely to your instructions
Updated 5 months, 3 weeks ago
1.2M runs

deepseek-ai/deepseek-v3
DeepSeek-V3-0324 is the leading non-reasoning model, a milestone for open source
Updated 8 months, 1 week ago
4.2M runs

The most intelligent Claude model and the first hybrid reasoning model on the market (claude-3-7-sonnet-20250219)
Updated 9 months, 1 week ago
3.3M runs

Anthropic's fastest, most cost-effective model, with a 200K token context window (claude-3-5-haiku-20241022)
Updated 9 months, 3 weeks ago
2.8M runs

Anthropic's most intelligent language model to date, with a 200K token context window and image understanding (claude-3-5-sonnet-20241022)
Updated 9 months, 3 weeks ago
595.5K runs

yorickvp/llava-13b
Visual instruction tuning towards large language and vision models with GPT-4 level capabilities
Updated 1 year, 4 months ago
32.8M runs

meta/meta-llama-3-70b
Base version of Llama 3, a 70 billion parameter language model from Meta.
Updated 1 year, 7 months ago
849.9K runs

meta/meta-llama-3-70b-instruct
A 70 billion parameter language model from Meta, fine-tuned for chat completions
Updated 1 year, 7 months ago
163.9M runs

meta/meta-llama-3-8b-instruct
An 8 billion parameter language model from Meta, fine-tuned for chat completions
Updated 1 year, 7 months ago
392.2M runs

meta/meta-llama-3-8b
Base version of Llama 3, an 8 billion parameter language model from Meta.
Updated 1 year, 7 months ago
51.1M runs

google-deepmind/gemma-2b-it
2B instruct version of Google’s Gemma model
Updated 1 year, 9 months ago
134K runs

yorickvp/llava-v1.6-vicuna-13b
LLaVA v1.6: Large Language and Vision Assistant (Vicuna-13B)
Updated 1 year, 10 months ago
3.7M runs

yorickvp/llava-v1.6-mistral-7b
LLaVA v1.6: Large Language and Vision Assistant (Mistral-7B)
Updated 1 year, 10 months ago
4.9M runs

stability-ai/stablelm-tuned-alpha-7b
7 billion parameter version of Stability AI's language model
Updated 2 years, 7 months ago
140.6K runs

replicate/flan-t5-xl
A language model by Google for tasks like classification, summarization, and more
Updated 2 years, 7 months ago
151.2K runs

replicate/llama-7b
Transformers implementation of the LLaMA language model
Updated 2 years, 8 months ago
99.4K runs