Official

openai / o4-mini

OpenAI's fast, lightweight reasoning model (Updated 4 days, 5 hours ago)

  • Public
  • 4.6K runs
  • Priced by multiple properties
  • Commercial use
  • License
Iterate in playground

Input

string
Shift + Return to add a new line

The prompt to send to the model. Do not use if using messages.

string
Shift + Return to add a new line

System prompt to set the assistant's behavior

file[]

List of images to send to the model

Default: []

string

Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

Default: "medium"

integer

Maximum number of completion tokens to generate

Default: 4096

Output

Let us encode the main theme in purely mathematical terms. 1. Basic objects • Let t ∈ [0,T] be “narrative time.” • Let R(t) ∈ ℝ₊ be Achilles’ wrath‐intensity at time t. • Let D(t) ∈ ℝ₊ be the cumulative death‐count (“destruction”) up to time t. • Let H(t) ∈ ℝ₊ be Achilles’ honor (kleos) at time t. • Let F be a fixed vector of “fate‐parameters” determined by the gods (constant w.r.t. t). 2. Governing relations (i) dD/dt = α⋅R(t) + β⋅G(t) – here G(t) models discrete divine interventions; α,β>0. (ii) dH/dt = γ⋅R(t) – δ⋅D(t) – honor rises with wrath but is eroded by mounting destruction; γ,δ>0. (iii) R(0)=R₀>0, H(0)=H₀≥0, D(0)=0, F fixed. 3. Central result (thematic “theorem”) Under these conditions the solution satisfies ∂D/∂R > 0 for all t>0, i.e. D(t) is a strictly increasing function of R(t). 4. Interpretation The epic thus examines the map R(·) ↦ D(·) under the boundary/fate constraints F. Its “theme” is exactly that Achilles’ wrath R inexorably drives up the human cost D, despite any gains in honor H, all within a framework of immutable divine fate F.
Generated in
Input tokens
26
Output tokens
2.2K
Tokens per second
113.61 tokens / second
Time to first token

Pricing

Model pricing for openai/o4-mini. Looking for volume pricing? Get in touch.

$4
per million output tokens

or 250,000 tokens for $1

$1
per million input tokens

or 1,000,000 tokens for $1

Official models are always on, maintained, and have predictable pricing. Learn more.

Check out our docs for more information about how pricing works on Replicate.

Readme

OpenAI o4-mini is a fast, cost-efficient reasoning model designed for high-throughput tasks that benefit from advanced tool use, multimodal input, and strong analytical performance. It represents a major upgrade in the o-series line, offering high accuracy in math, coding, and visual tasks—all while maintaining low latency and usage cost.

Key Features

  • Optimized for math, code, and visual reasoning
  • Agentic tool use: able to use browsing, Python, and image generation tools within ChatGPT or via the API
  • Natural and conversational, with improved instruction following and memory
  • Ideal for applications requiring quick, reliable reasoning at scale

Benchmark Performance

  • AIME 2025 (no tools): 92.7%
  • AIME 2025 (with tools): 100% consensus@8
  • GPQA Diamond: 81.4%
  • SWE-Bench Verified: 68.1%
  • MMMU (Multimodal understanding): 81.6%
  • MathVista (Visual math): 72.0%
  • Scale MultiChallenge (instruction following): 43%
  • Humanity’s Last Exam (deep research with tools): 26.6%

Ideal Use Cases

  • Real-time assistant applications requiring light compute
  • Structured reasoning with tool use (web browsing, Python)
  • Lightweight document analysis or visual interpretation
  • High-throughput workflows with tight latency budgets

Access and Usage

  • Available in the Chat Completions API and Responses API
  • Supported in ChatGPT (Free: Think mode) and ChatGPT Team, Pro, Enterprise, and Edu
  • Accessible via function calling and tool integration in custom applications