Readme

Gemini 3 Flash

Frontier intelligence, built for speed.

Gemini 3 Flash is the newest member of the Gemini 3 model family, delivering Pro-grade reasoning, multimodal understanding, and agentic capabilities at Flash-level latency and cost. It brings next-generation intelligence to everyday workflows — fast enough for interactive applications, powerful enough for complex reasoning, and efficient enough to run at massive scale.

Overview

Gemini 3 Flash combines the best of both worlds:

Frontier-level intelligence from Gemini 3 Pro
Ultra-low latency and efficiency from the Flash series
Significantly lower cost without sacrificing quality

Since the launch of Gemini 3, developers and users have processed over 1 trillion tokens per day across the API, using the models to: - Build and vibe-code simulations - Design interactive games and UIs - Reason over text, images, audio, and video - Power agentic workflows and production systems

Gemini 3 Flash makes these capabilities faster and more accessible than ever.

Key Capabilities

⚡ Speed & Efficiency

3× faster than Gemini 2.5 Pro (Artificial Analysis benchmarks)
Uses ~30% fewer tokens on average than Gemini 2.5 Pro for everyday tasks
Dynamically modulates reasoning depth based on task complexity

🧠 Frontier Reasoning

PhD-level performance on challenging benchmarks:
GPQA Diamond: 90.4%
Humanity’s Last Exam: 33.7% (without tools)
MMMU Pro: 81.2% (state-of-the-art, comparable to Gemini 3 Pro)
Outperforms Gemini 2.5 Pro across many academic, scientific, and reasoning tasks

🧩 Multimodal by Default

Native understanding of text, images, audio, and video
Strong performance in:
Visual question answering
Video analysis
Image captioning and UI interpretation
Audio understanding and summarization

🤖 Agentic & Coding Excellence

Designed for high-frequency, iterative workflows
78% on SWE-bench Verified, outperforming:
Gemini 2.5 series
Gemini 3 Pro
Ideal for:
Coding agents
Tool-using systems
Real-time assistants
Production-ready applications

Performance at Scale

Gemini 3 Flash pushes the Pareto frontier of quality vs. cost vs. speed.

Delivers frontier reasoning at Flash latency
Rivals much larger and more expensive frontier models
Optimized for both interactive use and large-scale deployment

Performance is measured using LMArena Elo Score, where Gemini 3 Flash sits on the leading edge alongside Gemini 3 Pro.

Pricing

Gemini 3 Flash is built to be affordable at scale:

Token Type	Price
Input tokens	$0.50 / 1M
Output tokens	$3.00 / 1M
Audio input	$1.00 / 1M

Availability

Gemini 3 Flash is rolling out globally and is available across the Google ecosystem:

For Developers

Gemini API (Google AI Studio)
Gemini CLI
Google Antigravity (agentic development platform)

For Enterprises

Vertex AI
Gemini Enterprise

For Everyone

Default model in the Gemini app (replacing Gemini 2.5 Flash)
Rolling out as the default model in AI Mode in Search

Use Cases

Developers

Agentic coding systems
Interactive apps and games
Real-time A/B testing and UI generation
Multimodal data extraction and analysis
Video and image understanding pipelines

Consumers

Understanding videos, images, and audio instantly
Sketch-to-recognition while drawing
Voice-driven app creation with no prior coding
Personalized quizzes, plans, and explanations

Enterprises

Companies like JetBrains, Bridgewater, Figma, Cursor, Warp, Harvey, Replit, and more are already using Gemini 3 Flash to power fast, intelligent, production-ready systems.

Why Gemini 3 Flash?

Gemini 3 Flash proves that speed, scale, and intelligence don’t have to be trade-offs.

It’s the model you choose when you need: - Frontier reasoning - Multimodal understanding - Agentic performance - Lightning-fast responses - Cost-efficient deployment at scale

Get Started

Gemini 3 Flash is available today.
Start building with frontier intelligence — at Flash speed.

Model created 6 months, 1 week ago

Model updated 1 month, 2 weeks ago