Readme
Gemini 3 Flash
Frontier intelligence, built for speed.
Gemini 3 Flash is the newest member of the Gemini 3 model family, delivering Pro-grade reasoning, multimodal understanding, and agentic capabilities at Flash-level latency and cost. It brings next-generation intelligence to everyday workflows — fast enough for interactive applications, powerful enough for complex reasoning, and efficient enough to run at massive scale.
Overview
Gemini 3 Flash combines the best of both worlds:
- Frontier-level intelligence from Gemini 3 Pro
- Ultra-low latency and efficiency from the Flash series
- Significantly lower cost without sacrificing quality
Since the launch of Gemini 3, developers and users have processed over 1 trillion tokens per day across the API, using the models to: - Build and vibe-code simulations - Design interactive games and UIs - Reason over text, images, audio, and video - Power agentic workflows and production systems
Gemini 3 Flash makes these capabilities faster and more accessible than ever.
Key Capabilities
⚡ Speed & Efficiency
- 3× faster than Gemini 2.5 Pro (Artificial Analysis benchmarks)
- Uses ~30% fewer tokens on average than Gemini 2.5 Pro for everyday tasks
- Dynamically modulates reasoning depth based on task complexity
🧠 Frontier Reasoning
- PhD-level performance on challenging benchmarks:
- GPQA Diamond: 90.4%
- Humanity’s Last Exam: 33.7% (without tools)
- MMMU Pro: 81.2% (state-of-the-art, comparable to Gemini 3 Pro)
- Outperforms Gemini 2.5 Pro across many academic, scientific, and reasoning tasks
🧩 Multimodal by Default
- Native understanding of text, images, audio, and video
- Strong performance in:
- Visual question answering
- Video analysis
- Image captioning and UI interpretation
- Audio understanding and summarization
🤖 Agentic & Coding Excellence
- Designed for high-frequency, iterative workflows
- 78% on SWE-bench Verified, outperforming:
- Gemini 2.5 series
- Gemini 3 Pro
- Ideal for:
- Coding agents
- Tool-using systems
- Real-time assistants
- Production-ready applications
Performance at Scale
Gemini 3 Flash pushes the Pareto frontier of quality vs. cost vs. speed.
- Delivers frontier reasoning at Flash latency
- Rivals much larger and more expensive frontier models
- Optimized for both interactive use and large-scale deployment
Performance is measured using LMArena Elo Score, where Gemini 3 Flash sits on the leading edge alongside Gemini 3 Pro.
Pricing
Gemini 3 Flash is built to be affordable at scale:
| Token Type | Price |
|---|---|
| Input tokens | $0.50 / 1M |
| Output tokens | $3.00 / 1M |
| Audio input | $1.00 / 1M |
Availability
Gemini 3 Flash is rolling out globally and is available across the Google ecosystem:
For Developers
- Gemini API (Google AI Studio)
- Gemini CLI
- Google Antigravity (agentic development platform)
For Enterprises
- Vertex AI
- Gemini Enterprise
For Everyone
- Default model in the Gemini app (replacing Gemini 2.5 Flash)
- Rolling out as the default model in AI Mode in Search
Use Cases
Developers
- Agentic coding systems
- Interactive apps and games
- Real-time A/B testing and UI generation
- Multimodal data extraction and analysis
- Video and image understanding pipelines
Consumers
- Understanding videos, images, and audio instantly
- Sketch-to-recognition while drawing
- Voice-driven app creation with no prior coding
- Personalized quizzes, plans, and explanations
Enterprises
Companies like JetBrains, Bridgewater, Figma, Cursor, Warp, Harvey, Replit, and more are already using Gemini 3 Flash to power fast, intelligent, production-ready systems.
Why Gemini 3 Flash?
Gemini 3 Flash proves that speed, scale, and intelligence don’t have to be trade-offs.
It’s the model you choose when you need: - Frontier reasoning - Multimodal understanding - Agentic performance - Lightning-fast responses - Cost-efficient deployment at scale
Get Started
Gemini 3 Flash is available today.
Start building with frontier intelligence — at Flash speed.