Gemini 3 Overview

Introduction

Gemini 3 marks a major milestone two years into the Gemini era. With billions of users engaging with Gemini-powered products, this release continues Google’s mission to rapidly deliver advanced AI through its full-stack approach, from infrastructure to models to products that reach the world.

What’s New in Gemini 3

Gemini 3 is Google’s most intelligent, multimodal model to date. It unifies and advances every capability from prior generations to help users learn, build, and plan anything. Key improvements include:

- State-of-the-art reasoning with deeper contextual understanding and more nuanced interpretations
- Better intent recognition, requiring fewer prompts

Performance Highlights

Gemini 3 delivers substantial improvements across reasoning, multimodality, coding, and factual accuracy:

- #1 on LMArena (1501 Elo)
- PhD-level reasoning with top scores on Humanity’s Last Exam and GPQA Diamond
- Breakthrough mathematics with state-of-the-art MathArena Apex performance
- Leading multimodal understanding with top results on MMMU-Pro and Video-MMMU
- High factual reliability, scoring 72.1% on SimpleQA Verified
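
To put the Elo figure in context, the textbook Elo formula converts a rating gap into an expected head-to-head win rate. The sketch below is illustrative only: the opponent rating of 1451 is a hypothetical value, not a reported result, and LMArena's own rating procedure may differ in detail from the classic formula used here.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the classic Elo model.

    The result is a win probability in [0, 1], with draws counted as half a win.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# 1501 is the LMArena rating quoted above.
# 1451 is a hypothetical opponent rating chosen only for illustration.
p = elo_expected_score(1501, 1451)
print(f"Expected win rate with a 50-point lead: {p:.3f}")
```

Under this model, equal ratings give an expected score of exactly 0.5, and a 400-point lead corresponds to roughly 10-to-1 odds, so small Elo gaps at the top of a leaderboard translate into modest differences in head-to-head preference.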

Gemini 3 Deep Think

Gemini 3 Deep Think is a new enhanced-reasoning mode that pushes performance even further, achieving:

- 41% on Humanity’s Last Exam
- 93.8% on GPQA Diamond
- 45.1% on ARC-AGI-2 (with code execution)

Deep Think will be released to Google AI Ultra subscribers after final safety review.

What You Can Do with Gemini 3

Learn Anything

Gemini 3’s million-token context window and multimodal reasoning enable richer understanding and personalized learning:
- Translate and preserve handwritten family recipes
- Convert research papers or long videos into interactive learning tools
- Analyze sports footage and generate personalized training plans
- Explore new web experiences in AI Mode with dynamic, generative UIs

Build Anything

Gemini 3 is Google’s most powerful vibe coding and agentic coding model:
- Exceptional zero-shot generation and complex prompt handling
- Top of WebDev Arena (1487 Elo)
- Advanced tool use (54.2% on Terminal-Bench 2.0)
- Strong agentic performance (76.2% on SWE-bench Verified)

Plan Anything

Gemini 3 excels at long-horizon planning and consistent multi-step execution:
- #1 on Vending-Bench 2 for year-long simulated planning
- Capable of handling real-world tasks like inbox organization or service booking
- Gemini Agent now available to Google AI Ultra subscribers

Responsible AI & Safety

Gemini 3 is Google’s most secure model yet and has undergone extensive safety evaluations. Highlights include:
- Reduced sycophancy and stronger prompt-injection resistance
- Enhanced defenses against cyber misuse
- External evaluations by Apollo, Vaultis, and Dreadnode, plus partnerships with bodies like the UK AISI

Looking Ahead

Gemini 3 marks the start of a new chapter focused on advancing intelligence, agents, and personalization. Google will continue iterating rapidly and looks forward to seeing what users build with it.

Model created 8 months, 1 week ago

Model updated 1 month, 1 week ago