Claude Sonnet 4
Claude Sonnet 4 is a hybrid reasoning model that offers both near-instant responses and extended thinking capabilities. It significantly improves upon Claude Sonnet 3.7’s performance while maintaining efficiency for everyday use cases.
Key Capabilities
Dual Operating Modes
- Standard mode: Fast responses for typical tasks
- Extended thinking: Deep reasoning for complex problems (up to 64K tokens)
Core Features
- Advanced coding capabilities with 72.7% performance on SWE-bench
- Enhanced instruction following and steerability
- Parallel tool execution
- Memory improvements when given access to local files
- Web search integration during extended thinking (beta)
- 65% reduction in shortcut/loophole behavior compared to Sonnet 3.7
Performance Benchmarks
Coding
- SWE-bench Verified: 72.7%
- Described as “state-of-the-art” for coding tasks
Reasoning (with extended thinking)
- GPQA Diamond: 75.5% (70.0% without extended thinking)
- MMMLU: 88.2% (85.4% without extended thinking)
- MMMU: 77.6% (72.6% without extended thinking)
- AIME: 40.0% (33.1% without extended thinking)
Pricing
- Input: $3 per million tokens
- Output: $15 per million tokens
Safety and Reliability
- Implements AI Safety Level 3 (ASL-3) protections
- Extensive testing and evaluation
- Reduced tendency to use shortcuts or exploit loopholes
- Thinking summaries available (condensed from full reasoning when needed)
Use Cases
Sonnet 4 is optimized for:
- Daily coding tasks and development workflows
- Complex instruction following
- Multi-file codebase operations
- Autonomous application development
- Long-form reasoning and analysis
- Agent-based workflows
Limitations
- Does not match Claude Opus 4 performance in most domains
- Extended thinking features require paid plans
- Memory capabilities depend on developer-provided file access