Official

anthropic / claude-3.5-haiku

Anthropic's fastest, most cost-effective model, with a 200K token context window (claude-3-5-haiku-20241022)

  • Public
  • 1.2M runs
  • Priced per token
  • Commercial use
  • License
Iterate in playground

Input

*string
Shift + Return to add a new line

Input prompt

string
Shift + Return to add a new line

System prompt

Default: ""

integer
(minimum: 1, maximum: 8192)

Maximum number of output tokens

Default: 8192

Output

Here's a meta-haiku about haiku itself: Syllables dancing Five, seven, five—poetry Speaks of its own form
Generated in
Input tokens
13
Output tokens
35
Tokens per second
29.48 tokens / second
Time to first token

Pricing

Official model
Pricing for official models works differently from other models. Instead of being billed by time, you’re billed by input and output, making pricing more predictable.

This model is priced by how many input tokens are sent and how many output tokens are generated.

TypePer unitPer $1
Input
$1.00 / 1M tokens
or
1M tokens / $1
Output
$5.00 / 1M tokens
or
200K tokens / $1

For example, for $10 you can run around 2,857 predictions where the input is a sentence or two (15 tokens) and the output is a few paragraphs (700 tokens).

Check out our docs for more information about how per-token pricing works on Replicate.

Readme

Claude 3.5 Haiku

Claude 3.5 Haiku is Anthropic’s fastest model, delivering advanced coding, tool use, and reasoning capabilities at an accessible price point.

This Replicate model is built on claude-3-5-haiku-20241022, hosted on Anthropic and Vertex APIs.

Key features

  • Improved performance across all skill sets compared to Claude 3 Haiku
  • Surpasses Claude 3 Opus on many intelligence benchmarks
  • Enhanced instruction following and tool use accuracy
  • Rapid inference speed and response times
  • Cost-effective pricing structure
  • Strong reasoning capabilities
  • Accurate code generation with multi-turn refinement
  • Natural writing style and enhanced conversational abilities

Use cases

  • Code completions and development workflows
  • Interactive chatbots and customer service
  • Data extraction and automated labeling
  • Real-time content moderation
  • High-volume user interactions
  • Personalized experience generation

Limitations

  • No persistent memory between conversations
  • Cannot access real-time information
  • May occasionally produce inaccurate responses
  • Limited to predefined training data
  • No ability to learn from interactions
  • Performance may vary with complex or ambiguous tasks

Safety

  • Extensive safety evaluations across multiple languages
  • Enhanced handling of sensitive content
  • Rigorous safety standards maintained
  • Comprehensive policy domain testing
  • Regular safety assessments

Getting started

To use Claude 3.5 Haiku via Replicate:

import replicate

output = replicate.run(
    "anthropic/claude-3.5-haiku",
    input={
        "prompt": "Your prompt here"
    }
)

Privacy policy & license

Data from this model is sent to Anthropic and Google Cloud Vertex AI.

Usage of this model is subject to Anthropic’s terms of service. Please refer to their website for full terms and conditions.