Anthropic

Claude 3.5 Haiku

Available standard

Fast and affordable model for high-volume production workloads

Released October 22, 2024

Context Window
200K
TTFT
N/A
Speed
N/A
Max Output
8,192
Training Cutoff
Apr 2024

Last tested: 2026-01-15

About

Claude 3.5 Haiku delivers impressive speed and capability at Anthropic's most affordable price tier. Ideal for real-time applications, chatbots, and high-throughput processing with 200K context support.

Capabilities

Vision · Function calling · Streaming

Pricing

Input
$0.80 per 1M tokens
Output
$4.00 per 1M tokens

Static pricing

Batch
$0.40 per 1M input tokens, $2.00 per 1M output tokens
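The standard and batch rates above translate into per-request cost as follows; a minimal sketch using the listed rates (the helper and workload figures are illustrative, and current pricing should be verified on Anthropic's site):

```python
# Rates from the pricing table above, in USD per million tokens.
# Illustrative only; check Anthropic's pricing page for current values.
RATES = {
    "standard": {"input": 0.80, "output": 4.00},
    "batch": {"input": 0.40, "output": 2.00},
}

def estimate_cost(input_tokens: int, output_tokens: int, tier: str = "standard") -> float:
    """Estimated USD cost for a workload at the given pricing tier."""
    r = RATES[tier]
    return (input_tokens * r["input"] + output_tokens * r["output"]) / 1_000_000

# 10,000 requests of ~2K input / 500 output tokens each:
standard = estimate_cost(2_000 * 10_000, 500 * 10_000, "standard")  # $36.00
batch = estimate_cost(2_000 * 10_000, 500 * 10_000, "batch")        # $18.00
```

At the 50% batch discount, the same workload costs exactly half, which is why high-volume, latency-tolerant jobs are usually routed through batch processing.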

Details

Overview

Claude 3.5 Haiku is Anthropic’s fastest model, optimized for production workloads requiring low latency and high throughput. Despite its speed, it maintains strong reasoning capabilities that rival previous-generation models like Claude 3 Opus on many benchmarks.

Key Strengths

Exceptional Speed: With response times under 70ms and throughput exceeding 180 tokens per second, Claude 3.5 Haiku is ideal for real-time applications where latency matters.

Cost Efficiency: At $0.80/M input tokens, it provides enterprise-grade AI capabilities at a fraction of frontier model pricing, making it viable for high-volume applications.

Capable Reasoning: Despite its speed focus, Claude 3.5 Haiku delivers surprisingly strong performance on coding, analysis, and general reasoning tasks.
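The speed figures above can be turned into a rough end-to-end latency estimate; a back-of-envelope sketch, assuming the quoted ~70 ms TTFT and ~180 tokens/s hold steady (actual numbers vary with region, load, and prompt size):

```python
def estimated_latency_s(output_tokens: int, ttft_s: float = 0.070, tps: float = 180.0) -> float:
    """Rough end-to-end latency: time-to-first-token plus generation
    time at a steady throughput. Defaults use the figures quoted above;
    they are illustrative, not guaranteed service levels."""
    return ttft_s + output_tokens / tps

# A typical 256-token chatbot reply completes in roughly 1.5 seconds:
latency = estimated_latency_s(256)
```

For short replies the TTFT term dominates, which is why the model suits interactive use even when individual responses are small.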

Use Cases

  • Customer support chatbots and virtual assistants
  • Real-time content moderation
  • High-volume document classification
  • Interactive coding assistants
  • Data extraction and processing pipelines
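For the classification-style use cases above, the request is often just a tightly constrained prompt; a minimal sketch of building one (the helper and category names are hypothetical, not part of the Anthropic SDK):

```python
# Hypothetical helper for a support-ticket classification use case.
# The category names are illustrative, not an Anthropic API feature.
CATEGORIES = ["billing", "technical", "account", "other"]

def classification_messages(inquiry: str) -> list[dict]:
    """Build a Messages-API `messages` list that asks for exactly one category."""
    prompt = (
        "Classify the customer inquiry into exactly one of: "
        + ", ".join(CATEGORIES)
        + ". Reply with the category name only.\n\n"
        + "Inquiry: " + inquiry
    )
    return [{"role": "user", "content": prompt}]

msgs = classification_messages("I was charged twice this month.")
```

The resulting list can be passed as the `messages` argument of `client.messages.create`, as in the API example below; constraining the reply to a single category name also keeps output token costs minimal.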

Performance Comparison

Claude 3.5 Haiku approaches Claude 3 Opus on several benchmarks, effectively matching it on HumanEval, while being significantly faster and more affordable:

Benchmark     Claude 3.5 Haiku   Claude 3 Opus
MMLU          74.9%              86.8%
GSM8K         88.2%              95.0%
HumanEval     84.8%              84.9%

API Integration

import anthropic

# The client reads the ANTHROPIC_API_KEY environment variable by default.
client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-3-5-haiku-20241022",  # pinned model snapshot
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Classify this customer inquiry."}
    ],
)

# The response text lives in the first content block.
print(message.content[0].text)
© bots.so — The AI Inference Model Index

bots.so aggregates publicly available model deployment information from official provider sources. We are not affiliated with any model provider. Model availability changes rapidly; always verify on official sites.