GPT-4o mini

Available standard

Small model, big intelligence

Released July 18, 2024

API Documentation View Provider

Context Window

128K

TTFT

N/A

Speed

N/A

Max Output

16,384

Training Cutoff

Oct 2023

Last tested: 2026-01-15

About

GPT-4o mini is OpenAI's most cost-efficient small model, designed to replace GPT-3.5 Turbo. It offers significantly improved intelligence over its predecessor while maintaining fast response times and extremely low pricing, making it ideal for high-volume applications.

Capabilities

vision function-calling streaming json-mode structured-outputs

Pricing

Input: $0.15/M
Output: $0.60/M

Live pricing Static pricing

Batch

$0.075/M

per 1M tokens

Details

Overview

GPT-4o mini delivers exceptional value for developers seeking high-quality AI capabilities at minimal cost. As OpenAI’s recommended replacement for GPT-3.5 Turbo, it provides substantially better performance across reasoning, math, coding, and multimodal tasks while costing over 60% less.

Key Features

Exceptional Value: Industry-leading price-to-performance ratio at $0.15 per million input tokens
Vision Support: Process images alongside text for multimodal applications
Full Feature Parity: Access function calling, JSON mode, and structured outputs
High Throughput: Optimized for low-latency, high-volume production workloads
Extended Context: Full 128K context window matches larger models

Use Cases

GPT-4o mini is perfect for chatbots, content generation, summarization, classification, and any application requiring intelligent responses at scale. Its low cost makes it viable for consumer-facing products with millions of users.

Performance

Despite its small size, GPT-4o mini scores 82% on MMLU, significantly outperforming GPT-3.5 Turbo’s 70%. It handles complex instructions, maintains conversation context, and produces high-quality outputs reliably.