Gemini 2.0 Flash

Available frontier

Google's fastest multimodal model with native tool use

Released December 11, 2024

Context Window
TTFT: N/A
Speed: N/A
Max Output
Training Cutoff: Aug 2024

Last tested: 2026-01-15

About

Gemini 2.0 Flash is Google's multimodal AI model designed for speed and efficiency. It offers native image and audio output, real-time streaming, and built-in Google Search grounding. Native tool use makes it well suited to agentic tasks, and it can respond in text, images, and audio.

Capabilities

vision audio function-calling streaming code-execution grounding

Pricing

Input: $0.075/M tokens
Output: $0.30/M tokens
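
To make the listed rates concrete, here is a minimal sketch of a per-request cost estimate. It assumes "/M" means per million tokens and that input and output tokens are billed linearly at the two rates above. The function name `estimate_cost_usd` and the example token counts are illustrative, not part of any Google SDK.

```python
# Per-million-token rates copied from the Pricing section (assumed to be USD).
INPUT_USD_PER_M = 0.075
OUTPUT_USD_PER_M = 0.30


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 20,000 prompt tokens and 1,000 generated tokens.
    print(f"${estimate_cost_usd(20_000, 1_000):.6f}")
```

Because output tokens cost four times as much as input tokens at these rates, long generations tend to dominate the bill even when prompts are large.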

Details

Overview

Gemini 2.0 Flash is the successor to Gemini 1.5 Flash. It keeps that model's focus on speed and efficiency while adding multimodal output in the form of native image and audio generation.

Key Features

Native Multimodal Output

Previous versions accepted multimodal input but produced only text. Gemini 2.0 Flash can also generate images and audio natively, enabling more creative and dynamic applications.

Real-Time Streaming

The Multimodal Live API supports real-time interaction over streamed audio and video, which suits conversational AI assistants and other interactive applications.
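
For streamed responses, latency figures like the TTFT and Speed fields on this page can be measured from any iterator of chunks. The sketch below assumes TTFT means time to first token and approximates Speed as chunks per second, since exact token counts would need a tokenizer or usage metadata. `measure_stream` and `fake_stream` are hypothetical helpers, and the fake stream stands in for a real streaming response.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float, str]:
    """Consume a stream and return (ttft_seconds, chunks_per_second, text)."""
    start = clock()
    first = None
    parts = []
    for chunk in chunks:
        if first is None:
            first = clock()  # timestamp of the first chunk to arrive
        parts.append(chunk)
    end = clock()
    if first is None:
        raise ValueError("stream produced no chunks")
    elapsed = end - start
    rate = len(parts) / elapsed if elapsed > 0 else float("inf")
    return first - start, rate, "".join(parts)


def fake_stream() -> Iterator[str]:
    """Stand-in for a streaming response; sleeps to simulate latency."""
    for piece in ["Hello", ", ", "world"]:
        time.sleep(0.01)
        yield piece


if __name__ == "__main__":
    ttft, rate, text = measure_stream(fake_stream())
    print(f"TTFT={ttft:.3f}s rate={rate:.1f} chunks/s text={text!r}")
```

Passing the clock as a parameter keeps the measurement deterministic under test; in production the default `time.perf_counter` is usually what you want.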

Built-In Tool Use

Gemini 2.0 Flash can call tools natively, including Google Search grounding, code execution, and third-party functions, which reduces the need for a separate orchestration layer.
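
On the application side, third-party function calling usually means mapping a call requested by the model onto local code and sending the result back. Here is a minimal sketch of that dispatch step. It assumes the requested call arrives as a dictionary of the form `{"name": ..., "args": {...}}`. The registry, the `tool` decorator, `dispatch`, and `get_order_status` are all hypothetical names, and no network or SDK calls are made.

```python
from typing import Any, Callable, Dict

# Registry of local functions the model is allowed to call (hypothetical).
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def get_order_status(order_id: str) -> dict:
    # Stand-in for a real backend lookup.
    return {"order_id": order_id, "status": "shipped"}


def dispatch(function_call: dict) -> dict:
    """Run one requested call and wrap the outcome for the next model turn.

    `function_call` is assumed to look like {"name": str, "args": dict}.
    """
    name = function_call.get("name")
    args = function_call.get("args") or {}
    fn = TOOLS.get(name)
    if fn is None:
        return {"name": name, "response": {"error": f"unknown tool: {name}"}}
    try:
        return {"name": name, "response": {"result": fn(**args)}}
    except TypeError as exc:
        # Typically raised when arguments are missing or unexpected.
        return {"name": name, "response": {"error": str(exc)}}


if __name__ == "__main__":
    print(dispatch({"name": "get_order_status", "args": {"order_id": "A1"}}))
```

Returning errors as data rather than raising lets the model see what went wrong and retry with corrected arguments.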

Performance

According to Google's launch announcement, Gemini 2.0 Flash outperforms Gemini 1.5 Pro on key benchmarks at twice the speed. The model excels at:

Complex reasoning tasks
Code generation and analysis
Multimodal understanding
Real-time conversational AI

Use Cases

AI Agents: Build autonomous agents that can browse, search, and execute code
Creative Applications: Generate images and audio alongside text responses
Real-Time Assistants: Voice and video-enabled conversational interfaces
Enterprise Search: Grounded responses with Google Search integration

Provider

Live

Google AI: Gemini models with massive context and multimodal capabilities
Models Hosted
API Style: Google Cloud / AI Studio
Compute Location

Other Google AI Models

Preview

Gemini 2.0 Flash Thinking: Experimental reasoning model with transparent thought process
Context: 1M
Speed: N/A
TTFT: N/A

Live

Gemini 1.5 Pro: Google's flagship model with 2M token context window
Context: 2M
Speed: N/A
TTFT: N/A

Live

Gemini 1.5 Flash: Fast, efficient multimodal model for high-volume tasks
Context: 1M
Speed: N/A
TTFT: N/A
