Gemini is available as several models that differ in capabilities and target use cases. The main Gemini models are:
Gemini 2.5 Pro (experimental)
The most powerful thinking model with the highest answer accuracy and state-of-the-art performance
Functional Features
Accepts audio, images, video, and text as input and returns text responses
Tackles difficult problems, analyzes large databases, and more
Best for complex coding, reasoning, and multimodal understanding
Gemini 2.0 Flash
Our latest multimodal model with next-generation features and enhanced capabilities
Functional Features
Accepts audio, images, video, and text as input and returns text responses
Generates code and images, extracts data, analyzes files, generates charts, and more
Low latency and high performance, built to power agentic experiences
Gemini 2.0 Flash-Lite
A Gemini 2.0 Flash model optimized for cost-effectiveness and low latency
Functional Features
Accepts audio, images, video, and text as input and returns text responses
Outperforms 1.5 Flash on most benchmarks
A 1 million token context window and multimodal input, like Flash 2.0
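To make the multimodal input described above concrete, here is a minimal Python sketch that builds, but does not send, a request combining a text prompt with an image. The endpoint URL, the `generateContent` method name, and the `inline_data` field layout are assumptions based on the public Gemini REST API and do not come from this article; `build_request` is a hypothetical helper name.

```python
import base64
import json
from typing import Optional, Tuple

# Assumption: base URL of the public Gemini REST API (not stated in this article).
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_request(model_id: str, prompt: str,
                  image_bytes: Optional[bytes] = None,
                  mime_type: str = "image/png") -> Tuple[str, str]:
    """Return (url, json_body) for a text or text+image request. Nothing is sent."""
    parts = [{"text": prompt}]
    if image_bytes is not None:
        # Binary input is carried inline as base64 next to the text part.
        parts.append({
            "inline_data": {
                "mime_type": mime_type,
                "data": base64.b64encode(image_bytes).decode("ascii"),
            }
        })
    url = f"{BASE_URL}/models/{model_id}:generateContent"
    body = json.dumps({"contents": [{"parts": parts}]})
    return url, body


if __name__ == "__main__":
    url, body = build_request("gemini-2.0-flash-lite", "Describe this image.", b"\x89PNG")
    print(url)
    print(body)
```

Sending the request additionally needs an API key and an HTTP client, both of which are left out here on purpose.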
Model variants
The Gemini API provides different models optimized for specific use cases. Here is a brief introduction to the available Gemini variants:
Model variant | Input(s) | Output | Optimized for |
---|---|---|---|
Gemini 2.5 Pro Preview (`gemini-2.5-pro-preview-03-25`) | Audio, images, video, and text | Text | Enhanced thinking and reasoning, multimodal understanding, advanced coding, and more |
Gemini 2.0 Flash (`gemini-2.0-flash`) | Audio, images, video, and text | Text, images (experimental), and audio (coming soon) | Next-generation features, speed, thinking, real-time streaming, and multimodal generation |
Gemini 2.0 Flash-Lite (`gemini-2.0-flash-lite`) | Audio, images, video, and text | Text | Cost efficiency and low latency |
Gemini 1.5 Flash (`gemini-1.5-flash`) | Audio, images, video, and text | Text | Fast and versatile performance across a variety of tasks |
Gemini 1.5 Flash-8B (`gemini-1.5-flash-8b`) | Audio, images, video, and text | Text | High-volume, lower-intelligence tasks |
Gemini 1.5 Pro (`gemini-1.5-pro`) | Audio, images, video, and text | Text | Complex reasoning tasks that require more intelligence |
Gemini Embedding (`gemini-embedding-exp`) | Text | Text embeddings | Measuring the relatedness of text strings |
Imagen 3 (`imagen-3.0-generate-002`) | Text | Images | Our most advanced image generation model |
Veo 2 (`veo-2.0-generate-001`) | Text, images | Video | High-quality video generation |
Gemini 2.0 Flash Live (`gemini-2.0-flash-live-001`) | Audio, video, and text | Text, audio | Low-latency bidirectional voice and video interactions |
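
The table can also be read as a lookup from required modalities to candidate models. The Python sketch below encodes its rows as data and filters them. The `Variant` class and the `candidates` function are hypothetical names introduced only for illustration, and outputs marked experimental or coming soon in the table are deliberately left out.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Variant:
    model_id: str
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]


def _v(model_id: str, inputs: str, outputs: str) -> Variant:
    # Modalities are written as space-separated words for brevity.
    return Variant(model_id, frozenset(inputs.split()), frozenset(outputs.split()))


ALL = "audio image video text"

# Rows transcribed from the table; experimental and upcoming outputs are omitted.
CATALOG = [
    _v("gemini-2.5-pro-preview-03-25", ALL, "text"),
    _v("gemini-2.0-flash", ALL, "text"),
    _v("gemini-2.0-flash-lite", ALL, "text"),
    _v("gemini-1.5-flash", ALL, "text"),
    _v("gemini-1.5-flash-8b", ALL, "text"),
    _v("gemini-1.5-pro", ALL, "text"),
    _v("gemini-embedding-exp", "text", "embedding"),
    _v("imagen-3.0-generate-002", "text", "image"),
    _v("veo-2.0-generate-001", "text image", "video"),
    _v("gemini-2.0-flash-live-001", "audio video text", "text audio"),
]


def candidates(inputs: Iterable[str], output: str) -> List[str]:
    """Return IDs of models that accept every requested input and produce `output`."""
    need = frozenset(inputs)
    return [v.model_id for v in CATALOG if need <= v.inputs and output in v.outputs]


if __name__ == "__main__":
    print(candidates({"text"}, "video"))   # ['veo-2.0-generate-001']
    print(candidates({"audio"}, "audio"))  # ['gemini-2.0-flash-live-001']
```

A filter like this only narrows the choice by modality. The "Optimized for" column still decides between models that share the same inputs and outputs.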