gemini-2.5-flash-lite by openrouter - AI Model Details, Pricing, and Performance Metrics


Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the [Reasoning API parameter](https://openrouter.ai/docs/use-cases/reasoning-tokens) to selectively trade off cost for intelligence.
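As an illustration of that trade-off, the sketch below builds an OpenRouter-style chat-completions payload with thinking toggled on or off. The `reasoning` field follows OpenRouter's documented parameter; the rest of the payload is the standard OpenAI-compatible shape, and the prompts are made-up examples:

```python
def build_payload(prompt: str, enable_thinking: bool) -> dict:
    """Build a chat-completions payload for Gemini 2.5 Flash-Lite.

    Thinking is off by default for this model; including the `reasoning`
    parameter opts back in, trading latency and cost for quality.
    """
    payload = {
        "model": "google/gemini-2.5-flash-lite",
        "messages": [{"role": "user", "content": prompt}],
    }
    if enable_thinking:
        # OpenRouter's reasoning parameter; effort can be "low", "medium", or "high".
        payload["reasoning"] = {"effort": "medium"}
    return payload

# Fast default: no thinking tokens are generated.
fast = build_payload("Summarize TCP slow start in one sentence.", False)
# Selectively enable reasoning for harder prompts.
deep = build_payload("Prove that sqrt(2) is irrational.", True)
```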

Released: Jun 17, 2025
Knowledge cutoff: Jan 1, 2025
License: CC-BY-4.0
Context window: 1,048,576 tokens
Input price: $0.10 / 1M tokens
Output price: $0.40 / 1M tokens
Capabilities: tools
Accepts: text, image
Returns: text
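At those rates, per-request cost is straightforward to estimate. A minimal sketch (the token counts in the example are made up):

```python
INPUT_PRICE = 0.10 / 1_000_000   # USD per input token ($0.10 / 1M)
OUTPUT_PRICE = 0.40 / 1_000_000  # USD per output token ($0.40 / 1M)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed rates."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# Example: a 2,000-token prompt with a 500-token reply.
# 2000 * $0.0000001 + 500 * $0.0000004 = $0.0004
cost = request_cost(2_000, 500)
```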

Access gemini-2.5-flash-lite through LangDB AI Gateway

Integrate with Google's gemini-2.5-flash-lite and 250+ other models through a unified API. Monitor usage, control costs, and enhance security.

Performance (TPS): 4,229.95 tokens/s

Benchmark Tests
| Benchmark | Score | Category |
|---|---|---|
| AIME | 50.0 | Mathematics |
| AA Coding Index | 28.9 | Programming |
| AAII | 30.1 | General |
| AA Math Index | 71.3 | Mathematics |
| GPQA | 64.6 | STEM (Physics, Chemistry, Biology) |
| HLE | 3.7 | General Knowledge |
| LiveCodeBench | 40.0 | Programming |
| MATH-500 | 92.6 | Mathematics |
| MMLU-Pro | 72.4 | General Knowledge |
| MMMU | 72.9 | General Knowledge |
| SciCode | 17.7 | Scientific |

Code Examples

Integration samples and API usage
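As a starting point, here is a minimal stdlib-only sketch of calling this model through OpenRouter's OpenAI-compatible endpoint. The URL and model slug come from this listing; the API key is a placeholder, and `extract_text` assumes the standard OpenAI-style response shape:

```python
import json
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"
API_KEY = "sk-or-..."  # placeholder; substitute a real key

def extract_text(response: dict) -> str:
    """Pull the assistant message out of an OpenAI-style response body."""
    return response["choices"][0]["message"]["content"]

def chat(prompt: str) -> str:
    """One-shot completion against gemini-2.5-flash-lite."""
    body = json.dumps({
        "model": "google/gemini-2.5-flash-lite",
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return extract_text(json.load(resp))
```

The same payload works unchanged against any OpenAI-compatible gateway (such as LangDB) by swapping `API_URL` and the key.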