mercury-2 by openrouter - AI Model Details, Pricing, and Performance Metrics

inception

mercury-2

completions
by openrouter

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving >1,000 tokens/sec on standard GPUs. Mercury 2 is 5x+ faster than leading speed-optimized LLMs like Claude 4.5 Haiku and GPT 5 Mini, at a fraction of the cost. Mercury 2 supports tunable reasoning levels, 128K context, native tool use, and schema-aligned JSON output. Built for coding workflows where latency compounds, real-time voice/search, and agent loops. OpenAI API compatible. Read more in the [blog post](https://www.inceptionlabs.ai/blog/introducing-mercury-2).
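Because the model advertises OpenAI API compatibility and tunable reasoning levels, a request can be sketched as a standard chat-completions payload. The model id and the `reasoning_effort` field below are assumptions based on this description, not confirmed parameter names; check the provider's documentation for the exact knobs:

```python
import json

# Sketch of an OpenAI-style chat completion request for Mercury 2.
# The model id and the "reasoning_effort" field are assumptions drawn
# from the page's claims (OpenAI API compatible, tunable reasoning).
payload = {
    "model": "inception/mercury-2",   # hypothetical model id
    "messages": [
        {"role": "user", "content": "Write a binary search in Python."}
    ],
    "reasoning_effort": "low",        # assumed name of the reasoning knob
    "max_tokens": 512,
}

# An HTTP POST of this body to the provider's /v1/chat/completions
# endpoint, with an Authorization bearer header, would return an
# OpenAI-style response object.
body = json.dumps(payload)
print(body[:60])
```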

Released: Feb 24, 2026
Knowledge cutoff: Aug 28, 2025
License: Proprietary
Context: 128K tokens
Input: $0.25 / 1M tokens
Output: $0.75 / 1M tokens
Cached input: $0.03 / 1M tokens
Capabilities: tools, reasoning
Accepts: text
Returns: text

Access mercury-2 through LangDB AI Gateway

Recommended

Integrate with inception's mercury-2 and 250+ other models through a unified API. Monitor usage, control costs, and enhance security.

Unified API
Cost Optimization
Enterprise Security
Get Started Now

Free tier available • No credit card required

Instant Setup
99.9% Uptime
10,000+ Monthly Requests

Category Scores

Benchmark Tests

| Benchmark | Score | Category |
| --- | --- | --- |
| HLE | 15.5 | General Knowledge |
| GPQA | 75.5 | STEM (Physics, Chemistry, Biology) |
| SciCode | 38.7 | Scientific |
| AA Coding Index | 30.6 | Programming |
| AAII | 32.8 | General |

Code Examples

Integration samples and API usage
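Since the model claims schema-aligned JSON output through an OpenAI-compatible interface, one integration sample is a request carrying a `response_format` of type `json_schema`. The schema, model id, and field names here are illustrative assumptions; verify them against the provider's API reference:

```python
import json

# Sketch of requesting schema-aligned JSON output from Mercury 2 via the
# OpenAI-style "response_format" field. The model id is a placeholder and
# the schema is a made-up example for illustration.
schema = {
    "type": "object",
    "properties": {
        "language": {"type": "string"},
        "loc": {"type": "integer"},
    },
    "required": ["language", "loc"],
}

request = {
    "model": "inception/mercury-2",  # hypothetical model id
    "messages": [
        {"role": "user", "content": "Summarize this repository as JSON."}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "repo_summary", "schema": schema},
    },
}

# POSTing this body to the gateway's /v1/chat/completions endpoint should
# constrain the model's reply to the schema above.
print(json.dumps(request, indent=2)[:80])
```

With this shape, the response's message content can be parsed directly with `json.loads` and validated against the same schema.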