kimi-linear-48b-a3b-instruct by openrouter - AI Model Details, Pricing, and Performance Metrics

moonshotai
kimi-linear-48b-a3b-instruct
Try
moonshotai

kimi-linear-48b-a3b-instruct

completions
byopenrouter

Kimi Linear is a hybrid linear attention architecture that outperforms traditional full attention methods across various contexts, including short, long, and reinforcement learning (RL) scaling regimes. At its core is Kimi Delta Attention (KDA)—a refined version of Gated DeltaNet that introduces a more efficient gating mechanism to optimize the use of finite-state RNN memory. Kimi Linear achieves superior performance and hardware efficiency, especially for long-context tasks. It reduces the need for large KV caches by up to 75% and boosts decoding throughput by up to 6x for contexts as long as 1M tokens.

Context
1048576
Input
$0.3 / 1M tokens
Output
$0.6 / 1M tokens
Accepts: text
Returns: text

Access kimi-linear-48b-a3b-instruct through LangDB AI Gateway

Recommended

Integrate with moonshotai's kimi-linear-48b-a3b-instruct and 250+ other models through a unified API. Monitor usage, control costs, and enhance security.

Unified API
Cost Optimization
Enterprise Security
Get Started Now

Free tier available • No credit card required

Instant Setup
99.9% Uptime
10,000+Monthly Requests

Code Examples

Integration samples and API usage