kimi-vl-a3b-thinking:free

completions

Kimi-VL is a lightweight Mixture-of-Experts vision-language model that activates only 2.8B parameters per step while delivering strong performance on multimodal reasoning and long-context tasks. The Kimi-VL-A3B-Thinking variant, fine-tuned with chain-of-thought and reinforcement learning, excels in math and visual reasoning benchmarks like MathVision, MMMU, and MathVista, rivaling much larger models such as Qwen2.5-VL-7B and Gemma-3-12B. It supports 128K context and high-resolution input via its MoonViT encoder.

Input:Free

Output:Free

Context:131072 tokens

text

image

text

Access kimi-vl-a3b-thinking:free through LangDB AI Gateway

Recommended

Integrate with moonshotai's kimi-vl-a3b-thinking:free and 250+ other models through a unified API. Monitor usage, control costs, and enhance security.

Unified API

Cost Optimization

Enterprise Security

Get Started Now

Free tier available • No credit card required

Instant Setup

99.9% Uptime

10,000+Monthly Requests

Code Example

Configuration

Base URL

API Keys

Headers

Project ID in header

X-Run-Id

X-Thread-Id

Model Parameters

14 available

frequency_penalty

-202

include_reasoning

logit_bias

logprobs

max_tokens

min_p

001

presence_penalty

-201.999

repetition_penalty

012

seed

stop

temperature

012

top_k

top_logprobs

top_p

011

Additional Configuration

Tools

Guards

User:

Id:

Name:

Tags:

Publicly Shared Threads0

Discover shared experiences

Shared threads will appear here, showcasing real-world applications and insights from the community. Check back soon for updates!

Share your threads to help others

Popular Models10

claude-sonnet-4
anthropic
Our high-performance model with exceptional reasoning and efficiency
Input:$3 / 1M tokens
Output:$15 / 1M tokens
Context:200K tokens
tools
text
image
text
claude-opus-4
anthropic
Our most capable and intelligent model yet. Claude Opus 4 sets new standards in complex reasoning and advanced coding
Input:$15 / 1M tokens
Output:$75 / 1M tokens
Context:200K tokens
tools
text
image
text
gpt-4.1
openai
GPT-4.1 is OpenAI's flagship model for complex tasks. It is well suited for problem solving across domains.
Input:$2 / 1M tokens
Output:$8 / 1M tokens
Context:1047576 tokens
tools
text
image
text
gemini-2.5-pro-preview
gemini
Gemini 2.5 Pro Experimental is Google's state-of-the-art thinking model, capable of reasoning over complex problems in code, math, and STEM, as well as analyzing large datasets, codebases, and documents using long context.
Input:$1.25 / 1M tokens
Output:$10 / 1M tokens
Context:1M tokens
tools
text
image
audio
video
text
grok-4
xai
Grok 4 is the latest and greatest flagship model, offering unparalleled performance in natural language, math and reasoning - the perfect jack of all trades.
Input:$3 / 1M tokens
Output:$15 / 1M tokens
Context:256K tokens
tools
text
text
gemini-2.5-flash-preview
gemini
Google's best model in terms of price-performance, offering well-rounded capabilities. Gemini 2.5 Flash rate limits are more restricted since it is an experimental / preview model.
Input:$0.15 / 1M tokens
Output:$0.6 / 1M tokens
Context:1M tokens
tools
text
image
audio
video
text
gemini-2.0-flash
gemini
Google's most capable multi-modal model with great performance across all tasks, with a 1 million token context window, and built for the era of Agents.
Input:$0.1 / 1M tokens
Output:$0.4 / 1M tokens
Context:1M tokens
tools
text
image
audio
video
text
claude-3.7-sonnet
anthropic
Intelligent model, with visible step‑by‑step reasoning
Input:$3 / 1M tokens
Output:$15 / 1M tokens
Context:200K tokens
tools
text
text
image
gemini-2.0-flash-lite
gemini
Google's smallest and most cost effective model, built for at scale usage.
Input:$0.07 / 1M tokens
Output:$0.3 / 1M tokens
Context:1M tokens
text
image
audio
video
text
gpt-4.1-mini
openai
GPT-4.1 mini provides a balance between intelligence, speed, and cost that makes it an attractive model for many use cases.
Input:$0.4 / 1M tokens
Output:$1.6 / 1M tokens
Context:1047576 tokens
tools
text
image
text