deepseek-chat-v3.1

completions

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase long-context training process, reaching up to 128K tokens, and uses FP8 microscaling for efficient inference. Users can control the reasoning behaviour with the `reasoning` `enabled` boolean. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config) The model improves tool use, code generation, and reasoning efficiency, achieving performance comparable to DeepSeek-R1 on difficult benchmarks while responding more quickly. It supports structured tool calling, code agents, and search agents, making it suitable for research, coding, and agentic workflows. It succeeds the [DeepSeek V3-0324](/deepseek/deepseek-chat-v3-0324) model and performs well on a variety of tasks.

Input:$0.64 / 1M tokens

Output:$1.65 / 1M tokens

Context:163840 tokens

text

Access deepseek-chat-v3.1 through LangDB AI Gateway

Recommended

Integrate with deepseek's deepseek-chat-v3.1 and 250+ other models through a unified API. Monitor usage, control costs, and enhance security.

Unified API

Cost Optimization

Enterprise Security

Get Started Now

Free tier available • No credit card required

Instant Setup

99.9% Uptime

10,000+Monthly Requests

Code Example

Configuration

Base URL

API Keys

Headers

Project ID in header

X-Run-Id

X-Thread-Id

Model Parameters

10 available

frequency_penalty

-202

max_tokens

min_p

001

presence_penalty

-201.999

repetition_penalty

012

seed

stop

temperature

012

top_k

top_p

011

Additional Configuration

Tools

Guards

User:

Id:

Name:

Tags:

Publicly Shared Threads0

Discover shared experiences

Shared threads will appear here, showcasing real-world applications and insights from the community. Check back soon for updates!

Share your threads to help others

Popular Models10

deepseek-chat-v3-0324
deepseek
DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the flagship chat model family from the DeepSeek team. It succeeds the [DeepSeek V3](/deepseek/deepseek-chat-v3) model and performs really well on a variety of tasks.
Input:$0.79 / 1M tokens
Output:$1.15 / 1M tokens
Context:163840 tokens
text
text
llama-3.2-1b-instruct
meta-llama
Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing natural language tasks, such as summarization, dialogue, and multilingual text analysis. Its smaller size allows it to operate efficiently in low-resource environments while maintaining strong task performance. Supporting eight core languages and fine-tunable for more, Llama 1.3B is ideal for businesses or developers seeking lightweight yet powerful AI solutions that can operate in diverse multilingual settings without the high computational demand of larger models. Click here for the [original model card](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/MODEL_CARD.md). Usage of this model is subject to [Meta's Acceptable Use Policy](https://www.llama.com/llama3/use-policy/).
Input:$0.01 / 1M tokens
Output:$0.07 / 1M tokens
Context:131072 tokens
text
text
gpt-4o-mini
openai
GPT-4o mini (o for omni) is a fast, affordable small model for focused tasks. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). It is ideal for fine-tuning, and model outputs from a larger model like GPT-4o can be distilled to GPT-4o-mini to produce similar results at lower cost and latency.The knowledge cutoff for GPT-4o-mini models is October, 2023.
Input:$0.15 / 1M tokens
Output:$0.6 / 1M tokens
Context:128K tokens
tools
text
image
text
claude-sonnet-4
anthropic
Our high-performance model with exceptional reasoning and efficiency
Input:$3 / 1M tokens
Output:$15 / 1M tokens
Context:200K tokens
tools
text
image
text
claude-opus-4
anthropic
Our most capable and intelligent model yet. Claude Opus 4 sets new standards in complex reasoning and advanced coding
Input:$15 / 1M tokens
Output:$75 / 1M tokens
Context:200K tokens
tools
text
image
text
gemini-2.5-pro
gemini
Gemini 2.5 Pro is our most advanced reasoning Gemini model, capable of solving complex problems.
Input:$1.25 / 1M tokens
Output:$10 / 1M tokens
Context:1M tokens
tools
text
image
audio
video
text
gpt-4.1
openai
GPT-4.1 is OpenAI's flagship model for complex tasks. It is well suited for problem solving across domains.
Input:$2 / 1M tokens
Output:$8 / 1M tokens
Context:1047576 tokens
tools
text
image
text
gemini-2.5-pro-preview
gemini
Gemini 2.5 Pro Experimental is Google's state-of-the-art thinking model, capable of reasoning over complex problems in code, math, and STEM, as well as analyzing large datasets, codebases, and documents using long context.
Input:$1.25 / 1M tokens
Output:$10 / 1M tokens
Context:1M tokens
tools
text
image
audio
video
text
grok-4
xai
Grok 4 is the latest and greatest flagship model, offering unparalleled performance in natural language, math and reasoning - the perfect jack of all trades.
Input:$3 / 1M tokens
Output:$15 / 1M tokens
Context:256K tokens
tools
text
text
gemini-2.5-flash
gemini
Google's best model in terms of price-performance, offering well-rounded capabilities.
Input:$0.15 / 1M tokens
Output:$0.6 / 1M tokens
Context:1M tokens
tools
text
image
audio
video
text