Back to providers page
Try this model
openrouter

shisa-v2-llama3.3-70b:free

completions

Shisa V2 Llama 3.3 70B is a bilingual Japanese-English chat model fine-tuned by Shisa.AI on Meta’s Llama-3.3-70B-Instruct base. It prioritizes Japanese language performance while retaining strong English capabilities. The model was optimized entirely through post-training, using a refined mix of supervised fine-tuning (SFT) and DPO datasets including regenerated ShareGPT-style data, translation tasks, roleplaying conversations, and instruction-following prompts. Unlike earlier Shisa releases, this version avoids tokenizer modifications or extended pretraining. Shisa V2 70B achieves leading Japanese task performance across a wide range of custom and public benchmarks, including JA MT Bench, ELYZA 100, and Rakuda. It supports a 128K token context length and integrates smoothly with inference frameworks like vLLM and SGLang. While it inherits safety characteristics from its base model, no additional alignment was applied. The model is intended for high-performance bilingual chat, instruction following, and translation tasks across JA/EN.

Input:Free
Output:Free
Context:32768 tokens
text
text

Access shisa-v2-llama3.3-70b:free through LangDB AI Gateway

Recommended

Integrate with shisa-ai's shisa-v2-llama3.3-70b:free and 250+ other models through a unified API. Monitor usage, control costs, and enhance security.

Unified API
Cost Optimization
Enterprise Security
Get Started Now

Free tier available • No credit card required

Instant Setup
99.9% Uptime
10,000+Monthly Requests
Code Example
Configuration
Base URL
API Keys
Headers
Project ID in header
X-Run-Id
X-Thread-Id
Model Parameters
13 available
frequency_penalty
-202
logit_bias
logprobs
max_tokens
min_p
001
presence_penalty
-201.999
repetition_penalty
012
seed
stop
temperature
012
top_k
top_logprobs
top_p
011
Additional Configuration
Tools
Guards
User:
Id:
Name:
Tags:
Publicly Shared Threads0

Discover shared experiences

Shared threads will appear here, showcasing real-world applications and insights from the community. Check back soon for updates!

Share your threads to help others
Popular Models10
  • openai
    gpt-4o-mini-search-preview
    openai
    GPT-4o mini Search Preview is a specialized model trained to understand and execute web search queries with the Chat Completions API.
    Input:$0.15 / 1M tokens
    Output:$0.6 / 1M tokens
    Context:128K tokens
    text
    text
  • openai
    gpt-4o-search-preview
    openai
    GPT-4o Search Preview is a specialized model trained to understand and execute web search queries with the Chat Completions API
    Input:$2.5 / 1M tokens
    Output:$10 / 1M tokens
    Context:128K tokens
    text
    text
  • anthropic
    claude-sonnet-4
    anthropic
    Our high-performance model with exceptional reasoning and efficiency
    Input:$3 / 1M tokens
    Output:$15 / 1M tokens
    Context:200K tokens
    tools
    text
    image
    text
  • anthropic
    claude-opus-4
    anthropic
    Our most capable and intelligent model yet. Claude Opus 4 sets new standards in complex reasoning and advanced coding
    Input:$15 / 1M tokens
    Output:$75 / 1M tokens
    Context:200K tokens
    tools
    text
    image
    text
  • openai
    gpt-4.1
    openai
    GPT-4.1 is OpenAI's flagship model for complex tasks. It is well suited for problem solving across domains.
    Input:$2 / 1M tokens
    Output:$8 / 1M tokens
    Context:1047576 tokens
    tools
    text
    image
    text
  • gemini
    gemini-2.5-pro-preview
    gemini
    Gemini 2.5 Pro Experimental is Google's state-of-the-art thinking model, capable of reasoning over complex problems in code, math, and STEM, as well as analyzing large datasets, codebases, and documents using long context.
    Input:$1.25 / 1M tokens
    Output:$10 / 1M tokens
    Context:1M tokens
    tools
    text
    image
    audio
    video
    text
  • gemini
    gemini-2.5-flash-preview
    gemini
    Google's best model in terms of price-performance, offering well-rounded capabilities. Gemini 2.5 Flash rate limits are more restricted since it is an experimental / preview model.
    Input:$0.15 / 1M tokens
    Output:$0.6 / 1M tokens
    Context:1M tokens
    tools
    text
    image
    audio
    video
    text
  • gemini
    gemini-2.0-flash
    gemini
    Google's most capable multi-modal model with great performance across all tasks, with a 1 million token context window, and built for the era of Agents.
    Input:$0.1 / 1M tokens
    Output:$0.4 / 1M tokens
    Context:1M tokens
    tools
    text
    image
    audio
    video
    text
  • anthropic
    claude-3.7-sonnet
    anthropic
    Intelligent model, with visible step‑by‑step reasoning
    Input:$3 / 1M tokens
    Output:$15 / 1M tokens
    Context:200K tokens
    tools
    text
    text
    image
  • gemini
    gemini-2.0-flash-lite
    gemini
    Google's smallest and most cost effective model, built for at scale usage.
    Input:$0.07 / 1M tokens
    Output:$0.3 / 1M tokens
    Context:1M tokens
    text
    image
    audio
    video
    text

Related AI Model Resources

Explore more AI models, providers, and integration options:

  • Browse All AI Models
  • AI Providers Directory
  • More from openrouter
  • MCP Servers
  • Integration Documentation
  • Pricing & Plans
  • AI Industry Blog