phi-4-multimodal-instruct

completions

Published by: microsoft

Provider: deepinfra

Phi-4 Multimodal Instruct is a versatile 5.6B parameter foundation model that combines advanced reasoning and instruction-following capabilities across both text and visual inputs, providing accurate text outputs. The unified architecture enables efficient, low-latency inference, suitable for edge and mobile deployments. Phi-4 Multimodal Instruct supports text inputs in multiple languages including Arabic, Chinese, English, French, German, Japanese, Spanish, and more, with visual input optimized primarily for English. It delivers impressive performance on multimodal tasks involving mathematical, scientific, and document reasoning, providing developers and enterprises a powerful yet compact model for sophisticated interactive applications. For more information, see the [Phi-4 Multimodal blog post](https://azure.microsoft.com/en-us/blog/empowering-innovation-the-next-generation-of-the-phi-family/).

Released: Feb 1, 2025

Knowledge Cutoff: Jun 1, 2024

License: MIT

Context: 131072 tokens

Input: $0.05 / 1M tokens

Output: $0.10 / 1M tokens

Accepts: text, image

Returns: text
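As a rough illustration of the pricing above, the sketch below estimates the cost of a single request from its token counts. The function name `estimate_cost` and the sample token counts are hypothetical; only the two per-million-token prices come from this page, and actual billing may differ.

```python
# Prices as listed on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.05
OUTPUT_PRICE_PER_M = 0.10


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: 100,000 input tokens and 20,000 output tokens (made-up workload).
cost = estimate_cost(100_000, 20_000)
print(f"${cost:.4f}")  # prints $0.0070
```

At these rates, a prompt that fills the whole 131072-token context would cost roughly $0.0066 in input tokens.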

Access phi-4-multimodal-instruct through LangDB AI Gateway

Integrate with Microsoft's phi-4-multimodal-instruct and 250+ other models through a unified API. Monitor usage, control costs, and enhance security.

Unified API

Cost Optimization

Enterprise Security

Free tier available • No credit card required

Instant Setup

99.9% Uptime

10,000+ Monthly Requests

Benchmark Results for phi-4-multimodal-instruct

Category Performance Scores:

Vision: Score 55.10 (Top 88% - Rank #307)
Science: Score 37.10 (Top 93% - Rank #324)
Writing: Score 34.58 (Top 88% - Rank #307)
Academia: Score 20.75 (Top 95% - Rank #331)
Marketing: Score 36.75 (Top 78% - Rank #272)

Overall Performance: 36.856 average score across all categories
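The overall figure is the unweighted mean of the five category scores listed above. A minimal check, with a hypothetical dictionary name:

```python
from statistics import mean

# Category scores as listed on this page.
category_scores = {
    "Vision": 55.10,
    "Science": 37.10,
    "Writing": 34.58,
    "Academia": 20.75,
    "Marketing": 36.75,
}

# Unweighted average across the five categories.
overall = mean(list(category_scores.values()))
print(round(overall, 3))  # prints 36.856
```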

Detailed Benchmark Scores:

Benchmark	Score	Percentile	Domain
HLE	4.40	Top 71%	General Knowledge
AIME	9.30	Top 77%	Mathematics
GPQA	31.50	Top 95%	STEM (Physics, Chemistry, Biology)
MMMU	55.10	Top 88%	General Knowledge
SciCode	11.00	Top 95%	Scientific
MATH-500	69.30	Top 77%	Mathematics
MMLU-Pro	48.50	Top 93%	General Knowledge
LiveCodeBench	13.10	Top 94%	Programming
AAII	10.00	Top 93%	General

GPQA Score: 31.50 - Graduate-level reasoning benchmark

Model Comparison:

Provider: deepinfra

Model Type: completions

Context Size: 131072 tokens

Comparing against 348 models in the database
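The page does not say how the "Top N%" figures are derived. The category ranks are, however, consistent with dividing the rank by the 348 models in the comparison set, so that a larger percentage means a lower placement. The sketch below reproduces the listed category percentiles under that assumption; `top_percent` is a hypothetical helper.

```python
TOTAL_MODELS = 348  # size of the comparison set stated on this page

# Category ranks as listed on this page.
category_ranks = {
    "Vision": 307,
    "Science": 324,
    "Writing": 307,
    "Academia": 331,
    "Marketing": 272,
}


def top_percent(rank: int, total: int = TOTAL_MODELS) -> int:
    """Assumed formula: rank as a rounded percentage of all compared models."""
    return round(100 * rank / total)


for category, rank in category_ranks.items():
    print(f"{category}: Top {top_percent(rank)}% (Rank #{rank})")
```

Under this reading, "Top 88% - Rank #307" means 306 of the 348 models score higher in that category.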

Compare with Similar Models

claude-opus-4.5

claude-opus-4.6

gemini-3-flash-preview

claude-sonnet-4.5

gemini-3-pro-preview

claude-sonnet-4

Code Examples

Integration samples and API usage

Code Samples for phi-4-multimodal-instruct

Python SDK Example:

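from openai import OpenAI

# Point the OpenAI SDK at the LangDB gateway. The SDK appends
# "/chat/completions" to base_url, so the "/v1" suffix keeps this
# request on the same path as the cURL example.
client = OpenAI(
    base_url="https://api.us-east-1.langdb.ai/projects/<your_project_id>/v1",
    api_key="<your_api_key>",
)

# Send a single-turn chat request to the model.
response = client.chat.completions.create(
    model="phi-4-multimodal-instruct",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ],
)

# Print the text of the first returned choice.
print(response.choices[0].message.content)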

TypeScript SDK Example:

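import OpenAI from 'openai';

// Point the OpenAI SDK at the LangDB gateway. The SDK appends
// "/chat/completions" to baseURL, so the "/v1" suffix keeps this
// request on the same path as the cURL example.
const client = new OpenAI({
    baseURL: "https://api.us-east-1.langdb.ai/projects/<your_project_id>/v1",
    apiKey: "<your_api_key>"
});

// Send a single-turn chat request to the model.
const response = await client.chat.completions.create({
    model: "phi-4-multimodal-instruct",
    messages: [
        { role: "user", content: "Hello, how are you?" }
    ]
});

// Print the text of the first returned choice.
console.log(response.choices[0].message.content);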

cURL Example:

curl -X POST "https://api.us-east-1.langdb.ai/projects/<your_project_id>/v1/chat/completions" \
  -H "Authorization: Bearer <your_api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "phi-4-multimodal-instruct",
    "messages": [
        {"role": "user", "content": "Hello, how are you?"}
    ]
  }'
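The samples above send text only, while the model is listed as accepting text and image input. The sketch below builds a request body that pairs a prompt with an inline image, using the OpenAI chat-completions content-part format (`text` and `image_url` parts with a base64 data URL). That the gateway accepts this format for this model is an assumption, not something this page confirms; `build_image_request` is a hypothetical helper and the placeholder bytes stand in for a real image file.

```python
import base64
import json


def build_image_request(prompt: str, image_bytes: bytes,
                        mime_type: str = "image/png") -> dict:
    """Build a chat-completions payload with one text part and one image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": "phi-4-multimodal-instruct",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        # Inline the image as a data URL (assumed to be accepted).
                        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
                    },
                ],
            }
        ],
    }


# Placeholder bytes; in practice, read them from an image file on disk.
payload = build_image_request("Describe this chart.", b"<image bytes>")
print(json.dumps(payload, indent=2))
```

With the `client` from the Python SDK example, the payload could then be sent as `client.chat.completions.create(**payload)`.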

Model: phi-4-multimodal-instruct

Provider: deepinfra

API Endpoint: https://api.us-east-1.langdb.ai

Related Models

Similar models from deepinfra

deepseek-chat-v3-0324

deepseek-chat-v3.1

deepseek-prover-v2

DeepSeek-R1

deepseek-r1-0528

DeepSeek-R1-Distill-Llama-70B
