MiMo V2.5 Pro

MiMo V2.5 Pro is Xiaomi's open-source 1.02T parameter MoE model featuring a 1M context window, native multimodality, and elite agentic coding performance.

Tags: Open Source, Agentic AI, Multimodal, 1M Context, Xiaomi
MiMo, April 27, 2026
Context: 1.0M tokens
Max Output: 131K tokens
Input Price: $1.00 / 1M tokens
Output Price: $3.00 / 1M tokens
Modality: Text, Image, Audio, Video
Capabilities: Vision, Tools, Streaming, Reasoning
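
To make the listed prices concrete, here is a minimal sketch of a per-request cost estimate. The two prices come from this page; the helper name `estimateCostUSD` is hypothetical, and the sketch assumes billing is strictly linear in token count with no caching discounts or tiered rates.

```javascript
// Prices as listed on this page (USD per 1M tokens).
const INPUT_PRICE_PER_M = 1.0;
const OUTPUT_PRICE_PER_M = 3.0;

// Hypothetical helper: estimate the cost of one request from its token counts.
// Assumes simple linear billing; real invoices may differ.
function estimateCostUSD(inputTokens, outputTokens) {
  const inputCost = (inputTokens / 1_000_000) * INPUT_PRICE_PER_M;
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_PRICE_PER_M;
  return inputCost + outputCost;
}

// Example: a 1M-token prompt with a 131K-token response.
console.log(estimateCostUSD(1_000_000, 131_000).toFixed(3)); // "1.393"
```

Under these assumptions, output tokens cost three times as much as input tokens, so long reasoning traces can dominate the bill even when the prompt is small.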
Benchmarks
GPQA: 54%
Graduate-Level Science Q&A. A rigorous benchmark of 448 multiple-choice questions in biology, physics, and chemistry written by domain experts. PhD-level experts reach only 65-74% accuracy, while non-experts score just 34% even with unlimited web access (hence "Google-proof").

HLE: 48%
Humanity's Last Exam. A benchmark of expert-written questions that tests expert-level reasoning across specialized domains. Evaluates deep understanding of complex topics that require professional-level knowledge.

MMLU: 86.7%
Massive Multitask Language Understanding. A comprehensive benchmark with roughly 16,000 multiple-choice questions across 57 academic subjects including math, philosophy, law, and medicine. Tests broad knowledge and reasoning capabilities.

MMLU Pro: 84.9%
MMLU Professional Edition. An enhanced version of MMLU with 12,032 questions using a harder 10-option multiple-choice format. Covers Math, Physics, Chemistry, Law, Engineering, Economics, Health, Psychology, Business, Biology, Philosophy, and Computer Science.

SimpleQA: 45%
Factual Accuracy Benchmark. Tests a model's ability to give accurate, factual answers to straightforward questions. Measures reliability and the tendency to hallucinate in knowledge retrieval tasks.

IFEval: 88%
Instruction Following Evaluation. Measures how well a model follows specific instructions and constraints. Tests adherence to formatting rules, length limits, and other explicit requirements.

AIME 2025: 41%
American Invitational Mathematics Examination. Competition-level mathematics problems from the AIME, an exam designed for talented high school students. Tests advanced mathematical problem-solving that requires abstract reasoning, not just pattern matching.

MATH: 75%
Mathematical Problem Solving. A comprehensive math benchmark covering algebra, geometry, calculus, and other mathematical domains. Requires multi-step reasoning and formal mathematical knowledge.

GSM8k: 95.5%
Grade School Math 8K. 8,500 grade-school-level math word problems requiring multi-step reasoning. Tests basic arithmetic and logical thinking through real-world scenarios like shopping or time calculations.

MGSM: 92%
Multilingual Grade School Math. Problems from GSM8k translated into 10 languages including Spanish, French, German, Russian, Chinese, and Japanese. Tests mathematical reasoning across different languages.

MathVista: 65%
Mathematical Visual Reasoning. Tests the ability to solve math problems that involve visual elements like charts, graphs, geometry diagrams, and scientific figures. Combines visual understanding with mathematical reasoning.

SWE-Bench: 78.9%
Software Engineering Benchmark. AI models attempt to resolve real GitHub issues in open-source Python projects with human verification. Tests practical software engineering skills on production codebases. Top models went from 4.4% in 2023 to over 70% in 2024.

HumanEval: 90%
Python Programming Problems. 164 hand-written programming problems where models must generate correct Python function implementations. Each solution is verified against unit tests. Top models now achieve 90%+ accuracy.

LiveCodeBench: 80.6%
Live Coding Benchmark. Tests coding ability on continuously updated, real-world programming challenges. Unlike static benchmarks, it uses fresh problems to limit data contamination.

MMMU: 73%
Massive Multi-discipline Multimodal Understanding. Tests vision-language models on college-level problems across 30 subjects requiring both image understanding and expert knowledge.

MMMU Pro: 52%
MMMU Professional Edition. An enhanced version of MMMU with more challenging questions and stricter evaluation. Tests advanced multimodal reasoning at professional and expert levels.

ChartQA: 89%
Chart Question Answering. Tests the ability to understand and reason about information presented in charts and graphs. Requires extracting data, comparing values, and performing calculations from visual data representations.

DocVQA: 93.5%
Document Visual Question Answering. Tests the ability to extract and reason about information from document images including forms, reports, and scanned text.

Terminal-Bench: 68.4%
Terminal/CLI Tasks. Tests the ability to perform command-line operations, write shell scripts, and navigate terminal environments. Measures practical system administration and development workflow skills.

ARC-AGI: 8%
Abstraction and Reasoning Corpus for AGI. Tests fluid intelligence through novel pattern-recognition puzzles. Each task requires discovering the underlying rule from examples, measuring general reasoning ability rather than memorization.

About MiMo V2.5 Pro

Learn about MiMo V2.5 Pro's capabilities, features, and how it can help you achieve better results.

MiMo V2.5 Pro is Xiaomi's flagship open-source model. It uses a 1.02 trillion parameter Mixture-of-Experts architecture where 42 billion parameters are active during inference. The hybrid-attention design mixes Local Sliding Window Attention and Global Attention at a 6:1 ratio. This specific configuration reduces KV-cache storage requirements by nearly 7x compared to standard transformer models.
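
The "nearly 7x" figure can be sanity-checked with simple arithmetic. The sketch below is an illustration under stated assumptions, not Xiaomi's published accounting: it assumes every layer has the same KV head count and dimension, that layers repeat in groups of six local layers plus one global layer, and that each local layer caches at most `localWindow` positions. The window size of 128 is a hypothetical value chosen for the example; this page does not state the real one.

```javascript
// Ratio of full-attention KV-cache size to hybrid-attention KV-cache size,
// counted in cached token positions per group of layers.
// localWindow is a hypothetical sliding-window size (not given on this page).
function kvCacheReduction(contextTokens, localWindow, localPerGlobal = 6) {
  const layersPerGroup = localPerGlobal + 1;
  // Full attention: every layer caches keys/values for the whole context.
  const fullAttention = layersPerGroup * contextTokens;
  // Hybrid: each local layer caches at most `localWindow` positions,
  // while the single global layer still caches the whole context.
  const hybrid =
    localPerGlobal * Math.min(localWindow, contextTokens) + contextTokens;
  return fullAttention / hybrid;
}

console.log(kvCacheReduction(1_000_000, 128).toFixed(2)); // "6.99"
console.log(kvCacheReduction(8_192, 128).toFixed(2));     // "6.40"
```

Under these assumptions the reduction approaches 7x only when the context is much longer than the local window, which is consistent with the "nearly 7x" wording for long-context use.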

The model handles a 1-million-token context window while supporting native omnimodal inputs including text, image, audio, and video. It is optimized for long-horizon agentic tasks and autonomous tool use. Developers can run the model locally using FP8 precision weights, which balance memory usage with output throughput. The permissive MIT license allows for modification and commercial deployment without additional fees.
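
For a rough sense of what local FP8 deployment involves, the sketch below estimates weight memory from the parameter counts given above. It assumes exactly one byte per parameter for FP8 and two bytes for a 16-bit format, and it ignores KV cache, activations, and any tensors kept at higher precision. The 80 GB device size is a hypothetical example, and the helper names are illustrative.

```javascript
const TOTAL_PARAMS = 1.02e12; // total parameters, from this page
const ACTIVE_PARAMS = 42e9;   // parameters active per token, from this page

// Weight memory in GB (10^9 bytes) for a given storage width.
function weightMemoryGB(paramCount, bytesPerParam) {
  return (paramCount * bytesPerParam) / 1e9;
}

// Lower bound on device count needed just to hold the weights.
function minGpusForWeights(paramCount, bytesPerParam, gpuMemoryGB) {
  return Math.ceil(weightMemoryGB(paramCount, bytesPerParam) / gpuMemoryGB);
}

console.log(weightMemoryGB(TOTAL_PARAMS, 1));        // 1020 (FP8, 1 byte per parameter)
console.log(weightMemoryGB(TOTAL_PARAMS, 2));        // 2040 (16-bit, 2 bytes per parameter)
console.log(minGpusForWeights(TOTAL_PARAMS, 1, 80)); // 13 (hypothetical 80 GB devices)
console.log(weightMemoryGB(ACTIVE_PARAMS, 1));       // 42 (weights read per token)
```

Although only 42 billion parameters are active per token, a Mixture-of-Experts model generally needs all experts resident in memory, so the total parameter count, not the active count, drives the hardware requirement.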

Use Cases

Discover the different ways you can use MiMo V2.5 Pro to achieve great results.

Autonomous Software Engineering

Resolving GitHub issues and building system components like compilers with self-correcting logic.

Long-Horizon Agent Workflows

Executing plans requiring coherence across more than 1,000 tool calls in software environments.

Native Multimodal Analysis

Directly reasoning across combined inputs of video and text without external preprocessing or frame extraction.

Large-Scale Codebase Navigation

Ingesting entire project repositories within the 1M token context window to refactor logic or find bugs.

Analog Circuit Design

Optimizing complex circuits by interacting with simulation loops to meet multi-metric specifications.

3D Web Generation

Creating sophisticated environments and physics simulations using Three.js and procedural terrain generation.
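
For the codebase navigation use case above, it helps to check whether a repository is likely to fit in the context window before sending it. The following is a minimal sketch: the four-characters-per-token ratio is a common rough heuristic and not the model's actual tokenizer, the function names are hypothetical, and reserving the maximum output length out of the 1M window is an assumption about how the limit is counted.

```javascript
const CONTEXT_LIMIT_TOKENS = 1_000_000; // context window listed on this page
const CHARS_PER_TOKEN = 4;              // assumed heuristic, not the real tokenizer

function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// files: array of { path, text }, in priority order.
// reservedForOutput leaves room for the reply (assumed to share the window).
function planRepoPrompt(files, reservedForOutput = 131_000) {
  const budget = CONTEXT_LIMIT_TOKENS - reservedForOutput;
  let used = 0;
  const included = [];
  const skipped = [];
  for (const file of files) {
    const cost = estimateTokens(file.text);
    if (used + cost <= budget) {
      included.push(file.path);
      used += cost;
    } else {
      skipped.push(file.path); // too large for the remaining budget
    }
  }
  return { included, skipped, estimatedTokens: used };
}

const plan = planRepoPrompt([
  { path: "src/main.js", text: "x".repeat(400_000) },
  { path: "src/big.js", text: "x".repeat(4_000_000) },
  { path: "README.md", text: "x".repeat(2_000) },
]);
console.log(plan);
// { included: ["src/main.js", "README.md"], skipped: ["src/big.js"], estimatedTokens: 100500 }
```

Files that are skipped would need to be summarized or sent in a separate request; for exact counts, the provider's own token-counting facility should be used if one is available.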

Strengths

Low Token Consumption: Delivers intelligence matching frontier models while using 40% to 60% fewer tokens per task trajectory.
Long-Horizon Coherence: Maintains reasoning accuracy across context windows of 1 million tokens and sequences of over 1,000 tool calls.
Software Engineering Performance: Reaches a 78.9% score on SWE-bench Verified, indicating high proficiency in resolving GitHub-level code issues.
Permissive MIT Licensing: Allows for commercial integration and weight modification without the restrictive terms found in other open-source licenses.

Limitations

Reasoning Latency: The deep thinking mode can result in delays of several minutes before the model begins generating text.
Complex Platform Access: The official web portal has an unstable sign-in process that users frequently describe as difficult to navigate.
Safety Refusal Patterns: Occasional refusals can occur at the very end of long thinking cycles, which consumes compute time without providing output.
Significant Hardware Requirements: Hosting the 1.02T parameter model locally requires multi-GPU clusters, making self-hosting expensive for small teams.

API Quick Start

xiaomi/mimo-v2.5-pro

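// Requires the "openai" npm package; run as an ES module so top-level await works.
import OpenAI from "openai";

// The standard OpenAI client is pointed at the MiMo endpoint via baseURL.
const client = new OpenAI({
  baseURL: "https://api.xiaomimimo.com/v1",
  apiKey: process.env.MIMO_API_KEY
});

const completion = await client.chat.completions.create({
  model: "mimo-v2.5-pro",
  messages: [{ role: "user", content: "Identify logic errors in this 50,000-line codebase." }],
  // Provider-specific field (not part of the standard OpenAI schema) that enables thinking mode.
  thinking: { type: "enabled" }
});

// Print the final answer text from the first choice.
console.log(completion.choices[0].message.content);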

Install the SDK and start making API calls in minutes.
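
Since Streaming is listed among the model's capabilities, a streamed variant of the quick start may be useful. The sketch below assumes the MiMo endpoint follows the standard OpenAI-compatible streaming format, where each chunk carries `choices[0].delta.content`; that compatibility is an assumption, not something this page confirms. The helper `streamCompletion` is hypothetical and takes the client as a parameter, so it can be exercised with a stub.

```javascript
// Collects streamed text from an OpenAI-compatible client.
// `client` is expected to be an OpenAI SDK instance configured as in the quick start.
async function streamCompletion(client, prompt, onToken = () => {}) {
  const stream = await client.chat.completions.create({
    model: "mimo-v2.5-pro",
    messages: [{ role: "user", content: prompt }],
    stream: true, // ask for incremental chunks instead of one final message
  });

  let text = "";
  for await (const chunk of stream) {
    // Some chunks carry no text (role markers, empty choices), so default to "".
    const delta = chunk.choices[0]?.delta?.content ?? "";
    if (delta) {
      text += delta;
      onToken(delta);
    }
  }
  return text;
}

// Usage with the client from the quick start:
//   const answer = await streamCompletion(client, "Summarize this diff.", (t) =>
//     process.stdout.write(t)
//   );
```

Streaming does not shorten the time before the first token when thinking mode is enabled, but it lets a caller show output as soon as generation begins.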

Community Feedback

See what the community thinks about MiMo V2.5 Pro

"The speed-to-context ratio on MiMo-V2.5-Pro is unbeatable for RAG pipelines that need to scan entire codebases in one go." (u/DevBuilder, Reddit)
"China just matched USA frontier coding AI at 40-60% lower token cost. This isn't incremental; it's rewriting the game." (Shruti, Twitter)
"MiMo-V2.5-Pro solved problems that would take human experts weeks. It built a complete compiler in just over 4 hours." (TechCrunchy, Twitter)
"The model's value isn't just in benchmarks, but in its ability to sustain complex agent workflows without breaking." (XiaomiMiMo Team, Hacker News)
"The speed is actually decent for a 1T model. The MoE routing is doing a lot of heavy lifting here." (AIExplorer, Reddit)
"Finally an MIT licensed model that actually competes with the closed giants. Local deployment is the next hurdle." (OpenSourceFan, Twitter)

Related Videos

Watch tutorials, reviews, and discussions about MiMo V2.5 Pro

I've never seen that level of detail in a result... look at the individual wood panel floors.

The model is highly confident and effective when you feed it specific technical error messages.

It handles the entire codebase context without the usual middle-of-the-document loss.

The thinking process is transparent, showing exactly how it evaluates various tool options.

This model outperforms its predecessors in strict instruction following for JSON outputs.

It's designed to handle complex multi-step workflows, sustaining thousands of tool calls.

It is using 40 to 60% fewer tokens than models like GPT-5.4 or Claude Opus 4.6 at similar performance.

Xiaomi just shocked the open-source AI space with this release.

The native multimodality means it doesn't need a separate vision encoder for video.

You can effectively build a whole OS component by providing the right environment hooks.

Mimo came out to undercut everyone... the first month of the coding plan is only six dollars.

Benchmarks only tell part of the story; I want them to be actual builders and put the roof on properly.

It is much more stable than the earlier V2 release when handling long reasoning chains.

The pricing on their native API is aggressive, likely to capture the developer market.

It struggles slightly with very high-frequency audio but handles conversational speech perfectly.

Pro Tips

Expert tips to help you get the most out of MiMo V2.5 Pro and achieve better results.

Manage Chain-of-Thought Latency

Add 'don't overthink' to your prompt to reduce reasoning latency for simple technical queries.

Preserve Reasoning Content

Pass back the previous reasoning_content in multi-turn conversations to maintain agentic performance.

Define Environment Affordances

Specify tool environment capabilities clearly as the model is optimized for harness awareness.

Optimize Local Deployment

Use FP8 mixed precision weights to balance memory efficiency with high output throughput.
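
The "Preserve Reasoning Content" tip above can be sketched as a small history helper. This is an illustration under assumptions: it presumes the response message exposes a `reasoning_content` field next to `content` and that the endpoint accepts that field on assistant messages sent back in later turns. The field name comes from the tip; the exact message shape and the helper name `appendAssistantTurn` are hypothetical.

```javascript
// Append an assistant reply to the conversation history, keeping
// reasoning_content and tool_calls when the API returned them.
function appendAssistantTurn(history, assistantMessage) {
  const turn = { role: "assistant", content: assistantMessage.content ?? "" };
  if (assistantMessage.reasoning_content) {
    // Assumed field: carried forward so the next turn can build on prior reasoning.
    turn.reasoning_content = assistantMessage.reasoning_content;
  }
  if (assistantMessage.tool_calls) {
    turn.tool_calls = assistantMessage.tool_calls;
  }
  return [...history, turn]; // return a new array; the input history is not mutated
}

// Example with a mocked response message (no network call).
let history = [{ role: "user", content: "Why does the build fail?" }];
history = appendAssistantTurn(history, {
  content: "The linker cannot find libfoo.",
  reasoning_content: "The error mentions an undefined symbol from libfoo...",
});
history.push({ role: "user", content: "How do I fix it?" });
console.log(history.length); // 3
```

The resulting `history` array would be passed as `messages` in the next request; whether older reasoning should eventually be trimmed to save tokens is a separate trade-off this sketch does not address.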

Related AI Models

DeepSeek-V3.2-Speciale (DeepSeek)
DeepSeek-V3.2-Speciale is a reasoning-first LLM featuring gold-medal math performance, DeepSeek Sparse Attention, and a 131K context window. Rivaling GPT-5...
131K context, $0.28/$0.42 per 1M tokens

MiniMax M2.5 (MiniMax)
MiniMax M2.5 is a SOTA MoE model featuring a 1M context window and elite agentic coding capabilities at disruptive pricing for autonomous agents.
1M context, $0.15/$1.20 per 1M tokens

GLM-4.7 (Zhipu)
GLM-4.7 by Zhipu AI is a flagship 358B MoE model featuring a 200K context window, elite 73.8% SWE-bench performance, and native Deep Thinking for agentic...
200K context, $0.60/$2.20 per 1M tokens

Qwen3-Coder-Next (Alibaba)
Qwen3-Coder-Next is Alibaba Cloud's elite Apache 2.0 coding model, featuring an 80B MoE architecture and 256k context window for advanced local development.
262K context, $0.12/$0.75 per 1M tokens

GPT-4o mini (OpenAI)
OpenAI's most cost-efficient small model, GPT-4o mini offers multimodal intelligence and high-speed performance at a significantly lower price point.
128K context, $0.15/$0.60 per 1M tokens

Qwen 3.7 Max (Alibaba)
Qwen 3.7 Max is Alibaba's flagship AI model for deep reasoning and autonomous agent tasks, featuring a 256k context window and top-tier coding performance.
256K context, $1.20/$6.00 per 1M tokens

Qwen3.5-Omni (Alibaba)
Qwen3.5-Omni is a natively omnimodal AI by Alibaba Cloud, offering seamless audio-visual reasoning, real-time voice chat, and 256k context for low-latency apps.
256K context, $0.40/$4.80 per 1M tokens

DeepSeek v4 (DeepSeek)
DeepSeek v4 is a 1.6T parameter MoE model featuring a 1M token context window and native multimodal support for text, vision, and video at disruptive prices.
1M context, $1.74/$3.48 per 1M tokens

Frequently Asked Questions

Find answers to common questions about MiMo V2.5 Pro