MiniMax M2.5

MiniMax M2.5 is a state-of-the-art Mixture-of-Experts (MoE) model featuring a 1M-token context window and elite agentic coding capabilities at disruptive pricing for autonomous agents.

Tags: Agentic AI · MoE Architecture · Coding Specialist · Cost Efficient
MiniMax · M-series · February 12, 2026
Context: 1.0M tokens
Max Output: 8K tokens
Input Price: $0.15 / 1M tokens
Output Price: $1.20 / 1M tokens
Modality: Text
Capabilities: Tools, Streaming, Reasoning
Benchmarks
GPQA
47%
GPQA: Graduate-Level Science Q&A. A rigorous benchmark with 448 multiple-choice questions in biology, physics, and chemistry created by domain experts. PhD experts only achieve 65-74% accuracy, while non-experts score just 34% even with unlimited web access (hence 'Google-proof'). MiniMax M2.5 scored 47% on this benchmark.
HLE
32%
HLE: Humanity's Last Exam. A frontier benchmark of expert-written questions spanning more than a hundred specialized subjects, designed to remain difficult after models saturated earlier knowledge tests. MiniMax M2.5 scored 32% on this benchmark.
MMLU
82%
MMLU: Massive Multitask Language Understanding. A comprehensive benchmark with 16,000 multiple-choice questions across 57 academic subjects including math, philosophy, law, and medicine. Tests broad knowledge and reasoning capabilities. MiniMax M2.5 scored 82% on this benchmark.
MMLU Pro
74%
MMLU Pro: MMLU Professional Edition. An enhanced version of MMLU with 12,032 questions using a harder 10-option multiple choice format. Covers Math, Physics, Chemistry, Law, Engineering, Economics, Health, Psychology, Business, Biology, Philosophy, and Computer Science. MiniMax M2.5 scored 74% on this benchmark.
SimpleQA
42%
SimpleQA: Factual Accuracy Benchmark. Tests a model's ability to provide accurate, factual responses to straightforward questions. Measures reliability and reduces hallucinations in knowledge retrieval tasks. MiniMax M2.5 scored 42% on this benchmark.
IFEval
88%
IFEval: Instruction Following Evaluation. Measures how well a model follows specific instructions and constraints. Tests the ability to adhere to formatting rules, length limits, and other explicit requirements. MiniMax M2.5 scored 88% on this benchmark.
AIME 2025
78%
AIME 2025: American Invitational Math Exam. Competition-level mathematics problems from the prestigious AIME exam designed for talented high school students. Tests advanced mathematical problem-solving requiring abstract reasoning, not just pattern matching. MiniMax M2.5 scored 78% on this benchmark.
MATH
78%
MATH: Mathematical Problem Solving. A comprehensive math benchmark testing problem-solving across algebra, geometry, calculus, and other mathematical domains. Requires multi-step reasoning and formal mathematical knowledge. MiniMax M2.5 scored 78% on this benchmark.
GSM8k
96%
GSM8k: Grade School Math 8K. 8,500 grade school-level math word problems requiring multi-step reasoning. Tests basic arithmetic and logical thinking through real-world scenarios like shopping or time calculations. MiniMax M2.5 scored 96% on this benchmark.
MGSM
94%
MGSM: Multilingual Grade School Math. The GSM8k benchmark translated into 10 languages including Spanish, French, German, Russian, Chinese, and Japanese. Tests mathematical reasoning across different languages. MiniMax M2.5 scored 94% on this benchmark.
MathVista
65%
MathVista: Mathematical Visual Reasoning. Tests the ability to solve math problems that involve visual elements like charts, graphs, geometry diagrams, and scientific figures. Combines visual understanding with mathematical reasoning. MiniMax M2.5 scored 65% on this benchmark.
SWE-Bench
80.2%
SWE-Bench: Software Engineering Benchmark. AI models attempt to resolve real GitHub issues in open-source Python projects with human verification. Tests practical software engineering skills on production codebases. Top models went from 4.4% in 2023 to over 70% in 2024. MiniMax M2.5 scored 80.2% on this benchmark.
HumanEval
92%
HumanEval: Python Programming Problems. 164 hand-written programming problems where models must generate correct Python function implementations. Each solution is verified against unit tests. Top models now achieve 90%+ accuracy. MiniMax M2.5 scored 92% on this benchmark.
LiveCodeBench
65%
LiveCodeBench: Live Coding Benchmark. Tests coding abilities on continuously updated, real-world programming challenges. Unlike static benchmarks, uses fresh problems to prevent data contamination and measure true coding skills. MiniMax M2.5 scored 65% on this benchmark.
MMMU
68%
MMMU: Multimodal Understanding. Massive Multi-discipline Multimodal Understanding benchmark testing vision-language models on college-level problems across 30 subjects requiring both image understanding and expert knowledge. MiniMax M2.5 scored 68% on this benchmark.
MMMU Pro
52%
MMMU Pro: MMMU Professional Edition. Enhanced version of MMMU with more challenging questions and stricter evaluation. Tests advanced multimodal reasoning at professional and expert levels. MiniMax M2.5 scored 52% on this benchmark.
ChartQA
85%
ChartQA: Chart Question Answering. Tests the ability to understand and reason about information presented in charts and graphs. Requires extracting data, comparing values, and performing calculations from visual data representations. MiniMax M2.5 scored 85% on this benchmark.
DocVQA
92%
DocVQA: Document Visual Q&A. Document Visual Question Answering benchmark testing the ability to extract and reason about information from document images including forms, reports, and scanned text. MiniMax M2.5 scored 92% on this benchmark.
Terminal-Bench
57%
Terminal-Bench: Terminal/CLI Tasks. Tests the ability to perform command-line operations, write shell scripts, and navigate terminal environments. Measures practical system administration and development workflow skills. MiniMax M2.5 scored 57% on this benchmark.
ARC-AGI
9%
ARC-AGI: Abstraction & Reasoning. Abstraction and Reasoning Corpus for AGI - tests fluid intelligence through novel pattern recognition puzzles. Each task requires discovering the underlying rule from examples, measuring general reasoning ability rather than memorization. MiniMax M2.5 scored 9% on this benchmark.

About MiniMax M2.5

Learn about MiniMax M2.5's capabilities, features, and how it can help you achieve better results.

Efficient Frontier Architecture

MiniMax M2.5 is a high-efficiency frontier model built on a 230B-parameter Mixture-of-Experts (MoE) architecture. By activating only 10 billion parameters per forward pass, it achieves inference speeds and per-token prices roughly 20 times lower than comparable proprietary frontier models. It is engineered specifically for agentic intelligence, prioritizing structured logic and multi-step planning over simple chat completion. This sparse design lets the model maintain high intelligence without the compute overhead of a dense model of similar capability.

Advanced Coding Intelligence

The model's standout feature is its Architect Mindset, which allows it to visualize logic structures and project hierarchies before generating code. This makes it particularly effective for autonomous software engineering, where it matches the state-of-the-art with an 80.2% score on SWE-Bench Verified. With a 1-million-token context window, it can ingest entire codebases, enabling deep repository audits and complex system refactoring that were previously cost-prohibitive.

Enterprise and Local Deployment

MiniMax M2.5 supports over 10 programming languages and sustains up to 100 tokens per second on its HighSpeed variant. Because the weights are openly released, developers can deploy the model locally for full data privacy while retaining the same logic-heavy reasoning found in the hosted API. This versatility makes it a practical choice for both cloud-based agent pipelines and on-premise development tools.
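Because the weights are open, the hosted endpoint and a self-hosted server can be addressed with identical client code; only the base URL differs. A minimal sketch (the local URL and server choice are assumptions for a typical vLLM-style OpenAI-compatible server, not official MiniMax values):

```typescript
// Both the hosted API and a local open-weight deployment expose an
// OpenAI-compatible /v1 endpoint, so the same client code works against
// either -- only the base URL changes.
function resolveBaseURL(local: boolean): string {
  return local
    ? 'http://localhost:8000/v1'   // hypothetical local server (e.g. vLLM)
    : 'https://api.minimax.io/v1'; // hosted MiniMax endpoint
}

// Usage with the OpenAI SDK shown in the API Quick Start section:
//   const client = new OpenAI({ apiKey, baseURL: resolveBaseURL(true) });
```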

MiniMax M2.5

Use Cases

Discover the different ways you can use MiniMax M2.5 to achieve great results.

Autonomous Software Engineering

Resolving real-world GitHub issues and performing multi-file debugging using agent harnesses.

Enterprise Agent Pipelines

Powering always-on background agents for research and data synthesis at low API costs.

Legacy Code Modernization

Refactoring massive outdated repositories into modern frameworks while maintaining logic standards.

Architectural Code Reviews

Analyzing project hierarchies to provide logic feedback and structural optimization suggestions.

High-Volume Document Editing

Processing large office files with high fidelity for financial and legal modeling.

Low-Latency Developer Tools

Driving IDE extensions and CLI tools that require sub-second response times for assistance.

Strengths

SOTA Coding Performance: Achieves an 80.2% score on SWE-Bench Verified, matching the performance of much more expensive models.
Extreme Cost Efficiency: Pricing is approximately 1/20th of major competitors, making large-scale agent deployments viable.
High Throughput: The HighSpeed variant delivers 100 tokens per second, roughly double the speed of comparable models.
Open-Weight Availability: Developers can run the model locally to ensure data privacy and full stack ownership.

Limitations

Lower Reasoning Depth: The sparse 10B active parameters can occasionally lag behind dense models on extremely niche reasoning tasks.
Text-Centric Focus: Lacks native vision and audio capabilities compared to multimodal models like GPT-4o.
Brand Attribution Required: Commercial use of the open-weight version requires prominent attribution to the MiniMax brand.
VRAM Requirements: Running the full model locally requires high-end hardware unless significant quantization is applied.
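The cost-efficiency claims above follow directly from the listed prices ($0.15 input / $1.20 output per 1M tokens). A quick budgeting sketch; the rates come from this page, while the workload numbers are purely illustrative:

```typescript
// Rates from this page's pricing table, in USD per 1M tokens.
const INPUT_USD_PER_M = 0.15;
const OUTPUT_USD_PER_M = 1.2;

// Estimate the cost of a workload from its token counts.
function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * INPUT_USD_PER_M +
    (outputTokens / 1_000_000) * OUTPUT_USD_PER_M
  );
}

// Illustrative agent workload: read a 200K-token repository, emit 8K tokens,
// repeated 100 times.
const perRunUSD = estimateCostUSD(200_000, 8_000); // 0.03 + 0.0096 ≈ $0.04
const totalUSD = perRunUSD * 100;                  // ≈ $3.96 for the batch
```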

API Quick Start

minimax/minimax-m2.5

View Documentation
OpenAI SDK (MiniMax-compatible endpoint)
import OpenAI from 'openai';

// MiniMax exposes an OpenAI-compatible API, so the standard OpenAI SDK works
// once it is pointed at the MiniMax base URL.
const client = new OpenAI({
  apiKey: process.env.MINIMAX_API_KEY,
  baseURL: 'https://api.minimax.io/v1',
});

async function main() {
  const response = await client.chat.completions.create({
    model: 'minimax-m2.5',
    messages: [{ role: 'user', content: 'Design a microservices architecture for a fintech app.' }],
    temperature: 0.1, // low temperature keeps architectural output deterministic
  });
  console.log(response.choices[0].message.content);
}

main();

Install the SDK and start making API calls in minutes.

Community Feedback

See what the community thinks about MiniMax M2.5

MiniMax M2.5 pricing is the real story, cheap enough to change architecture, not just budgets.
PretendAd7988
twitter
M2.5 is hitting SOTA numbers and it's a 10B active parameter model, meaning it's fast and cheap.
Low-Bread-2346
reddit
The model reduces the heavy lifting users had to do just to keep things moving.
JamMasterJulian
youtube
M2.5 is matching Claude Opus 4.6 throughput at a fraction of the cost.
Significant-Tap-7854
reddit
Running M2.5 locally on a Mac Studio is snappy. The 10B active params really make a difference.
MacCoder_X
reddit
The architectural planning step catches logic errors before it even writes a single line of code.
dev_mindset
twitter

Related Videos

Watch tutorials, reviews, and discussions about MiniMax M2.5

It's almost 20 times cheaper than the top proprietary options.

This is a top tier coding and agentic model that's much faster and drastically cheaper.

The performance on SWE-bench verified really puts it in the elite category.

You're getting frontier intelligence at open-source hardware requirements.

The MoE architecture here is tuned perfectly for low-latency coding tasks.

MiniMax is serving the model at 3% of the cost of Opus 4.6 in output tokens.

The cost of intelligence is actually nearing the cost of electricity at this point.

It handles large repo context windows without the typical mid-doc forgetting.

For developer tools, the speed of the lightning variant is a massive UX win.

It's the first time I've seen a model this cheap actually solve complex logic bugs.

It costs just $1 to run the model continuously for an hour at 100 tokens per second.

The inner thinking really shines here because it can course correct immediately.

Testing it against GPT-4o, it consistently provides better multi-file refactors.

The agentic capabilities are built-in, not just an afterthought in the prompt.

It's essentially free for small developers given the input pricing tiers.

More than just prompts

Supercharge your workflow with AI Automation

Automatio combines the power of AI agents, web automation, and smart integrations to help you accomplish more in less time.

AI Agents
Web Automation
Smart Workflows

Pro Tips

Expert tips to help you get the most out of MiniMax M2.5 and achieve better results.

Adopt the Architect Mindset

Ask the model to generate a project structure before requesting the actual implementation code.

Utilize the 1M Context

Provide complete documentation or entire modules to ensure global awareness of your codebase.

Use the HighSpeed Plan

Select the M2.5-HighSpeed endpoint to achieve a steady 100 tokens per second for interactive agents.

Iterative Refinement

Ask the model to review its initial output for logic gaps or security vulnerabilities.
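The tips above compose into a simple three-pass loop: structure first, implementation second, self-review third. A sketch of the prompt scaffolding (the wording is illustrative, not an official recipe):

```typescript
// Step 1: ask for architecture only -- no code yet.
function architectPrompt(task: string): string {
  return `Before writing any code, output the project structure and module responsibilities for: ${task}`;
}

// Step 2: feed the returned structure back and request the implementation.
function implementPrompt(structure: string): string {
  return `Implement the following structure, one file at a time:\n${structure}`;
}

// Step 3: ask the model to audit its own output.
function reviewPrompt(code: string): string {
  return `Review this code for logic gaps and security vulnerabilities, and list concrete fixes:\n${code}`;
}

// Each prompt goes to the chat.completions endpoint shown in the quick start,
// with the previous response threaded into the next step.
```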

Testimonials

What Our Users Say

Join thousands of satisfied users who have transformed their workflow

Jonathan Kogan

Jonathan Kogan

Co-Founder/CEO, rpatools.io

Automatio is one of the most used for RPA Tools both internally and externally. It saves us countless hours of work and we realized this could do the same for other startups and so we choose Automatio for most of our automation needs.

Mohammed Ibrahim

Mohammed Ibrahim

CEO, qannas.pro

I have used many tools over the past 5 years, Automatio is the Jack of All trades.. !! it could be your scraping bot in the morning and then it becomes your VA by the noon and in the evening it does your automations.. its amazing!

Ben Bressington

Ben Bressington

CTO, AiChatSolutions

Automatio is fantastic and simple to use to extract data from any website. This allowed me to replace a developer and do tasks myself as they only take a few minutes to setup and forget about it. Automatio is a game changer!

Sarah Chen

Sarah Chen

Head of Growth, ScaleUp Labs

We've tried dozens of automation tools, but Automatio stands out for its flexibility and ease of use. Our team productivity increased by 40% within the first month of adoption.

David Park

David Park

Founder, DataDriven.io

The AI-powered features in Automatio are incredible. It understands context and adapts to changes in websites automatically. No more broken scrapers!

Emily Rodriguez

Emily Rodriguez

Marketing Director, GrowthMetrics

Automatio transformed our lead generation process. What used to take our team days now happens automatically in minutes. The ROI is incredible.


Related AI Models

GLM-4.7 (Zhipu)

GLM-4.7 by Zhipu AI is a flagship 358B MoE model featuring a 200K context window, elite 73.8% SWE-bench performance, and native Deep Thinking for agentic...

200K context · $0.60/$2.20 per 1M tokens

Qwen3-Coder-Next (Alibaba)

Qwen3-Coder-Next is Alibaba Cloud's elite Apache 2.0 coding model, featuring an 80B MoE architecture and 256k context window for advanced local development.

262K context · $0.12/$0.75 per 1M tokens

GPT-4o mini (OpenAI)

OpenAI's most cost-efficient small model, GPT-4o mini offers multimodal intelligence and high-speed performance at a significantly lower price point.

128K context · $0.15/$0.60 per 1M tokens

DeepSeek-V3.2-Speciale (DeepSeek)

DeepSeek-V3.2-Speciale is a reasoning-first LLM featuring gold-medal math performance, DeepSeek Sparse Attention, and a 131K context window. Rivaling GPT-5...

131K context · $0.28/$0.42 per 1M tokens

GPT-5.4 (OpenAI)

GPT-5.4 is OpenAI's frontier model featuring a 1.05M context window and Extreme Reasoning. It excels at autonomous UI interaction and long-form data analysis.

1M context · $2.50/$15.00 per 1M tokens

Gemini 3.1 Flash-Lite (Google)

Gemini 3.1 Flash-Lite is Google's fastest, most cost-efficient model. Features 1M context, native multimodality, and 363 tokens/sec speed for scale.

1M context · $0.25/$1.50 per 1M tokens

GPT-5.3 Instant (OpenAI)

Explore GPT-5.3 Instant, OpenAI's "Anti-Cringe" model. Features a 128K context window, 26.8% fewer hallucinations, and a natural, helpful tone for everyday...

128K context · $1.75/$14.00 per 1M tokens

Gemini 3.1 Pro (Google)

Gemini 3.1 Pro is Google's elite multimodal model featuring the DeepThink reasoning engine, a 1M+ context window, and industry-leading ARC-AGI logic scores.

1M context · $2.00/$12.00 per 1M tokens

Frequently Asked Questions

Find answers to common questions about MiniMax M2.5