What is the pricing for Kimi k2.6?

Kimi k2.6 costs $0.95 per 1 million input tokens and $4.00 per 1 million output tokens. For cached input, the price drops to $0.16 per million tokens.
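
As a rough worked example of these rates, the sketch below estimates the cost of a single request. The function name `estimateCostUsd` and the `Usage` shape are hypothetical, and it assumes cached and uncached input tokens are billed separately at the rates quoted above.

```typescript
// Rates quoted above, in USD per 1 million tokens.
const RATES_PER_MILLION = { input: 0.95, cachedInput: 0.16, output: 4.0 };

// Hypothetical usage shape; real API responses may name these fields differently.
interface Usage {
  inputTokens: number;       // uncached input tokens
  cachedInputTokens: number; // input tokens served from cache
  outputTokens: number;
}

function estimateCostUsd(usage: Usage): number {
  return (
    (usage.inputTokens * RATES_PER_MILLION.input +
      usage.cachedInputTokens * RATES_PER_MILLION.cachedInput +
      usage.outputTokens * RATES_PER_MILLION.output) /
    1000000
  );
}

// 200,000 uncached input tokens plus 10,000 output tokens:
// $0.19 for input + $0.04 for output = about $0.23.
const cost = estimateCostUsd({
  inputTokens: 200000,
  cachedInputTokens: 0,
  outputTokens: 10000,
});
console.log(cost.toFixed(2));
```

Output tokens cost roughly four times as much as input tokens at these rates, so long generated responses tend to dominate the bill.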

How do I access the Kimi k2.6 API?

Access the API through the Moonshot AI platform at platform.kimi.ai using an OpenAI-compatible SDK. The base URL is https://api.moonshot.ai/v1.

Does Kimi k2.6 support video input?

Yes, it supports native video input in formats like MP4, MOV, and WEBM for scene descriptions and motion analysis.

What is the context window size?

The model supports a 256,000-token context window, roughly equivalent to a 300-page book.

What is a Thinking model?

Thinking mode allows the model to generate internal chain-of-thought reasoning before answering, which improves performance on hard logic tasks.

Is Kimi k2.6 open source?

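Kimi k2.6 is an open-weights model rather than fully open source: the weights are available for download on platforms like Hugging Face for local hosting, under a modified license that requires specific compliance for large-scale enterprise deployments.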

What are Agent Swarms?

Agent Swarms allow the model to spin up as many as 300 parallel sub-agents to handle large tasks that span 100 or more files at once.

What are the hardware requirements for local hosting?

Running the full 1T-parameter model locally requires approximately 600GB of VRAM, though quantized versions can run on smaller setups.
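
For a rough sense of where such figures come from, weight memory is approximately the parameter count times the bytes per weight. The sketch below applies that arithmetic to a 1T-parameter model at a few bit widths. It ignores the KV cache, activations, and runtime overhead, and the bit widths shown are illustrative assumptions, not published details of the released checkpoints.

```typescript
// Weights-only memory estimate: parameters x bits per weight / 8 = bytes.
// Assumption: KV cache, activations, and framework overhead come on top of this.
function weightMemoryGB(params: number, bitsPerWeight: number): number {
  const bytes = (params * bitsPerWeight) / 8;
  return bytes / 1e9; // decimal gigabytes
}

const TOTAL_PARAMS = 1e12; // 1T parameters, as stated above

for (const bits of [16, 8, 4]) {
  console.log(`${bits}-bit weights: ~${weightMemoryGB(TOTAL_PARAMS, bits)} GB`);
}
// 16-bit: ~2000 GB, 8-bit: ~1000 GB, 4-bit: ~500 GB
```

By this arithmetic, a figure near 600GB corresponds to somewhere around 4 to 5 bits per weight once overhead is included, though the source does not say which precision the 600GB estimate assumes.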

Kimi k2.6

Kimi k2.6 is Moonshot AI's 1T-parameter MoE model featuring a 256K context window, native video input, and elite performance in autonomous agentic coding.

Reasoning, Multimodal, Coding Agent, Open Weights, MoE

moonshot | Kimi | April 20, 2026

Context: 256K tokens
Max Output: 33K tokens
Input Price: $0.95 / 1M tokens
Output Price: $4.00 / 1M tokens
Modality: Text, Image, Video
Capabilities: Vision, Tools, Streaming, Reasoning

Benchmarks

GPQA: 90.5%
HLE: 54%
MMLU: 86.4%
MMLU Pro: 84.6%
SimpleQA: 43%
IFEval: 89.8%
AIME 2025: 97.3%
MATH: 98.2%
GSM8k: 97.3%
MGSM: 91.5%
MathVista: 67.1%
SWE-Bench: 80.2%
HumanEval: 92%
LiveCodeBench: 83.1%
MMMU: 77.3%
MMMU Pro: 75.6%
ChartQA: 87.4%
DocVQA: 94.9%
Terminal-Bench: 60.2%
ARC-AGI: 68.8%

About Kimi k2.6

Learn about Kimi k2.6's capabilities, features, and how it can help you achieve better results.

Architectural Design and Scale

Kimi k2.6 is a frontier multimodal Mixture-of-Experts (MoE) model featuring a trillion-parameter scale. It uses 32 billion active parameters per token, balancing computational efficiency with high-level cognitive performance. The architecture supports internal chain-of-thought reasoning, where the model generates hidden reasoning steps before outputting a final response. This design allows it to tackle complex, multi-step problems that typically stall standard large language models.

Agentic Intelligence and Coordination

The model is specifically optimized for autonomous software engineering and long-horizon tasks. It can manage Agent Swarms of up to 300 parallel sub-agents, which coordinate to refactor large codebases or manage complex DevOps pipelines. By using native tool calling and visual understanding, Kimi k2.6 operates as an autonomous agent capable of resolving multi-file GitHub issues and creating motion-rich web interfaces from visual references.
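
The source does not describe how swarm coordination is implemented. Purely as an illustration of the fan-out idea, the sketch below splits a list of files into work units for at most 300 sub-agents using round-robin assignment. The helper `partitionWork` and the round-robin strategy are hypothetical and are not Moonshot's scheduler.

```typescript
const MAX_SUB_AGENTS = 300; // upper bound quoted above

// Round-robin assignment of files to at most `maxAgents` work units.
// Hypothetical helper: real orchestration would also handle dependencies
// between files and merging of the sub-agents' results.
function partitionWork(
  files: string[],
  maxAgents: number = MAX_SUB_AGENTS
): string[][] {
  const agentCount = Math.min(maxAgents, files.length);
  const units: string[][] = Array.from({ length: agentCount }, () => []);
  files.forEach((file, i) => units[i % agentCount].push(file));
  return units;
}

// 700 files across 300 units: the first 100 units get 3 files, the rest get 2.
const files = Array.from({ length: 700 }, (_, i) => `src/module_${i}.ts`);
const units = partitionWork(files);
console.log(units.length, units[0].length, units[299].length);
```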

Multimodal Capabilities

Native support for video and image inputs distinguishes Kimi k2.6 from many open-weight peers. It processes video files directly to perform scene analysis, bug reproduction, and structured data extraction. The model serves as a visual architect, generating 3D shaders and complex animations using libraries like Three.js and GSAP based on visual descriptions or uploaded mockups.

Use Cases

Discover the different ways you can use Kimi k2.6 to achieve great results.

Autonomous Software Engineering

Resolving complex GitHub issues by coordinating up to 300 parallel sub-agents over 12-hour sessions.

Motion-Rich Frontend Generation

Creating modern web interfaces with WebGL shaders and GSAP animations from a single text or image prompt.

Deep Video Analysis

Analyzing recordings to perform visual bug reproduction, scene description, or structured data extraction.

Agentic Market Research

Executing multi-step web searches and tool calls to synthesize competitive analysis reports from hundreds of sources.

Legacy Code Optimization

Identifying performance bottlenecks in older codebases by analyzing CPU flame graphs and allocation data.

Scientific Problem Solving

Answering graduate-level science and math questions using Python-assisted reasoning and tool verification.

Strengths

Superior Agentic Coding: Achieves an 80.2% score on SWE-Bench Verified, placing it among the most capable models for autonomous engineering.

Massive Coordination Scale: Manages 300 parallel sub-agents, allowing it to handle enterprise-level refactoring tasks in a single pass.

Native Multimodal Versatility: Supports native video and image inputs, enabling advanced visual-language agent workflows for UI/UX tasks.

Aggressive Pricing Advantage: At $0.95 per million input tokens, it is significantly cheaper than proprietary competitors like Claude 3.7 or GPT-4o.

Limitations

High Local VRAM Requirements: Running the full model locally requires 600GB of VRAM, limiting self-hosting to specialized high-end workstations.

Regional API Latency: Infrastructure is optimized for Asia, which can lead to higher response times for users in Western regions.

Recall Gaps in Long Context: The model can struggle with perfect recall at the extreme edges of its 256,000-token buffer.

Restricted Commercial License: The open-weights release uses a modified license requiring specific compliance for large-scale enterprise deployments.

API Quick Start

moonshotai/kimi-k2.6

moonshot SDK

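import OpenAI from "openai";

// OpenAI-compatible client pointed at the Moonshot endpoint.
const client = new OpenAI({
  apiKey: process.env.MOONSHOT_API_KEY,
  baseURL: "https://api.moonshot.ai/v1",
});

async function main() {
  const completion = await client.chat.completions.create({
    model: "kimi-k2.6",
    messages: [
      { role: "system", content: "You are a coding expert." },
      { role: "user", content: "Optimize this Rust function for throughput." },
    ],
    // Non-standard field. The Node SDK sends unknown body fields as-is, so it
    // goes directly in the request (`extra_body` is the Python SDK's mechanism).
    // In TypeScript, add `// @ts-expect-error` on the line above it.
    thinking: { type: "enabled" },
  });

  // Print the final answer text.
  console.log(completion.choices[0].message.content);
}

// Surface network and API errors instead of leaving the rejection unhandled.
main().catch(console.error);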

Install the SDK and start making API calls in minutes.

Community Feedback

See what the community thinks about Kimi k2.6

“Meet Kimi K2.6: Advancing Open-Source Coding. One prompt, 100+ files. 4,000+ tool calls over 12 hours of continuous execution.”

— @Kimi_Moonshot

twitter

“Kimi 2.6 BEATS Opus 4.7 And Is The BEST Open Source Model In The World. It's a very good model at 10x less cost.”

— @bindureddy

twitter

“The pricing delta is the part nobody is pricing in. Kimi K2.6 is 5x cheaper than Sonnet 4.6. The benchmark gap has officially inverted.”

— @aakashgupta

twitter

“I tried it against a bug I had. It resolved it successfully for a little over $1. It was a difficult bug that Sonnet struggled with.”

— @uworldhits1391

youtube

“Kimi K2.6 is transformative, though it has room for recall improvements in ultra-long tasks. Still, 300 parallel agents is insane.”

— @Radiant-Act4707

“The Kimi K2 series marks the moment where open-source frontier labs are finally rivaling and surpassing closed-source giants.”

— @zxytim

twitter

Pro Tips

Expert tips to help you get the most out of Kimi k2.6 and achieve better results.

Enable Tool Use for Reasoning

Benchmarks show the HLE score jumps from 23.9 to 54.0 when the model is allowed external search and computation tools.
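
One way to give the model a computation tool is the OpenAI-style `tools` array on a chat completion request. The sketch below only builds the request body. The tool name `run_python`, its schema, and the prompt are hypothetical, and whether the endpoint follows the OpenAI tool-calling format exactly should be checked against Moonshot's documentation.

```typescript
// OpenAI-style function tool definition (hypothetical tool, for illustration).
const pythonTool = {
  type: "function" as const,
  function: {
    name: "run_python",
    description: "Execute a short Python snippet and return its stdout.",
    parameters: {
      type: "object",
      properties: {
        code: { type: "string", description: "Python source to run" },
      },
      required: ["code"],
    },
  },
};

// Body to pass to client.chat.completions.create(...).
const requestBody = {
  model: "kimi-k2.6",
  messages: [
    { role: "user" as const, content: "What is the 50th Fibonacci number?" },
  ],
  tools: [pythonTool],
};

console.log(requestBody.tools[0].function.name);
```

The model only requests a tool call. The calling application is responsible for running the tool and returning its result in a follow-up message.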

Monitor Context Buffer Edges

Recall is most accurate in the first 200,000 tokens of the 256,000-token buffer.
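
A simple guard can flag prompts that spill into the less reliable tail of the window. The sketch below is a minimal version under stated assumptions: the 4-characters-per-token estimate is a crude heuristic rather than the model's tokenizer, and `checkPlacement` is a hypothetical helper.

```typescript
const CONTEXT_WINDOW = 256000;  // tokens, as stated in this document
const RELIABLE_RECALL = 200000; // tokens, per the tip above

// Crude estimate: about 4 characters per token.
// Assumption only; use a real tokenizer for anything billing- or limit-critical.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

type Placement = "ok" | "edge" | "overflow";

function checkPlacement(promptTokens: number): Placement {
  if (promptTokens > CONTEXT_WINDOW) return "overflow"; // will not fit at all
  if (promptTokens > RELIABLE_RECALL) return "edge";    // fits, but recall may degrade
  return "ok";
}

// 400,000 characters is roughly 100,000 tokens under this heuristic.
console.log(checkPlacement(estimateTokens("x".repeat(400000)))); // "ok"
console.log(checkPlacement(230000)); // "edge"
```

When a prompt lands in the "edge" range, placing the most important material earlier in the prompt is one way to act on the tip.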

Use Thinking Mode Sparingly

Disable the thinking parameter for simple chat tasks to reduce latency and total token consumption.
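
A request builder can switch thinking on only for hard tasks. In the sketch below, the `thinking: { type: "enabled" }` field comes from the quick-start example on this page, while the `"disabled"` value and the `buildRequest` helper are assumptions to verify against the API documentation.

```typescript
// "enabled" appears in the quick-start example; "disabled" is an assumed value.
type ThinkingMode = "enabled" | "disabled";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper that builds the body for chat.completions.create(...).
function buildRequest(messages: ChatMessage[], hardTask: boolean) {
  const mode: ThinkingMode = hardTask ? "enabled" : "disabled";
  return { model: "kimi-k2.6", messages, thinking: { type: mode } };
}

const simple = buildRequest([{ role: "user", content: "Say hello." }], false);
const hard = buildRequest(
  [{ role: "user", content: "Prove that the square root of 2 is irrational." }],
  true
);
console.log(simple.thinking.type, hard.thinking.type);
```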

Standardize with XML Tags

The model follows instructions more accurately when context and tasks are wrapped in XML tags.
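
A small helper keeps the tagging consistent across prompts. The sketch below is illustrative: the tag names `context` and `task` and the helper `wrapInTags` are hypothetical, since the source does not prescribe specific tags.

```typescript
// Wrap each named section in a matching XML-style tag, in insertion order.
function wrapInTags(sections: Record<string, string>): string {
  return Object.keys(sections)
    .map((tag) => `<${tag}>\n${sections[tag].trim()}\n</${tag}>`)
    .join("\n\n");
}

const prompt = wrapInTags({
  context: "The service handles 5k requests/s and p99 latency is 800 ms.",
  task: "List the three most likely bottlenecks.",
});
console.log(prompt);
```

The resulting string can be sent as the content of a single user message.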

Leverage Native Video Uploads

Use file upload methods rather than base64 encoding for videos over 100MB to avoid request size limits.
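
Base64 encodes every 3 bytes as 4 characters, so an inline payload is about a third larger than the original file. The sketch below picks a transport from the file size. It treats 100MB as 100 x 1024 x 1024 bytes, which is an assumption, and `chooseTransport` is a hypothetical helper that does not call any upload API.

```typescript
// 100MB threshold from the tip above (binary megabytes assumed).
const INLINE_LIMIT_BYTES = 100 * 1024 * 1024;

// Length of the base64 encoding of `rawBytes` bytes, including padding.
function base64Length(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

type Transport = "inline-base64" | "file-upload";

function chooseTransport(videoBytes: number): Transport {
  return videoBytes > INLINE_LIMIT_BYTES ? "file-upload" : "inline-base64";
}

const clipBytes = 150 * 1024 * 1024; // a 150MB recording
console.log(chooseTransport(clipBytes)); // "file-upload"
console.log(base64Length(3000));         // 4000
```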

Related AI Models

Gemini 3 Flash

Google

Gemini 3 Flash is Google's high-speed multimodal model featuring a 1M token context window, elite 90.4% GPQA reasoning, and autonomous browser automation tools.

1M context

$0.50/$3.00/1M

DeepSeek v4

DeepSeek

DeepSeek v4 is a 1.6T parameter MoE model featuring a 1M token context window and native multimodal support for text, vision, and video at disruptive prices.

1M context

$1.74/$3.48/1M

Claude Sonnet 4.6

Anthropic

Claude Sonnet 4.6 offers frontier performance for coding and computer use with a massive 1M token context window for only $3/1M tokens.

1M context

$3.00/$15.00/1M

Claude Opus 4.6

Anthropic

Claude Opus 4.6 is Anthropic's flagship model featuring a 1M token context window, Adaptive Thinking, and world-class coding and reasoning performance.

1M context

$5.00/$25.00/1M

Gemini 3 Pro

Google

Google's Gemini 3 Pro is a multimodal powerhouse featuring a 1M token context window, native video processing, and industry-leading reasoning performance.

1M context

$2.00/$12.00/1M

Qwen 3.7 Max

Alibaba

Qwen 3.7 Max is Alibaba’s flagship AI model for deep reasoning and autonomous agent tasks, featuring a 256k context window and top-tier coding performance.

256K context

$1.20/$6.00/1M

Claude Fable 5

Anthropic

Anthropic's Claude Fable 5 is a Mythos-class model featuring a 1M context window and 128K output tokens. It excels at agentic coding and 3D physics.

1M context

$10.00/$50.00/1M

Qwen3.5-397B-A17B

Alibaba

Qwen3.5-397B-A17B is Alibaba's flagship open-weight MoE model. It features native multimodal reasoning, a 1M context window, and a 19x decoding throughput...

1M context

$0.40/$2.40/1M

Frequently Asked Questions

Find answers to common questions about Kimi k2.6
