Claude Fable 5

Anthropic's Claude Fable 5 is a Mythos-class model featuring a 1M context window and 128K output tokens. It excels at agentic coding and 3D physics.

Anthropic · Mythos-Class · Agentic Coding · Reasoning · 1M Context
Anthropic · Claude · June 9, 2026
Context: 1.0M tokens
Max Output: 128K tokens
Input Price: $10.00 / 1M tokens
Output Price: $50.00 / 1M tokens
Modality: Text, Image
Capabilities: Vision, Tools, Streaming, Reasoning
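
The listed prices translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, the prices are the ones shown on this page, and real billing may differ (for example with caching or batch discounts).

```typescript
// Prices as listed on this page (USD per 1M tokens).
const INPUT_PRICE_PER_M = 10.0;
const OUTPUT_PRICE_PER_M = 50.0;

/** Estimate the cost in USD of one request from its token counts. */
function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("token counts must be non-negative");
  }
  return (
    (inputTokens * INPUT_PRICE_PER_M + outputTokens * OUTPUT_PRICE_PER_M) /
    1_000_000
  );
}

// A hypothetical request with 1,000,000 input tokens and 128,000 output tokens.
const largeRequest = estimateCostUSD(1_000_000, 128_000);
console.log(largeRequest.toFixed(2)); // prints "16.40"
```

At these prices the output side dominates quickly: 128K output tokens cost $6.40, roughly the same as 640K input tokens.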
Benchmarks
GPQA
88.5%
GPQA: Graduate-Level Science Q&A. A rigorous benchmark with 448 multiple-choice questions in biology, physics, and chemistry created by domain experts. PhD-level experts achieve only 65-74% accuracy, while non-experts score just 34% even with unlimited web access (hence 'Google-proof'). Claude Fable 5 scored 88.5% on this benchmark.
HLE
42%
HLE: Humanity's Last Exam. A benchmark of expert-written questions spanning many specialized academic domains. Tests expert-level reasoning and deep understanding of topics that require professional-level knowledge. Claude Fable 5 scored 42% on this benchmark.
MMLU
91.2%
MMLU: Massive Multitask Language Understanding. A comprehensive benchmark with 16,000 multiple-choice questions across 57 academic subjects including math, philosophy, law, and medicine. Tests broad knowledge and reasoning capabilities. Claude Fable 5 scored 91.2% on this benchmark.
MMLU Pro
82%
MMLU Pro: MMLU Professional Edition. An enhanced version of MMLU with 12,032 questions using a harder 10-option multiple choice format. Covers Math, Physics, Chemistry, Law, Engineering, Economics, Health, Psychology, Business, Biology, Philosophy, and Computer Science. Claude Fable 5 scored 82% on this benchmark.
SimpleQA
54%
SimpleQA: Factual Accuracy Benchmark. Tests a model's ability to give accurate answers to short, fact-seeking questions. Measures reliability and the tendency to hallucinate in knowledge retrieval tasks. Claude Fable 5 scored 54% on this benchmark.
IFEval
92%
IFEval: Instruction Following Evaluation. Measures how well a model follows specific instructions and constraints. Tests the ability to adhere to formatting rules, length limits, and other explicit requirements. Claude Fable 5 scored 92% on this benchmark.
AIME 2025
90%
AIME 2025: American Invitational Mathematics Examination. Competition-level mathematics problems from the prestigious AIME exam designed for talented high school students. Tests advanced mathematical problem-solving requiring abstract reasoning, not just pattern matching. Claude Fable 5 scored 90% on this benchmark.
MATH
91.2%
MATH: Mathematical Problem Solving. A comprehensive math benchmark testing problem-solving across algebra, geometry, calculus, and other mathematical domains. Requires multi-step reasoning and formal mathematical knowledge. Claude Fable 5 scored 91.2% on this benchmark.
GSM8k
97.8%
GSM8k: Grade School Math 8K. 8,500 grade school-level math word problems requiring multi-step reasoning. Tests basic arithmetic and logical thinking through real-world scenarios like shopping or time calculations. Claude Fable 5 scored 97.8% on this benchmark.
MGSM
96%
MGSM: Multilingual Grade School Math. 250 problems from the GSM8k benchmark translated into 10 languages including Spanish, French, German, Russian, Chinese, and Japanese. Tests mathematical reasoning across different languages. Claude Fable 5 scored 96% on this benchmark.
MathVista
71%
MathVista: Mathematical Visual Reasoning. Tests the ability to solve math problems that involve visual elements like charts, graphs, geometry diagrams, and scientific figures. Combines visual understanding with mathematical reasoning. Claude Fable 5 scored 71% on this benchmark.
SWE-Bench
72%
SWE-Bench: Software Engineering Benchmark. AI models attempt to resolve real GitHub issues in open-source Python projects with human verification. Tests practical software engineering skills on production codebases. Top models went from 4.4% in 2023 to over 70% in 2024. Claude Fable 5 scored 72% on this benchmark.
HumanEval
93.5%
HumanEval: Python Programming Problems. 164 hand-written programming problems where models must generate correct Python function implementations. Each solution is verified against unit tests. Top models now achieve 90%+ accuracy. Claude Fable 5 scored 93.5% on this benchmark.
LiveCodeBench
76%
LiveCodeBench: Live Coding Benchmark. Tests coding abilities on continuously updated, real-world programming challenges. Unlike static benchmarks, uses fresh problems to prevent data contamination and measure true coding skills. Claude Fable 5 scored 76% on this benchmark.
MMMU
74.3%
MMMU: Massive Multi-discipline Multimodal Understanding. Tests vision-language models on college-level problems across 30 subjects requiring both image understanding and expert knowledge. Claude Fable 5 scored 74.3% on this benchmark.
MMMU Pro
58%
MMMU Pro: MMMU Professional Edition. Enhanced version of MMMU with more challenging questions and stricter evaluation. Tests advanced multimodal reasoning at professional and expert levels. Claude Fable 5 scored 58% on this benchmark.
ChartQA
92%
ChartQA: Chart Question Answering. Tests the ability to understand and reason about information presented in charts and graphs. Requires extracting data, comparing values, and performing calculations from visual data representations. Claude Fable 5 scored 92% on this benchmark.
DocVQA
95%
DocVQA: Document Visual Question Answering. Tests the ability to extract and reason about information from document images including forms, reports, and scanned text. Claude Fable 5 scored 95% on this benchmark.
Terminal-Bench
55%
Terminal-Bench: Terminal/CLI Tasks. Tests the ability to perform command-line operations, write shell scripts, and navigate terminal environments. Measures practical system administration and development workflow skills. Claude Fable 5 scored 55% on this benchmark.
ARC-AGI
12%
ARC-AGI: Abstraction and Reasoning Corpus for AGI. Tests fluid intelligence through novel pattern recognition puzzles. Each task requires discovering the underlying rule from examples, measuring general reasoning ability rather than memorization. Claude Fable 5 scored 12% on this benchmark.

About Claude Fable 5

Learn about Claude Fable 5's capabilities, features, and how it can help you achieve better results.

Claude Fable 5 is Anthropic's most powerful generally available model, built on the Mythos architecture class. It is designed for high-stakes autonomous tasks that require deep reasoning and a massive memory buffer. With a 1,000,000 token context window, it can ingest an entire company's codebase or hundreds of research papers in a single prompt. This model is specifically optimized for long-horizon agentic workflows where self-correction is mandatory.

The model introduces a unique 128,000 output token limit, allowing it to write full software modules or expansive technical documentation without truncation. It features a self-verification loop where it uses vision capabilities to check its own generated code, particularly for UI and 3D simulations. While it maintains strict safety filters for high-risk domains like biology, its general reasoning performance reaches senior-engineer levels, outperforming previous iterations in complex system architecture and large-scale migrations.

Developers primarily use Fable 5 for tasks that fail on standard models because the context must be split across prompts or the multi-step reasoning breaks down. It combines high-fidelity vision with senior-level software engineering capabilities, enabling it to build complex 3D environments and verify visual outputs against original designs. Technically, it represents a significant leap in multimodal logic and autonomous reliability.
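
Whether a codebase actually fits in the 1M-token window is worth checking before sending it. The sketch below is a rough illustration, not an official tool: the 4-characters-per-token ratio is a common heuristic rather than the model's real tokenizer, and `SourceFile`, `estimateTokens`, and `packIntoBatches` are hypothetical names.

```typescript
const CONTEXT_WINDOW_TOKENS = 1_000_000; // as listed on this page
const CHARS_PER_TOKEN = 4; // rough heuristic (assumption), not the real tokenizer

interface SourceFile {
  path: string;
  content: string;
}

/** Approximate the token count of a string from its length. */
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

/** Greedily pack files, in order, into batches that each fit the token budget. */
function packIntoBatches(files: SourceFile[], budget: number): SourceFile[][] {
  const batches: SourceFile[][] = [];
  let current: SourceFile[] = [];
  let used = 0;
  for (const file of files) {
    const cost = estimateTokens(file.content);
    if (cost > budget) {
      throw new Error(`${file.path} alone exceeds the token budget`);
    }
    // Start a new batch when this file would overflow the current one.
    if (used + cost > budget && current.length > 0) {
      batches.push(current);
      current = [];
      used = 0;
    }
    current.push(file);
    used += cost;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

// Leave headroom for instructions and the model's reply (amount is an assumption).
const PROMPT_BUDGET = CONTEXT_WINDOW_TOKENS - 150_000;
```

If `packIntoBatches(files, PROMPT_BUDGET)` returns a single batch, the project is likely to fit in one prompt; more than one batch means the work has to be split, for example by subsystem.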

Use Cases

Discover the different ways you can use Claude Fable 5 to achieve great results.

Autonomous Codebase Migration

Migrate large legacy repositories to modern frameworks by processing whole subsystems, up to the 1M-token context window, in a single prompt.

3D Physics Simulation Generation

Create self-contained C++ or WebGL simulations with complex mesh colliders and fluid dynamics from a single prompt.

Senior Scientific Research Analysis

Synthesize hundreds of PhD-level research papers to identify novel hypotheses while adhering to safety guardrails.

Agentic Strategic Financial Modeling

Drive autonomous agents to process years of market data and generate detailed projections with interactive dashboards.

Real-time Network Visualization

Build backend systems that capture live packets and visualize them as 3D environments to identify security anomalies.

High-Fidelity Technical Content Creation

Generate book-length technical manuals and comprehensive documentation sets in a single pass, up to the 128K-token output limit.

Strengths

Industry-Leading Logic: Dominates benchmarks with a 91.2% MMLU and 88.5% GPQA score, placing it at senior research scientist grade intelligence.
Massive Output Ceiling: The 128,000 output token limit allows for one-shotting entire applications and deep, multi-section technical reports.
Autonomous Reliability: Scores 80.3% on SWE-Bench Pro, demonstrating a superior ability to resolve complex GitHub issues without human oversight.
Advanced Vision Integration: Uses vision to check its own work, ensuring that generated UI and 3D assets align with the user's original design intent.

Limitations

Premium Pricing: At $50 per 1 million output tokens, it is one of the most expensive models on the market, making it less suitable for simple chat tasks.
Aggressive Safety Filters: The safeguards for cybersecurity and biology can occasionally trigger false positives on benign technical queries, forcing a fallback to Opus.
High Reasoning Latency: Processing the full 1M context or using high-effort reasoning modes results in significantly longer response times compared to smaller models.
Data Retention Policy: Standard usage requires 30-day data retention for safety monitoring, which may not meet the requirements of highly sensitive environments.

API Quick Start

anthropic/claude-fable-5

anthropic SDK
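import Anthropic from '@anthropic-ai/sdk';

// The API key is read from the ANTHROPIC_API_KEY environment variable.
const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const message = await anthropic.messages.create({
  model: "claude-fable-5",
  max_tokens: 1024,
  messages: [{
    role: "user",
    content: "Analyze this codebase for security vulnerabilities and suggest fixes."
  }],
});

// The response is a list of content blocks; print only the text blocks.
for (const block of message.content) {
  if (block.type === "text") console.log(block.text);
}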

Install the SDK (npm install @anthropic-ai/sdk) and start making API calls in minutes.

Community Feedback

See what the community thinks about Claude Fable 5

Qualitatively, this is a major-version-bump-deserving step change forward. It peaks for long problem-solving sessions on very difficult problems.
Andrej Karpathy
twitter
Fable 5 makes GPT 5.5 feel like a toy. For complex, difficult tasks, it is the new goalpost and the new state-of-the-art.
MattVidPro
youtube
Fable 5 just finished Pokemon Fire Red with Vision alone. Raw screenshots only, no map, no hidden state. That is quite impressive.
Charly Wargnier
twitter
The 1M context window finally makes large-scale legacy code migration feel like a solved problem. RAG feels optional for most of my projects now.
u/DevOps_Master
reddit
Claude 5 fable (extra high) made a Pokemon clone in 1 hour of reasoning with 8k lines in 1 shot. This is a new era.
Chris
twitter
Anthropic released Fable 5 for general availability and Claude Mythos 5 for restricted research. It is their most powerful model publicly available.
TechCrunch
news

Related Videos

Watch tutorials, reviews, and discussions about Claude Fable 5

I think it is very likely this is the most powerful language model we've ever had our hands on.

Look at the water. That is pretty crazy. This is quite possibly the best result I've had with this prompt.

It feels way more fully realized, well thought through. Absolutely mindblowing.

Fable 5 is state-of-the-art across basically every single benchmark on SWE-Bench Pro.

The vision capabilities of Fable 5 are so good that it beat Pokemon Fire Red using only raw vision.

They are finally back with Fable 5, and it is incredible, guys.

It handles long-running, complex, and asynchronous tasks with senior-level logic.

This model is optimized for autonomous knowledge work and coding.

This is a new paradigm so to speak. It put mesh colliders on the buildings and we can see in them.

This model might actually be able to make GTA 6. Actually, no. It would... look at that.

The attention to detail, like the spinning filament spool on the 3D printer, is just something else.

It recreated the 2011 game of the year in about an hour.

The massive output ceiling allows for one-shotting entire applications.

Pro Tips

Expert tips to help you get the most out of Claude Fable 5 and achieve better results.

Use High-Effort Reasoning Modes

Toggle the model to 'High' or 'Extra High' effort in the API to solve math or logic problems that require deep chain-of-thought.
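
This page does not show how effort levels are set in the API. One plausible sketch uses the `thinking` parameter of Anthropic's existing Messages API (`type: "enabled"` plus `budget_tokens`). Whether Fable 5 exposes 'High' and 'Extra High' this way or through a separate field is an assumption, and the `Effort` labels, budget values, and `buildRequest` helper below are hypothetical.

```typescript
type Effort = "standard" | "high" | "extra_high"; // hypothetical labels

// Hypothetical thinking budgets per effort level; tune for your workload.
const THINKING_BUDGET: Record<Effort, number> = {
  standard: 0,
  high: 16_000,
  extra_high: 64_000,
};

interface RequestParams {
  model: string;
  max_tokens: number;
  thinking?: { type: "enabled"; budget_tokens: number };
  messages: { role: "user"; content: string }[];
}

/** Build request parameters with a thinking budget matching the effort level. */
function buildRequest(
  prompt: string,
  effort: Effort,
  answerTokens = 4_096,
): RequestParams {
  const budget = THINKING_BUDGET[effort];
  const params: RequestParams = {
    model: "claude-fable-5", // model ID as shown in the API Quick Start
    // max_tokens has to cover the thinking budget plus the visible answer.
    max_tokens: budget + answerTokens,
    messages: [{ role: "user", content: prompt }],
  };
  if (budget > 0) {
    params.thinking = { type: "enabled", budget_tokens: budget };
  }
  return params;
}
```

The resulting object can be passed to `anthropic.messages.create(...)`. Higher budgets trade latency and output-token cost for deeper reasoning, so reserve them for hard problems.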

Leverage Prompt Caching

Use prompt caching for frequently accessed codebases to reduce costs by up to 90% during multi-day autonomous sessions.
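
To see where a figure like "up to 90%" can come from, the sketch below compares the input cost of resending a large prefix on every call with caching it once. The `cache_control: { type: "ephemeral" }` marker is the form used by Anthropic's existing Messages API. The write and read multipliers (1.25x and 0.1x of the base input price) are assumptions taken from pricing published for earlier Claude models and are not confirmed for Fable 5. The sketch also assumes the cache never expires between calls.

```typescript
const INPUT_PRICE_PER_M = 10.0; // USD per 1M input tokens, as listed on this page
const CACHE_WRITE_MULTIPLIER = 1.25; // assumption, not confirmed for Fable 5
const CACHE_READ_MULTIPLIER = 0.1; // assumption, consistent with "up to 90%"

/** Mark a large, stable prefix (for example a codebase dump) as cacheable. */
function cachedSystemBlock(codebase: string) {
  return [
    {
      type: "text" as const,
      text: codebase,
      cache_control: { type: "ephemeral" as const },
    },
  ];
}

/** Input cost in USD of sending the same prefix on every call of a session. */
function prefixCostUSD(
  prefixTokens: number,
  calls: number,
  useCache: boolean,
): number {
  if (calls < 1) return 0;
  if (!useCache) {
    return (prefixTokens * calls * INPUT_PRICE_PER_M) / 1_000_000;
  }
  // The first call writes the cache; every later call reads from it.
  const billedTokens =
    prefixTokens * CACHE_WRITE_MULTIPLIER +
    prefixTokens * CACHE_READ_MULTIPLIER * (calls - 1);
  return (billedTokens * INPUT_PRICE_PER_M) / 1_000_000;
}

const uncached = prefixCostUSD(500_000, 20, false);
const cached = prefixCostUSD(500_000, 20, true);
console.log(uncached.toFixed(2), cached.toFixed(2)); // prints "100.00 15.75"
```

Under these assumptions a 20-call session over a 500K-token prefix saves about 84%. The saving approaches 90% only as the number of cache hits grows.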

Anchor Tasks with Vision

Provide screenshots of desired UIs to allow Fable 5 to use vision to verify its code matches your requirements.
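
A screenshot can be attached as a base64 image block next to the instruction, using the content-block shape of Anthropic's existing Messages API. This is a minimal sketch: `buildDesignCheckMessage` is a hypothetical helper, and it assumes the image has already been read from disk and base64-encoded.

```typescript
type ImageMediaType = "image/png" | "image/jpeg" | "image/webp" | "image/gif";

interface ImageBlock {
  type: "image";
  source: { type: "base64"; media_type: ImageMediaType; data: string };
}

interface TextBlock {
  type: "text";
  text: string;
}

interface UserMessage {
  role: "user";
  content: (ImageBlock | TextBlock)[];
}

/** Build a user message that pairs a target-design screenshot with a task. */
function buildDesignCheckMessage(
  screenshotBase64: string,
  mediaType: ImageMediaType,
  instruction: string,
): UserMessage {
  return {
    role: "user",
    content: [
      // Image first, so the instruction can refer to "the screenshot above".
      {
        type: "image",
        source: { type: "base64", media_type: mediaType, data: screenshotBase64 },
      },
      { type: "text", text: instruction },
    ],
  };
}
```

The returned object goes into the `messages` array of a request, for example with the instruction "Implement this layout in React, then compare a render of your result against the screenshot and list any differences."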

Explicitly Request Self-Verification

Instruct the model to write its own test suite and execute it to identify bugs before returning the final result.
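
A complementary approach is to run the tests outside the model and feed failures back to it. The sketch below shows such an external verify-and-retry loop. It is not a built-in feature described on this page: `generate` and `runTests` are hypothetical callbacks you would implement yourself (an API call and a sandboxed test runner, respectively).

```typescript
interface TestReport {
  passed: boolean;
  failures: string[];
}

type Generate = (prompt: string) => Promise<string>;
type RunTests = (code: string) => Promise<TestReport>;

interface LoopResult {
  code: string;
  attempts: number;
  passed: boolean;
}

/** Ask for code, run the tests, and retry with the failures until they pass. */
async function generateUntilGreen(
  task: string,
  generate: Generate,
  runTests: RunTests,
  maxAttempts = 3,
): Promise<LoopResult> {
  let prompt = task;
  let code = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    code = await generate(prompt);
    const report = await runTests(code);
    if (report.passed) return { code, attempts: attempt, passed: true };
    // Feed the concrete failures back so the next attempt can target them.
    prompt =
      `${task}\n\nYour previous attempt failed these tests:\n` +
      `${report.failures.join("\n")}\nFix the code and return the full file.`;
  }
  return { code, attempts: maxAttempts, passed: false };
}
```

Capping the number of attempts keeps cost bounded, which matters at this model's output price.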

Utilize the 128K Output

Avoid breaking up long requests by asking for the entire backend and frontend in one prompt for architectural consistency.
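
When asking for very long outputs, it helps to request a large `max_tokens` value and to check whether the reply was cut off. In Anthropic's existing Messages API a truncated reply is reported as `stop_reason: "max_tokens"`. The helpers below are a hypothetical sketch around that field, and the 128,000 ceiling is the figure listed on this page.

```typescript
const MAX_OUTPUT_TOKENS = 128_000; // as listed on this page

type StopReason = "end_turn" | "max_tokens" | "stop_sequence" | "tool_use";

interface ResponseLike {
  stop_reason: StopReason | null;
  content: { type: string; text?: string }[];
}

/** Clamp a requested output size to the model's listed ceiling. */
function clampMaxTokens(requested: number): number {
  return Math.max(1, Math.min(Math.floor(requested), MAX_OUTPUT_TOKENS));
}

/** Join the text blocks of a response and report whether it hit the limit. */
function collectText(response: ResponseLike): { text: string; truncated: boolean } {
  const text = response.content
    .filter((block) => block.type === "text")
    .map((block) => block.text ?? "")
    .join("");
  return { text, truncated: response.stop_reason === "max_tokens" };
}
```

If `truncated` is true, the output stopped at the limit and the remainder has to be requested in a follow-up turn. Requests this large can run for a long time, so streaming the response is usually the safer choice.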

Related AI Models

Qwen3.5-397B-A17B (Alibaba)
Qwen3.5-397B-A17B is Alibaba's flagship open-weight MoE model. It features native multimodal reasoning, a 1M context window, and a 19x decoding throughput...
1M context · $0.40 / $2.40 per 1M tokens (input / output)

GPT-5.1 (OpenAI)
GPT-5.1 is OpenAI’s advanced reasoning flagship featuring adaptive thinking, native multimodality, and state-of-the-art performance in math and technical...
400K context · $1.25 / $10.00 per 1M tokens (input / output)

Kimi K2.5 (Moonshot)
Discover Moonshot AI's Kimi K2.5, a 1T-parameter open-source agentic model featuring native multimodal capabilities, a 262K context window, and SOTA reasoning.
256K context · $0.60 / $3.00 per 1M tokens (input / output)

Grok-4 (xAI)
Grok-4 by xAI is a frontier model featuring a 2M token context window, real-time X platform integration, and world-record reasoning capabilities.
2M context · $3.00 / $15.00 per 1M tokens (input / output)

Claude Sonnet 4.6 (Anthropic)
Claude Sonnet 4.6 offers frontier performance for coding and computer use with a massive 1M token context window for only $3/1M tokens.
1M context · $3.00 / $15.00 per 1M tokens (input / output)

Claude Opus 4.5 (Anthropic)
Claude Opus 4.5 is Anthropic's most powerful frontier model, delivering record-breaking 80.9% SWE-bench performance and advanced autonomous agency for coding.
200K context · $5.00 / $25.00 per 1M tokens (input / output)

DeepSeek v4 (DeepSeek)
DeepSeek v4 is a 1.6T parameter MoE model featuring a 1M token context window and native multimodal support for text, vision, and video at disruptive prices.
1M context · $1.74 / $3.48 per 1M tokens (input / output)

Gemini 3.1 Flash-Lite (Google)
Gemini 3.1 Flash-Lite is Google's fastest, most cost-efficient model. Features 1M context, native multimodality, and 363 tokens/sec speed for scale.
1M context · $0.25 / $1.50 per 1M tokens (input / output)

Frequently Asked Questions

Find answers to common questions about Claude Fable 5