
Claude Opus 4.7

Claude Opus 4.7 is Anthropic's flagship model with a 1-million-token context, adaptive reasoning, and 3.3x vision resolution for enterprise-scale agents.

Frontier Model, Agentic AI, Coding Assistant, Large Context, Anthropic
Anthropic · Claude · April 16, 2026
Context
1.0M tokens
Max Output
128K tokens
Input Price
$5.00 / 1M tokens
Output Price
$25.00 / 1M tokens
Modality: Text, Image
Capabilities: Vision, Tools, Streaming, Reasoning
Benchmarks
GPQA
94.2%
GPQA: Graduate-Level Science Q&A. A rigorous benchmark with 448 multiple-choice questions in biology, physics, and chemistry created by domain experts. PhD experts only achieve 65-74% accuracy, while non-experts score just 34% even with unlimited web access (hence 'Google-proof'). Claude Opus 4.7 scored 94.2% on this benchmark.
HLE
54.7%
HLE: Humanity's Last Exam. A frontier benchmark of expert-written questions spanning a wide range of specialized domains, designed to test professional-level reasoning on problems that remain difficult after older benchmarks saturated. Claude Opus 4.7 scored 54.7% on this benchmark.
MMLU
89.8%
MMLU: Massive Multitask Language Understanding. A comprehensive benchmark with 16,000 multiple-choice questions across 57 academic subjects including math, philosophy, law, and medicine. Tests broad knowledge and reasoning capabilities. Claude Opus 4.7 scored 89.8% on this benchmark.
MMLU Pro
89.9%
MMLU Pro: MMLU Professional Edition. An enhanced version of MMLU with 12,032 questions using a harder 10-option multiple choice format. Covers Math, Physics, Chemistry, Law, Engineering, Economics, Health, Psychology, Business, Biology, Philosophy, and Computer Science. Claude Opus 4.7 scored 89.9% on this benchmark.
SimpleQA
31.6%
SimpleQA: Factual Accuracy Benchmark. Tests a model's ability to provide accurate, factual responses to straightforward questions. Measures reliability and reduces hallucinations in knowledge retrieval tasks. Claude Opus 4.7 scored 31.6% on this benchmark.
IFEval
91.2%
IFEval: Instruction Following Evaluation. Measures how well a model follows specific instructions and constraints. Tests the ability to adhere to formatting rules, length limits, and other explicit requirements. Claude Opus 4.7 scored 91.2% on this benchmark.
AIME 2025
100%
AIME 2025: American Invitational Math Exam. Competition-level mathematics problems from the prestigious AIME exam designed for talented high school students. Tests advanced mathematical problem-solving requiring abstract reasoning, not just pattern matching. Claude Opus 4.7 scored 100% on this benchmark.
MATH
94.1%
MATH: Mathematical Problem Solving. A comprehensive math benchmark testing problem-solving across algebra, geometry, calculus, and other mathematical domains. Requires multi-step reasoning and formal mathematical knowledge. Claude Opus 4.7 scored 94.1% on this benchmark.
GSM8k
98.4%
GSM8k: Grade School Math 8K. 8,500 grade school-level math word problems requiring multi-step reasoning. Tests basic arithmetic and logical thinking through real-world scenarios like shopping or time calculations. Claude Opus 4.7 scored 98.4% on this benchmark.
MGSM
94.1%
MGSM: Multilingual Grade School Math. The GSM8k benchmark translated into 10 languages including Spanish, French, German, Russian, Chinese, and Japanese. Tests mathematical reasoning across different languages. Claude Opus 4.7 scored 94.1% on this benchmark.
MathVista
78%
MathVista: Mathematical Visual Reasoning. Tests the ability to solve math problems that involve visual elements like charts, graphs, geometry diagrams, and scientific figures. Combines visual understanding with mathematical reasoning. Claude Opus 4.7 scored 78% on this benchmark.
SWE-Bench
87.6%
SWE-Bench: Software Engineering Benchmark. AI models attempt to resolve real GitHub issues in open-source Python projects with human verification. Tests practical software engineering skills on production codebases. Top models went from 4.4% in 2023 to over 70% in 2024. Claude Opus 4.7 scored 87.6% on this benchmark.
HumanEval
92.4%
HumanEval: Python Programming Problems. 164 hand-written programming problems where models must generate correct Python function implementations. Each solution is verified against unit tests. Top models now achieve 90%+ accuracy. Claude Opus 4.7 scored 92.4% on this benchmark.
LiveCodeBench
78.5%
LiveCodeBench: Live Coding Benchmark. Tests coding abilities on continuously updated, real-world programming challenges. Unlike static benchmarks, uses fresh problems to prevent data contamination and measure true coding skills. Claude Opus 4.7 scored 78.5% on this benchmark.
MMMU
80.7%
MMMU: Multimodal Understanding. Massive Multi-discipline Multimodal Understanding benchmark testing vision-language models on college-level problems across 30 subjects requiring both image understanding and expert knowledge. Claude Opus 4.7 scored 80.7% on this benchmark.
MMMU Pro
85.6%
MMMU Pro: MMMU Professional Edition. Enhanced version of MMMU with more challenging questions and stricter evaluation. Tests advanced multimodal reasoning at professional and expert levels. Claude Opus 4.7 scored 85.6% on this benchmark.
ChartQA
79.5%
ChartQA: Chart Question Answering. Tests the ability to understand and reason about information presented in charts and graphs. Requires extracting data, comparing values, and performing calculations from visual data representations. Claude Opus 4.7 scored 79.5% on this benchmark.
DocVQA
92.5%
DocVQA: Document Visual Q&A. Document Visual Question Answering benchmark testing the ability to extract and reason about information from document images including forms, reports, and scanned text. Claude Opus 4.7 scored 92.5% on this benchmark.
Terminal-Bench
59.3%
Terminal-Bench: Terminal/CLI Tasks. Tests the ability to perform command-line operations, write shell scripts, and navigate terminal environments. Measures practical system administration and development workflow skills. Claude Opus 4.7 scored 59.3% on this benchmark.
ARC-AGI
68.8%
ARC-AGI: Abstraction & Reasoning. Abstraction and Reasoning Corpus for AGI - tests fluid intelligence through novel pattern recognition puzzles. Each task requires discovering the underlying rule from examples, measuring general reasoning ability rather than memorization. Claude Opus 4.7 scored 68.8% on this benchmark.

About Claude Opus 4.7

Learn about Claude Opus 4.7's capabilities, features, and how it can help you achieve better results.

Model Overview

Claude Opus 4.7 is the flagship model in the Claude 4 architecture series. It uses an Adaptive Thinking framework that scales the model's cognitive effort to the perceived difficulty of a task, replacing fixed reasoning budgets with dynamic effort levels. Developers can control internal reasoning depth through an API effort parameter, trading latency against logical rigor. The model is specifically tuned for high-stakes enterprise workflows and autonomous agentic loops.
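The effort parameter described above suggests a simple client-side policy: pick a level per task rather than running everything at maximum depth. The sketch below assumes the effort level names this page mentions ("xhigh" and, by implication, lower tiers); the exact API values and the heuristic itself are illustrative assumptions, not official documentation.

```typescript
// Sketch: choosing a reasoning-effort level per task.
// The level names ("low" … "xhigh") follow this page's description of the
// effort parameter; the thresholds are hypothetical.
type Effort = "low" | "medium" | "high" | "xhigh";

function pickEffort(task: { crossFile: boolean; steps: number }): Effort {
  if (task.crossFile && task.steps > 10) return "xhigh"; // long agentic loops
  if (task.steps > 5) return "high";
  if (task.steps > 1) return "medium";
  return "low"; // simple lookups: favor latency over depth
}

console.log(pickEffort({ crossFile: true, steps: 12 })); // "xhigh"
```

The point of a policy like this is the latency/rigor trade-off the overview describes: only tasks that genuinely need deep reasoning pay the wait-time cost of the highest level.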

Context and Multimodal Capabilities

This model provides a 1-million-token context window without a long-context pricing premium. Its 128,000-token output limit enables the generation of large technical documents or complete code repositories in a single response. Vision resolution is 3.3x higher than in previous iterations, allowing pixel-perfect UI understanding and 1:1 coordinate mapping in images up to 2576 pixels. These improvements make it a reliable choice for document analysis and visual auditing tasks.
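Because there is no long-context premium, request cost is a flat function of the listed rates ($5.00 input / $25.00 output per 1M tokens). A minimal estimator, using only the prices from this page:

```typescript
// Sketch: estimating request cost from the listed flat rates.
const INPUT_PER_M = 5.0;   // USD per 1M input tokens
const OUTPUT_PER_M = 25.0; // USD per 1M output tokens

function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  return (inputTokens / 1e6) * INPUT_PER_M + (outputTokens / 1e6) * OUTPUT_PER_M;
}

// A full 1M-token context with the 128K-token output cap maxed out:
console.log(estimateCostUSD(1_000_000, 128_000).toFixed(2)); // "8.20"
```

So even a worst-case request (full context in, full output cap out) lands at $8.20 under these rates, which is what "no long-context premium" buys you at this scale.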

Agentic Engineering and Safety

Architectural updates target long-horizon tasks and software engineering. It scores 87.6% on the SWE-bench Verified leaderboard, currently leading in its ability to resolve real GitHub issues. The model introduces task budgets to help manage token consumption across multi-turn agent sessions. Anthropic has integrated real-time cybersecurity safeguards into the core architecture to prevent the model from participating in malicious exploits while maintaining utility for security researchers.
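The page mentions task budgets for managing token consumption across multi-turn agent sessions but not their API shape, so the tracker below is a local, client-side approximation of the idea rather than the official feature: record usage after each agent turn and stop the loop when the budget is exhausted.

```typescript
// Sketch: a client-side token budget for a multi-turn agent session.
// This is an illustrative stand-in for the "task budgets" feature this
// page mentions; the real API shape is not documented here.
class TaskBudget {
  private used = 0;
  constructor(private readonly limit: number) {}

  record(tokens: number): void {
    this.used += tokens; // e.g. usage reported after each agent turn
  }

  remaining(): number {
    return Math.max(0, this.limit - this.used);
  }

  exhausted(): boolean {
    return this.used >= this.limit;
  }
}

const budget = new TaskBudget(200_000);
budget.record(150_000);
console.log(budget.remaining()); // 50000
console.log(budget.exhausted()); // false
```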

Use Cases

Discover the different ways you can use Claude Opus 4.7 to achieve great results.

Agentic Software Engineering

Utilizing high effort levels to autonomously refactor repositories and resolve complex cross-file dependencies.

Large-Scale Repository Synthesis

Processing 1 million tokens of source code to map architectural flows and generate technical documentation.

High-Resolution Vision Analysis

Analyzing dense charts and pixel-level UI screenshots with 3.3x more detail than previous frontier models.

Cybersecurity Vulnerability Research

Performing deep security audits and zero-day analysis within verified safety boundaries.

Enterprise Knowledge Extraction

Extracting structured data from massive technical libraries and performing complex cross-document redlining.

Interactive 3D Prototyping

Generating functional 3D environments and game logic from natural language descriptions.

Strengths

Industry-Leading Coding Precision: Achieves 87.6% on SWE-bench Verified, outperforming all other generally available models for software engineering.
Massive Context Stability: Maintains 100% accuracy across the 1M-token context window without charging a long-context premium.
Superior Visual Acuity: Supports images up to 2576px, enabling 1:1 pixel mapping for precise document and UI analysis.
Dynamic Reasoning Control: Allows developers to toggle effort levels via the adaptive thinking framework for a custom latency-logic balance.

Limitations

Higher Token Consumption: A new tokenizer results in approximately 35% higher token usage for the same text compared to previous Claude versions.
Fixed Sampling Parameters: The removal of temperature and top-p controls limits creative flexibility for non-deterministic use cases.
High Latency at Max Effort: Generating responses at the 'xhigh' effort level leads to significant wait times for complex tasks.
Aggressive Safety Refusals: Real-time cybersecurity filters can lead to false-positive refusals for legitimate security research.

API Quick Start

anthropic/claude-opus-4-7

View Documentation
Anthropic SDK (TypeScript)
import Anthropic from '@anthropic-ai/sdk';

// Reads the API key from the ANTHROPIC_API_KEY environment variable.
const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const msg = await anthropic.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 4096,
  thinking: { type: "adaptive" }, // let the model choose its reasoning depth
  messages: [{ role: "user", content: "Analyze this architecture for concurrency bugs." }],
});

// With adaptive thinking enabled, a thinking block may precede the text,
// so find the text block rather than assuming it is first.
const textBlock = msg.content.find((block) => block.type === "text");
console.log(textBlock?.type === "text" ? textBlock.text : "");

Install the SDK and start making API calls in minutes.
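Since responses arrive as a list of typed content blocks, a small helper keeps the extraction logic out of your call sites. The block shape below mirrors the quick-start example above and is an assumption about the response format, not the full SDK type:

```typescript
// Sketch: pulling text out of a content-block list. Thinking blocks may
// precede text when reasoning is enabled; the union below is a simplified
// assumption about the response shape, not the official SDK types.
type ContentBlock =
  | { type: "thinking"; thinking: string }
  | { type: "text"; text: string };

function extractText(blocks: ContentBlock[]): string {
  return blocks
    .filter((b): b is { type: "text"; text: string } => b.type === "text")
    .map((b) => b.text)
    .join("\n");
}

const demo: ContentBlock[] = [
  { type: "thinking", thinking: "checking lock ordering on both mutexes" },
  { type: "text", text: "No deadlock found." },
];
console.log(extractText(demo)); // "No deadlock found."
```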

Community Feedback

See what the community thinks about Claude Opus 4.7

Claude Opus 4.7 leads on SWE-bench and agentic reasoning, beating GPT-5.4 and Gemini 3.1 Pro.
zarfet
twitter
The fact it can generate a procedural 3D skate game in one go is evidence of the model's logic density.
jrandolph
hackernews
Opus 4.7 just dropped. cursorbench jumped from 58% to 70%. XBOW visual acuity 98.5% vs 54.5% on opus 4.6.
hirenthakore
twitter
Claude tends to over-engineer: you ask for a simple function and get an architecture designed to scale for the next decade.
Ok_Today5649
reddit
Early feedback on Claude Opus 4.7 points to higher token usage and stricter prompting requirements.
kimmonismus
twitter
The X-High reasoning effort is the missing middle ground we needed for complex agentic workflows.
Bijan Bowen
youtube

Related Videos

Watch tutorials, reviews, and discussions about Claude Opus 4.7

Claude has been and is still the best coding model available today.

It's actually the same price as it was before, but they gave you more control over its reasoning.

This is working perfectly right. It picked the tools I would have picked myself.

The model feels noticeably faster when you don't use the highest thinking levels.

You can see it thinking about the edge cases before it even writes a single line of code.

This model is way more expensive to run... you're going to be paying 35% more for Opus 4.7.

The vision upgrade alone is worth it... it can take images three times the resolution without cropping.

If you use the API, you can expect to pay 35% more than before.

The tokenization change is the silent killer for your API bills if you aren't careful.

It handles deep context much better than the earlier version of Opus 4.

The vision capabilities of this model are substantially better.

This absolutely 100% warrants an insane title. This seriously blew me away.

It correctly identified a bug in my legacy codebase that three other models missed.

The level of autonomy in the agent loops is what differentiates this from GPT-5.

More than just prompts

Supercharge your workflow with AI Automation

Automatio combines the power of AI agents, web automation, and smart integrations to help you accomplish more in less time.

AI Agents
Web Automation
Smart Workflows

Pro Tips

Expert tips to help you get the most out of Claude Opus 4.7 and achieve better results.

Activate Adaptive Thinking

Explicitly enable the adaptive thinking mode in API calls to ensure Claude selects the optimal reasoning depth.

Use X-High for Agents

Set the effort parameter to xhigh for agentic loops to maximize self-verification and logical precision.

Remove Scaffolding

Remove legacy scaffolding prompts such as 'double-check your work'; the model is optimized for internal self-correction.

Monitor Token Consumption

Use the new tokenizer tracking to manage the 35% increase in token counts for identical text inputs.
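When migrating prompts from earlier Claude versions, the ~35% inflation this page reports can be folded into budgets up front. The 1.35 factor below is taken from this page, not measured; treat it as a planning estimate:

```typescript
// Sketch: projecting token counts under the ~35% tokenizer inflation
// this page reports for identical text. The factor is an estimate.
const INFLATION = 1.35;

function projectedTokens(oldCount: number): number {
  return Math.round(oldCount * INFLATION); // rounded estimate
}

console.log(projectedTokens(10_000)); // 13500
```

A prompt that used to cost 10,000 tokens should be budgeted at roughly 13,500 on this model, and the same factor applies to per-session task budgets.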

Testimonials

What Our Users Say

Join thousands of satisfied users who have transformed their workflow

Jonathan Kogan

Jonathan Kogan

Co-Founder/CEO, rpatools.io

Automatio is one of the most used for RPA Tools both internally and externally. It saves us countless hours of work and we realized this could do the same for other startups and so we choose Automatio for most of our automation needs.

Mohammed Ibrahim

Mohammed Ibrahim

CEO, qannas.pro

I have used many tools over the past 5 years, Automatio is the Jack of All trades.. !! it could be your scraping bot in the morning and then it becomes your VA by the noon and in the evening it does your automations.. its amazing!

Ben Bressington

Ben Bressington

CTO, AiChatSolutions

Automatio is fantastic and simple to use to extract data from any website. This allowed me to replace a developer and do tasks myself as they only take a few minutes to setup and forget about it. Automatio is a game changer!

Sarah Chen

Sarah Chen

Head of Growth, ScaleUp Labs

We've tried dozens of automation tools, but Automatio stands out for its flexibility and ease of use. Our team productivity increased by 40% within the first month of adoption.

David Park

David Park

Founder, DataDriven.io

The AI-powered features in Automatio are incredible. It understands context and adapts to changes in websites automatically. No more broken scrapers!

Emily Rodriguez

Emily Rodriguez

Marketing Director, GrowthMetrics

Automatio transformed our lead generation process. What used to take our team days now happens automatically in minutes. The ROI is incredible.


Related AI Models

google

Gemini 3.1 Pro

Google

Gemini 3.1 Pro is Google's elite multimodal model featuring the DeepThink reasoning engine, a 1M+ context window, and industry-leading ARC-AGI logic scores.

1M context
$2.00/$12.00/1M
google

Gemini 3.1 Flash Live Preview

Google

Gemini 3.1 Flash Live Preview is Google's ultra-low-latency, audio-to-audio model featuring a 131K context window, high-fidelity multimodal reasoning, and...

131K context
$0.75/$4.50/1M
xai

Grok-3

xAI

Grok-3 is xAI's flagship reasoning model, featuring deep logic deduction, a 128k context window, and real-time integration with X for live research and coding.

1M context
$3.00/$15.00/1M
openai

GPT-5.2 Pro

OpenAI

GPT-5.2 Pro is OpenAI's 2025 flagship reasoning model featuring Extended Thinking for SOTA performance in mathematics, coding, and expert knowledge work.

400K context
$21.00/$168.00/1M
google

Gemini 3 Pro

Google

Google's Gemini 3 Pro is a multimodal powerhouse featuring a 1M token context window, native video processing, and industry-leading reasoning performance.

1M context
$2.00/$12.00/1M
anthropic

Claude Opus 4.6

Anthropic

Claude Opus 4.6 is Anthropic's flagship model featuring a 1M token context window, Adaptive Thinking, and world-class coding and reasoning performance.

1M context
$5.00/$25.00/1M
google

Gemini 3 Flash

Google

Gemini 3 Flash is Google's high-speed multimodal model featuring a 1M token context window, elite 90.4% GPQA reasoning, and autonomous browser automation tools.

1M context
$0.50/$3.00/1M
anthropic

Claude Sonnet 4.6

Anthropic

Claude Sonnet 4.6 offers frontier performance for coding and computer use with a massive 1M token context window for only $3/1M tokens.

1M context
$3.00/$15.00/1M

Frequently Asked Questions

Find answers to common questions about Claude Opus 4.7