openai

GPT-5.2 Pro

GPT-5.2 Pro is OpenAI's 2025 flagship reasoning model featuring Extended Thinking for SOTA performance in mathematics, coding, and expert knowledge work.

openai · GPT-5 · December 11, 2025
Context: 400K tokens
Max Output: 128K tokens
Input Price: $21.00 / 1M tokens
Output Price: $168.00 / 1M tokens
Modality: Text, Image
Capabilities: Vision, Tools, Streaming, Reasoning
Benchmarks
GPQA
93.2%
GPQA: Graduate-Level Science Q&A. A rigorous benchmark with 448 multiple-choice questions in biology, physics, and chemistry created by domain experts. PhD experts only achieve 65-74% accuracy, while non-experts score just 34% even with unlimited web access (hence 'Google-proof'). GPT-5.2 Pro scored 93.2% on this benchmark.
HLE
36.6%
HLE: Humanity's Last Exam. A benchmark of roughly 2,500 expert-written questions across mathematics, the sciences, and the humanities, designed to test reasoning at the frontier of human knowledge. Evaluates deep understanding of complex topics that require professional-level knowledge. GPT-5.2 Pro scored 36.6% on this benchmark.
MMLU
89.6%
MMLU: Massive Multitask Language Understanding. A comprehensive benchmark with 16,000 multiple-choice questions across 57 academic subjects including math, philosophy, law, and medicine. Tests broad knowledge and reasoning capabilities. GPT-5.2 Pro scored 89.6% on this benchmark.
MMLU Pro
82%
MMLU Pro: MMLU Professional Edition. An enhanced version of MMLU with 12,032 questions using a harder 10-option multiple choice format. Covers Math, Physics, Chemistry, Law, Engineering, Economics, Health, Psychology, Business, Biology, Philosophy, and Computer Science. GPT-5.2 Pro scored 82% on this benchmark.
SimpleQA
52%
SimpleQA: Factual Accuracy Benchmark. Tests a model's ability to provide accurate, factual responses to straightforward questions. Measures reliability and the tendency to hallucinate in knowledge retrieval tasks. GPT-5.2 Pro scored 52% on this benchmark.
IFEval
93.5%
IFEval: Instruction Following Evaluation. Measures how well a model follows specific instructions and constraints. Tests the ability to adhere to formatting rules, length limits, and other explicit requirements. GPT-5.2 Pro scored 93.5% on this benchmark.
AIME 2025
100%
AIME 2025: American Invitational Mathematics Examination. Competition-level mathematics problems from the prestigious AIME exam designed for talented high school students. Tests advanced mathematical problem-solving requiring abstract reasoning, not just pattern matching. GPT-5.2 Pro scored 100% on this benchmark.
MATH
97%
MATH: Mathematical Problem Solving. A comprehensive math benchmark testing problem-solving across algebra, geometry, calculus, and other mathematical domains. Requires multi-step reasoning and formal mathematical knowledge. GPT-5.2 Pro scored 97% on this benchmark.
GSM8k
99.2%
GSM8k: Grade School Math 8K. 8,500 grade school-level math word problems requiring multi-step reasoning. Tests basic arithmetic and logical thinking through real-world scenarios like shopping or time calculations. GPT-5.2 Pro scored 99.2% on this benchmark.
MGSM
96%
MGSM: Multilingual Grade School Math. 250 problems from the GSM8k benchmark translated into 10 languages including Spanish, French, German, Russian, Chinese, and Japanese. Tests mathematical reasoning across different languages. GPT-5.2 Pro scored 96% on this benchmark.
MathVista
76.5%
MathVista: Mathematical Visual Reasoning. Tests the ability to solve math problems that involve visual elements like charts, graphs, geometry diagrams, and scientific figures. Combines visual understanding with mathematical reasoning. GPT-5.2 Pro scored 76.5% on this benchmark.
SWE-Bench
80%
SWE-Bench: Software Engineering Benchmark. AI models attempt to resolve real GitHub issues in open-source Python projects with human verification. Tests practical software engineering skills on production codebases. Top models went from 4.4% in 2023 to over 70% in 2024. GPT-5.2 Pro scored 80% on this benchmark.
HumanEval
94.5%
HumanEval: Python Programming Problems. 164 hand-written programming problems where models must generate correct Python function implementations. Each solution is verified against unit tests. Top models now achieve 90%+ accuracy. GPT-5.2 Pro scored 94.5% on this benchmark.
LiveCodeBench
78%
LiveCodeBench: Live Coding Benchmark. Tests coding abilities on continuously updated, real-world programming challenges. Unlike static benchmarks, uses fresh problems to prevent data contamination and measure true coding skills. GPT-5.2 Pro scored 78% on this benchmark.
MMMU
79.5%
MMMU: Massive Multi-discipline Multimodal Understanding. Tests vision-language models on college-level problems across 30 subjects requiring both image understanding and expert knowledge. GPT-5.2 Pro scored 79.5% on this benchmark.
MMMU Pro
79.5%
MMMU Pro: MMMU Professional Edition. Enhanced version of MMMU with more challenging questions and stricter evaluation. Tests advanced multimodal reasoning at professional and expert levels. GPT-5.2 Pro scored 79.5% on this benchmark.
ChartQA
91.2%
ChartQA: Chart Question Answering. Tests the ability to understand and reason about information presented in charts and graphs. Requires extracting data, comparing values, and performing calculations from visual data representations. GPT-5.2 Pro scored 91.2% on this benchmark.
DocVQA
94.8%
DocVQA: Document Visual Question Answering. Tests the ability to extract and reason about information from document images including forms, reports, and scanned text. GPT-5.2 Pro scored 94.8% on this benchmark.
Terminal-Bench
55.6%
Terminal-Bench: Terminal/CLI Tasks. Tests the ability to perform command-line operations, write shell scripts, and navigate terminal environments. Measures practical system administration and development workflow skills. GPT-5.2 Pro scored 55.6% on this benchmark.
ARC-AGI
54.2%
ARC-AGI: Abstraction and Reasoning Corpus for AGI. Tests fluid intelligence through novel pattern recognition puzzles. Each task requires discovering the underlying rule from examples, measuring general reasoning ability rather than memorization. GPT-5.2 Pro scored 54.2% on this benchmark.

About GPT-5.2 Pro

Learn about GPT-5.2 Pro's capabilities, features, and how it can help you achieve better results.

A New Frontier in Reasoning

GPT-5.2 Pro is OpenAI's state-of-the-art reasoning model designed specifically for high-stakes intellectual tasks. Released on December 11, 2025, it introduces an 'extended thinking' mode that lets the model spend more time on complex problems to improve logical consistency. It is widely regarded as a leader for professional mathematical proofs and advanced competitive programming, and it frequently solves problems that previous generations could not.

Technical Precision and Output

The model is characterized by its strict adherence to complex instructions and significantly lower hallucination rates in logical inference compared to its competitors. It maintains a highly organized and professional conversational tone, although it is noted for a 'colder' interaction style and increased latency due to its heavy reasoning overhead. It has become a staple for developers requiring mechanical codebase-wide checks and researchers requiring PhD-level precision across its massive 400,000 token context window.

Expert-Level Performance

Beyond academic benchmarks, GPT-5.2 Pro is described as the first model to consistently outperform human industry experts with over 14 years of experience on the GDP-Val benchmark of specialized work tasks. Its ability to generate tens of thousands of lines of functional code in a single pass marks a significant shift away from the 'laziness' issues observed in earlier models, making it a primary choice for complex agentic workflows.

Use Cases for GPT-5.2 Pro

Discover the different ways you can use GPT-5.2 Pro to achieve great results.

Olympiad Mathematics

Excels at solving professional-level and IMO math problems with long-form proofs.

Mechanical Coding Tasks

Efficiently processes massive lists of mechanical code updates and checks without laziness.

Logical Inference

Performs deep reasoning for complex world-building and alternate history analysis.

Technical Research

Accurately retrieves and synthesizes niche technical data from specialized documentation.

Instruction Following

Strictly executes highly complex or counter-intuitive user requirements with extreme precision.

Creative Writing

Capable of producing high-density creative writing that mimics the texture of literary classics.

Strengths

Mathematical SOTA: Currently the only model to achieve 100% on the AIME 2025 benchmark without external tools.
Zero-Laziness Coding: Capable of generating over 24,000 lines of functional code in a single response without truncation.
Expert Knowledge Parity: The first model to outperform industry experts with 14 years of experience on the GDP-Val tasks.
Deep Reasoning Context: Maintains near-perfect retrieval and logic across its massive 400,000 token context window.

Limitations

High Latency: The 'Extended Thinking' mode can take 30-40 minutes for a single complex response in some scenarios.
Cold Persona: Users describe the interaction style as sterile, clinical, and pretentious compared to more conversational models.
Premium Pricing: At $21/1M input tokens, it is significantly more expensive than many competitive models like Gemini 3 Pro.
Implementation Gaps: Despite its intelligence, it can occasionally miss obscure library imports in complex 3D rendering scripts.
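
To put the listed prices in concrete terms, here is a minimal sketch of a per-request cost estimate. It assumes the prices shown on this page ($21.00 per 1M input tokens, $168.00 per 1M output tokens) and that any reasoning tokens are counted as output tokens. The helper name `estimateCostUSD` is hypothetical and not part of any SDK.

```javascript
// Prices as listed on this page, in USD per 1M tokens (assumed current).
const PRICE_PER_MILLION_USD = { input: 21.0, output: 168.0 };

// Hypothetical helper: estimates the cost of a single request in USD.
// outputTokens should include reasoning tokens if they are billed as output.
function estimateCostUSD(inputTokens, outputTokens, prices = PRICE_PER_MILLION_USD) {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError('token counts must be non-negative');
  }
  return (inputTokens * prices.input + outputTokens * prices.output) / 1_000_000;
}

// Example: a 50,000-token prompt that yields a 10,000-token answer.
console.log(estimateCostUSD(50_000, 10_000).toFixed(2)); // prints "2.73"

// The same request at the Gemini 3 Pro prices listed under Related AI Models.
console.log(estimateCostUSD(50_000, 10_000, { input: 2.0, output: 12.0 }).toFixed(2)); // prints "0.22"
```

Because output tokens cost eight times as much as input tokens at these prices, long responses tend to dominate the bill.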

API Quick Start

openai/gpt-5.2-pro

openai SDK
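import OpenAI from 'openai';

// The client reads the API key from the OPENAI_API_KEY environment variable.
const openai = new OpenAI();

async function main() {
  const completion = await openai.chat.completions.create({
    model: 'gpt-5.2-pro',
    messages: [
      { role: 'user', content: 'Prove that there are infinitely many primes.' }
    ],
    // Reasoning depth is set by this parameter, not by the prompt text.
    reasoning_effort: 'high'
  });

  // Print the text of the first returned choice.
  console.log(completion.choices[0].message.content);
}

// Report API or network failures instead of leaving an unhandled rejection.
main().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});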

Install the SDK and start making API calls in minutes.

What People Are Saying About GPT-5.2 Pro

See what the community thinks about GPT-5.2 Pro

"GPT-5.2-codex xhigh is a beast that sweeps through your entire codebase and leaves nothing pending."
Rafael Bittencourt
x
"GPT Pro is absolutely SOTA in this area [Mathematics]. It can sometimes even solve the third and sixth problems."
ArchMeta1868
reddit
"GPT-5.2 Pro continues to blow me away... I got back a rigorous analysis in a professional Excel workbook."
Simon Smith
x
"This model is like a very intelligent, creative person who is unreliable but brilliant."
Narrator
youtube
"5.2's hallucinations are actually less than Opus, and it can very strictly execute my requirements."
ArchMeta1868
reddit
"The reasoning overhead is massive but the results for mathematical proofs are literally Nobel-tier."
QuantumDev
hackernews

Videos About GPT-5.2 Pro

Watch tutorials, reviews, and discussions about GPT-5.2 Pro

This is the first time in history that a human is outperformed in average... by an AGI.

GPT 5.2 thinking sets a new state-of-the-art score 70%... our first model that performs at or above human expert level.

It is a singular model that outperforms 44 real-world US occupations.

The internal reasoning trace is finally showing signs of genuine self-correction.

We are looking at a model that doesn't just predict text, it simulates logic.

Generating 24,000 lines of code in a single response is just unheard of.

There is now a selectable thinking time option here... allowing for an 'Extended Thinking' mode.

This model scored higher than all the other models on the Mensa Norway test... IQ 145 to 147.

The context window retrieval is essentially perfect even at 400k tokens.

It's not just more data, it's a completely different architecture for logical depth.

Beating human experts on GDP-Val over 50% of the time is a scary milestone for the labor market.

Everything just works... I'm really impressed by the coding abilities of GPT 5.2.

The canvas feature makes debugging 3JS code instantaneous.

OpenAI has finally solved the 'laziness' problem that plagued GPT-4.

This is the most 'professional' sounding AI I have ever interacted with.

Pro Tips

Expert tips to help you get the most out of this model and achieve better results.

Extended Thinking

Use the 'extended thinking' mode for math or logic problems where accuracy is more critical than speed.

Codex Integration

Leverage its high performance in specialized environments like Codex for mechanical codebase management.

Verify Premises

If the first premise of a long response is wrong, interrupt and correct it immediately.

Iterative Refinement

If initial code fails, provide the console error back for a highly effective second-try fix.
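
The Iterative Refinement tip can be sketched as a small retry loop. This is an illustration under stated assumptions, not an official workflow: `buildRetryPrompt` and `refineUntilPasses` are hypothetical helpers, `generate(prompt)` is assumed to return code as a string (for example by calling the model), and `run(code)` is assumed to return `{ ok: true }` or `{ ok: false, error: '...' }`.

```javascript
// Hypothetical helper: builds the follow-up prompt for another attempt.
function buildRetryPrompt(task, failedCode, consoleError) {
  return [
    task,
    '',
    'The previous attempt failed. Code:',
    failedCode,
    '',
    'Console error:',
    consoleError,
    '',
    'Return a corrected version of the full code.'
  ].join('\n');
}

// Hypothetical loop: generate code, run it, and feed any error back.
// `generate` and `run` are supplied by the caller (assumptions, see above).
async function refineUntilPasses(task, generate, run, maxAttempts = 3) {
  let prompt = task;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const code = await generate(prompt);
    const result = await run(code);
    if (result.ok) {
      return { code, attempts: attempt };
    }
    // Pass the error text back verbatim rather than paraphrasing it.
    prompt = buildRetryPrompt(task, code, result.error);
  }
  throw new Error(`No passing code after ${maxAttempts} attempts`);
}
```

Keeping the original task in every retry prompt means the model sees the requirement, the failing code, and the exact error together, and the attempt cap bounds cost on a model priced at this level.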

Related AI Models

Kimi K2 Thinking (moonshot)

Kimi K2 Thinking is Moonshot AI's trillion-parameter reasoning model. It outperforms GPT-5 on HLE and supports 300 sequential tool calls autonomously for...

256K context
$0.15/1M

GPT-5.2 (openai)

GPT-5.2 is OpenAI's flagship model for professional tasks, featuring a 400K context window, elite coding, and deep multi-step reasoning capabilities.

400K context
$1.75/$14.00/1M

Gemini 3 Pro (google)

Google's Gemini 3 Pro is a multimodal powerhouse featuring a 1M token context window, native video processing, and industry-leading reasoning performance.

1M context
$2.00/$12.00/1M

Gemini 3 Flash (google)

Gemini 3 Flash is Google's high-speed multimodal model featuring a 1M token context window, elite 90.4% GPQA reasoning, and autonomous browser automation tools.

1M context
$0.50/$3.00/1M

GPT-5.1 (openai)

GPT-5.1 is OpenAI’s advanced reasoning flagship featuring adaptive thinking, native multimodality, and state-of-the-art performance in math and technical...

400K context
$1.25/$10.00/1M

Grok-4 (xai)

Grok-4 by xAI is a frontier model featuring a 2M token context window, real-time X platform integration, and world-record reasoning capabilities.

2M context
$3.00/$15.00/1M

Claude Opus 4.5 (anthropic)

Claude Opus 4.5 is Anthropic's most powerful frontier model, delivering record-breaking 80.9% SWE-bench performance and advanced autonomous agency for coding.

200K context
$5.00/$25.00/1M

GLM-4.7 (zhipu)

GLM-4.7 by Zhipu AI is a flagship 358B MoE model featuring a 200K context window, elite 73.8% SWE-bench performance, and native Deep Thinking for agentic...

200K context
$0.60/$2.20/1M
