GPT-5.2

GPT-5.2 is OpenAI's flagship model for professional tasks, featuring a 400K context window, elite coding, and deep multi-step reasoning capabilities.

Provider: OpenAI
Series: GPT-5
Release date: December 11, 2025
Context: 400K tokens
Max Output: 100K tokens
Input Price: $1.75 / 1M tokens
Output Price: $14.00 / 1M tokens
Modality: Text, Image
Capabilities: Vision, Tools, Streaming, Reasoning
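
As a rough illustration of how the listed prices translate into per-request cost, the sketch below multiplies token counts by the input and output prices shown above. The helper name `estimateCostUSD` is hypothetical, and the calculation ignores any discounts (such as cached input) that may apply in practice.

```javascript
// Prices copied from the listing above (USD per 1M tokens).
const INPUT_PRICE_PER_MILLION = 1.75;
const OUTPUT_PRICE_PER_MILLION = 14.0;

// Hypothetical helper: estimate the cost of a single request in USD.
function estimateCostUSD(inputTokens, outputTokens) {
  const inputCost = (inputTokens / 1e6) * INPUT_PRICE_PER_MILLION;
  const outputCost = (outputTokens / 1e6) * OUTPUT_PRICE_PER_MILLION;
  return inputCost + outputCost;
}

// Example: a 200K-token prompt that produces a 10K-token answer.
console.log(estimateCostUSD(200000, 10000).toFixed(2)); // prints 0.49
```

Output tokens cost eight times as much as input tokens at these prices, so long answers tend to dominate the bill more than long prompts do.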
Benchmarks
GPQA (93%): Graduate-Level Science Q&A. A rigorous benchmark with 448 multiple-choice questions in biology, physics, and chemistry created by domain experts. PhD-level experts achieve only 65-74% accuracy, while non-experts score just 34% even with unlimited web access (hence 'Google-proof').
HLE (45%): Humanity's Last Exam. Tests a model's ability to demonstrate expert-level reasoning across specialized domains. Evaluates deep understanding of complex topics that require professional-level knowledge.
MMLU (88%): Massive Multitask Language Understanding. A comprehensive benchmark with roughly 16,000 multiple-choice questions across 57 academic subjects including math, philosophy, law, and medicine. Tests broad knowledge and reasoning capabilities.
MMLU Pro (83%): MMLU Professional Edition. An enhanced version of MMLU with 12,032 questions using a harder 10-option multiple-choice format. Covers Math, Physics, Chemistry, Law, Engineering, Economics, Health, Psychology, Business, Biology, Philosophy, and Computer Science.
SimpleQA (58%): Factual Accuracy Benchmark. Tests a model's ability to provide accurate, factual responses to straightforward questions. Measures reliability and the tendency to hallucinate in knowledge retrieval tasks.
IFEval (95%): Instruction Following Evaluation. Measures how well a model follows specific instructions and constraints. Tests the ability to adhere to formatting rules, length limits, and other explicit requirements.
AIME 2025 (100%): American Invitational Mathematics Examination. Competition-level mathematics problems from the prestigious AIME exam designed for talented high school students. Tests advanced mathematical problem-solving requiring abstract reasoning, not just pattern matching.
MATH (98%): Mathematical Problem Solving. A comprehensive math benchmark testing problem-solving across algebra, geometry, calculus, and other mathematical domains. Requires multi-step reasoning and formal mathematical knowledge.
GSM8k (99%): Grade School Math 8K. 8,500 grade school-level math word problems requiring multi-step reasoning. Tests basic arithmetic and logical thinking through real-world scenarios like shopping or time calculations.
MGSM (98%): Multilingual Grade School Math. A subset of GSM8k problems translated into 10 languages including Spanish, French, German, Russian, Chinese, and Japanese. Tests mathematical reasoning across different languages.
MathVista (78%): Mathematical Visual Reasoning. Tests the ability to solve math problems that involve visual elements like charts, graphs, geometry diagrams, and scientific figures. Combines visual understanding with mathematical reasoning.
SWE-Bench (80%): Software Engineering Benchmark. AI models attempt to resolve real GitHub issues in open-source Python projects with human verification. Tests practical software engineering skills on production codebases. Top models went from 4.4% in 2023 to over 70% in 2024.
HumanEval (95%): Python Programming Problems. 164 hand-written programming problems where models must generate correct Python function implementations. Each solution is verified against unit tests. Top models now achieve 90%+ accuracy.
LiveCodeBench (80%): Live Coding Benchmark. Tests coding abilities on continuously updated, real-world programming challenges. Unlike static benchmarks, it uses fresh problems to prevent data contamination and measure true coding skills.
MMMU (75%): Massive Multi-discipline Multimodal Understanding. Tests vision-language models on college-level problems across 30 subjects requiring both image understanding and expert knowledge.
MMMU Pro (65%): MMMU Professional Edition. An enhanced version of MMMU with more challenging questions and stricter evaluation. Tests advanced multimodal reasoning at professional and expert levels.
ChartQA (93%): Chart Question Answering. Tests the ability to understand and reason about information presented in charts and graphs. Requires extracting data, comparing values, and performing calculations from visual data representations.
DocVQA (95%): Document Visual Question Answering. Tests the ability to extract and reason about information from document images including forms, reports, and scanned text.
Terminal-Bench (60%): Terminal/CLI Tasks. Tests the ability to perform command-line operations, write shell scripts, and navigate terminal environments. Measures practical system administration and development workflow skills.
ARC-AGI (52.9%): Abstraction and Reasoning Corpus for AGI. Tests fluid intelligence through novel pattern-recognition puzzles. Each task requires discovering the underlying rule from examples, measuring general reasoning ability rather than memorization.

About GPT-5.2

Learn about GPT-5.2's capabilities, features, and how it can help you achieve better results.

Elite Professional Reasoning

GPT-5.2 represents OpenAI's frontier in professional-grade artificial intelligence, specifically engineered for complex knowledge work and autonomous task execution. Released in late 2025, it introduces a dedicated Thinking mode that allows the model to pause and plan multi-step logic, making it exceptionally proficient at intricate software engineering, advanced mathematical proofs, and scientific analysis. This model architecture integrates multimodal vision and tool-calling into a unified reasoning engine, enabling it to act as an agentic partner in professional workflows.

Scalable Intelligence Architecture

Technically, GPT-5.2 offers a 400K-token context window with a claimed near-100% recall accuracy, allowing it to process massive codebases or dense technical manuals without losing information. It emphasizes accuracy and reliability, with hallucinations reportedly reduced by 30% compared to previous iterations, but it adopts a more formal, structured conversational tone. It is optimized for enterprise environments where consistency and precision are prioritized over creative flourishes, marking a shift toward AI as a dependable knowledge worker.
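
Before sending a large codebase or manual, it can help to check whether it is likely to fit in the 400K window. The sketch below is a rough pre-flight check under two loudly stated assumptions: that English text averages about 4 characters per token (a heuristic, not the model's real tokenizer), and that the reserved output has to fit inside the same window as the input. The function names are hypothetical.

```javascript
// Limits copied from the model listing (tokens).
const CONTEXT_WINDOW = 400000;
const MAX_OUTPUT = 100000;

// Assumption: roughly 4 characters per token for English text.
const CHARS_PER_TOKEN = 4;

// Heuristic token estimate; a real tokenizer would be more accurate.
function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Checks whether the text fits once room is reserved for the response.
function checkContextFit(text, reservedOutputTokens = 8000) {
  const reserved = Math.min(reservedOutputTokens, MAX_OUTPUT);
  const inputBudget = CONTEXT_WINDOW - reserved;
  const estimated = estimateTokens(text);
  return { estimated, inputBudget, fits: estimated <= inputBudget };
}

// Example: a 1.2M-character document is estimated at 300,000 tokens.
console.log(checkContextFit('x'.repeat(1200000)).fits); // prints true
```

If `fits` is false, the document would need to be split or summarized in stages before it is sent.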

Use Cases for GPT-5.2

Discover the different ways you can use GPT-5.2 to achieve great results.

Autonomous Software Engineering

Resolving complex GitHub issues and debugging large codebases, backed by an 80% score on SWE-bench Verified.

Advanced Financial Research

Performing deep fundamental stock analysis and market trend synthesis using integrated agentic tools.

Multi-step Business Automation

Orchestrating complex workflows across connected productivity apps like Notion, Slack, and Google Drive.

Technical Document Synthesis

Processing and summarizing massive technical documents using its 400K token context window.

Scientific Math Reasoning

Solving PhD-level science and competition-level mathematics through specialized Thinking mode.

Professional Content Generation

Producing high-quality operatic-style prose and formatted professional reports at scale.

Strengths

Elite Coding Proficiency: Its 80% score on SWE-bench Verified makes it one of the most capable models for professional software engineering.
State-of-the-Art Reasoning: The specialized Thinking variant provides deep logic for competition-level mathematics and PhD-level science.
Agentic Tool Use: Highly effective at using external tools like browsers and Python environments to manage multi-step professional workflows.
Large-Scale Context Recall: Supports up to 400K tokens with near-perfect accuracy, ideal for analyzing and synthesizing massive datasets.

Limitations

Vision Latency Issues: Image perception and creation tasks are significantly slower than text-based reasoning due to high computational overhead.
Cold Conversational Tone: The model's interaction style is often described as formal and robotic, lacking the natural warmth of previous iterations.
Premium Output Pricing: At $14 per million tokens in Thinking mode, output costs remain significantly higher than older, more agile models.
Conversational Discontinuity: Its focus on organization can sometimes disrupt the flow of natural, synchronous discussions with users.

API Quick Start

openai/gpt-5.2

openai SDK
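import OpenAI from 'openai';

// The client reads the API key from the OPENAI_API_KEY environment variable.
const openai = new OpenAI();

async function main() {
  // Send one user message to the Thinking variant with high reasoning effort.
  const completion = await openai.chat.completions.create({
    model: 'gpt-5.2-thinking',
    messages: [{ role: 'user', content: 'Analyze this recursive reflection problem in WebGL 2.' }],
    reasoning_effort: 'high'
  });

  // Print the text of the first returned choice.
  console.log(completion.choices[0].message.content);
}

// Log API or network errors instead of leaving an unhandled rejection.
main().catch(console.error);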

Install the SDK and start making API calls in minutes.
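
The quick start above always uses `reasoning_effort: 'high'`. One possible way to control cost and latency is to pick the effort per task, as in the sketch below. The task names and the mapping are hypothetical, and the values `'low'` and `'medium'` are assumed to be accepted alongside the `'high'` value shown in the quick start. The function only builds the request body, which would then be passed to `openai.chat.completions.create()`.

```javascript
// Hypothetical mapping from task type to a reasoning_effort value.
// Assumes 'low' and 'medium' are valid alongside 'high'.
const EFFORT_BY_TASK = new Map([
  ['chat', 'low'],
  ['summarize', 'medium'],
  ['code', 'high'],
  ['math', 'high'],
]);

// Builds a request body; unknown task types fall back to 'medium'.
function buildRequest(task, prompt) {
  const effort = EFFORT_BY_TASK.get(task) ?? 'medium';
  return {
    model: 'gpt-5.2-thinking', // model name copied from the quick start
    messages: [{ role: 'user', content: prompt }],
    reasoning_effort: effort,
  };
}

// Example: a math task is routed to high effort.
console.log(buildRequest('math', 'Prove that the sum of two even numbers is even.').reasoning_effort); // prints high
```

Keeping the mapping in one place makes it easy to tune effort levels later without touching the call sites.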

What People Are Saying About GPT-5.2

See what the community thinks about GPT-5.2

"GPT-5.2's thinking mode is a game changer for complex coding tasks; it actually builds functional apps in one go."
AI_Dev
reddit
"It found Waldo in 13 minutes using raw pixel analysis. Terrifyingly smart but so slow for simple tasks."
VisualLearner
youtube
"The 400k context window recall is near perfect, finally a real competitor to Gemini's long-context dominance."
LogicGate
hackernews
"Creating images with GPT-5.2 still feels slower than molasses going uphill in January. Speed is its biggest enemy."
adventurepaul
reddit
"OpenAI's models focus so much on being organized now that it basically stops feeling like a conversation."
ArchMeta1868
reddit
"The ARC prize just verified a 390x efficiency improvement in one year from the o3 model to 5.2."
Fireship
x

Videos About GPT-5.2

Watch tutorials, reviews, and discussions about GPT-5.2

GPT 5.2 isn't just a better version of GPT-4. It's a completely different beast.

In some modes, you can feed it entire books, multiple research papers, and massive code bases at once.

On image-based reasoning tasks, the thinking mode achieves around 89% accuracy on really challenging benchmarks.

The reasoning effort parameter is the key to unlocking this model's true logic potential.

Wait until you see how it handles the prompt caching for recurring developer tasks.

The model correctly identified that bees enter through a single entrance rather than dispersing randomly.

The model spent 19 seconds 'thinking' to generate a functional Photoshop clone with layers and blending modes.

GPT 5.2 successfully implemented recursive ray tracing for reflecting spheres in WebGL 2.

It's the first time I've seen an AI maintain state across such a massive logic chain.

Even with complex UI layouts, the vision module never lost track of the primary CTA.

OpenAI just dropped their answer to Gemini: GPT 5.2, a model that once again moves the AI hype wheel back in favor of OpenAI.

The real flex though is its rise to the top of the ARC AGI benchmark.

The ARC prize just verified a 390x efficiency improvement in one year from the o3 model to 5.2.

If you thought previous coding agents were good, this thing is on another level of autonomy.

Ship it, just ship the model because it's solving GitHub issues while we're sleeping.

Pro Tips

Expert tips to help you get the most out of this model and achieve better results.

Enable Thinking Mode for Logic

Explicitly switch to the gpt-5.2-thinking variant when solving high-complexity math or coding problems for maximum accuracy.

Leverage Prompt Caching

Take advantage of 24-hour prompt caching to reduce latency and costs when working with large, recurring datasets.

Utilize Model Context Protocol

Connect the model to your workspace tools to enable real-world task execution like scheduling and emailing.

Step-by-Step Prompting

Ask the model to show its reasoning process to help audit decision-making during extremely long-context reasoning tasks.
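
The prompt-caching tip above can be sketched as follows. This assumes that caching matches on an identical leading portion of the prompt, so the static instructions and reference material go first and the per-request question goes last. Both function names are hypothetical, and `sharedPrefixLength` is only a local sanity check, not something the API provides.

```javascript
// Places static content first so repeated requests share the same prefix.
// Assumption: the cache matches on an identical leading portion of the prompt.
function buildCacheFriendlyMessages(staticInstructions, referenceDocs, question) {
  return [
    { role: 'system', content: staticInstructions },
    { role: 'user', content: referenceDocs.join('\n\n') },
    { role: 'user', content: question }, // only this part changes per request
  ];
}

// Counts how many leading messages two requests have in common.
function sharedPrefixLength(a, b) {
  let i = 0;
  while (
    i < a.length &&
    i < b.length &&
    a[i].role === b[i].role &&
    a[i].content === b[i].content
  ) {
    i++;
  }
  return i;
}

// Example: two questions about the same dataset share the first two messages.
const docs = ['Schema: orders(id, total)', 'Schema: users(id, name)'];
const first = buildCacheFriendlyMessages('You are a data analyst.', docs, 'Which table stores totals?');
const second = buildCacheFriendlyMessages('You are a data analyst.', docs, 'Which table stores names?');
console.log(sharedPrefixLength(first, second)); // prints 2
```

Putting a timestamp or user-specific detail at the top of the prompt would change the prefix on every call, which is why variable content is kept at the end here.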
