zhipu

GLM-5.2

GLM-5.2 is Zhipu AI's flagship open-weight model featuring a 1M context window and specialized agentic coding capabilities under an MIT license.

Tags: Open Weights, MIT License, Coding Assistant, 1M Context, Reasoning
zhipu | GLM-5 | June 16, 2026
Context: 1.0M tokens
Max Output: 4K tokens
Input Price: $1.40 / 1M tokens
Output Price: $4.40 / 1M tokens
Modality: Text
Capabilities: Tools, Streaming, Reasoning
Benchmarks
GPQA
83%
GPQA: Graduate-Level Science Q&A. A rigorous benchmark with 448 multiple-choice questions in biology, physics, and chemistry created by domain experts. PhD experts only achieve 65-74% accuracy, while non-experts score just 34% even with unlimited web access (hence 'Google-proof'). GLM-5.2 scored 83% on this benchmark.
HLE
40%
HLE: Humanity's Last Exam. A benchmark of expert-written questions spanning many specialized academic domains. Evaluates deep understanding of complex topics that require professional-level knowledge. GLM-5.2 scored 40% on this benchmark.
MMLU
94%
MMLU: Massive Multitask Language Understanding. A comprehensive benchmark with 16,000 multiple-choice questions across 57 academic subjects including math, philosophy, law, and medicine. Tests broad knowledge and reasoning capabilities. GLM-5.2 scored 94% on this benchmark.
MMLU Pro
86%
MMLU Pro: MMLU Professional Edition. An enhanced version of MMLU with 12,032 questions using a harder 10-option multiple choice format. Covers Math, Physics, Chemistry, Law, Engineering, Economics, Health, Psychology, Business, Biology, Philosophy, and Computer Science. GLM-5.2 scored 86% on this benchmark.
IFEval
85%
IFEval: Instruction Following Evaluation. Measures how well a model follows specific instructions and constraints. Tests the ability to adhere to formatting rules, length limits, and other explicit requirements. GLM-5.2 scored 85% on this benchmark.
AIME 2025
99%
AIME 2025: American Invitational Mathematics Examination. Competition-level mathematics problems from the AIME, an exam designed for talented high school students. Tests advanced mathematical problem-solving that requires abstract reasoning, not just pattern matching. GLM-5.2 scored 99% on this benchmark.
MATH
97%
MATH: Mathematical Problem Solving. A comprehensive math benchmark testing problem-solving across algebra, geometry, calculus, and other mathematical domains. Requires multi-step reasoning and formal mathematical knowledge. GLM-5.2 scored 97% on this benchmark.
GSM8k
98%
GSM8k: Grade School Math 8K. 8,500 grade school-level math word problems requiring multi-step reasoning. Tests basic arithmetic and logical thinking through real-world scenarios like shopping or time calculations. GLM-5.2 scored 98% on this benchmark.
MGSM
91%
MGSM: Multilingual Grade School Math. The GSM8k benchmark translated into 10 languages including Spanish, French, German, Russian, Chinese, and Japanese. Tests mathematical reasoning across different languages. GLM-5.2 scored 91% on this benchmark.
SWE-Bench
62%
SWE-Bench: Software Engineering Benchmark. AI models attempt to resolve real GitHub issues in open-source Python projects with human verification. Tests practical software engineering skills on production codebases. Top models went from 4.4% in 2023 to over 70% in 2024. GLM-5.2 scored 62% on this benchmark.
HumanEval
97%
HumanEval: Python Programming Problems. 164 hand-written programming problems where models must generate correct Python function implementations. Each solution is verified against unit tests. Top models now achieve 90%+ accuracy. GLM-5.2 scored 97% on this benchmark.
LiveCodeBench
65%
LiveCodeBench: Live Coding Benchmark. Tests coding abilities on continuously updated, real-world programming challenges. Unlike static benchmarks, uses fresh problems to prevent data contamination and measure true coding skills. GLM-5.2 scored 65% on this benchmark.
Terminal-Bench
81%
Terminal-Bench: Terminal/CLI Tasks. Tests the ability to perform command-line operations, write shell scripts, and navigate terminal environments. Measures practical system administration and development workflow skills. GLM-5.2 scored 81% on this benchmark.
ARC-AGI
14%
ARC-AGI: Abstraction and Reasoning Corpus for AGI. Tests fluid intelligence through novel pattern-recognition puzzles. Each task requires discovering the underlying rule from a few examples, so it measures general reasoning ability rather than memorization. GLM-5.2 scored 14% on this benchmark.

About GLM-5.2

Learn about GLM-5.2's capabilities, features, and how it can help you achieve better results.

Mixture of Experts Architecture

GLM-5.2 is a Mixture of Experts (MoE) flagship model designed for long-horizon tasks and autonomous agentic workflows. It has 753 billion total parameters, of which approximately 40 billion are active per token. This design is a significant efficiency gain for the GLM series: because only about 5% of the weights run for any given token, per-token compute stays far below what the total parameter count suggests, while performance on complex logical tasks is maintained.
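The ratio between total and active parameters can be checked with a few lines of arithmetic. This is a minimal sketch using only the two figures quoted above (the active count is stated as approximate); the function name `activeFraction` is a hypothetical helper, not part of any SDK.

```typescript
// Figures quoted in the text above; the active count is approximate.
const TOTAL_PARAMS_B = 753; // total parameters, in billions
const ACTIVE_PARAMS_B = 40; // parameters used per token, in billions

// Share of the weights that participate in a single token's forward pass.
function activeFraction(totalB: number, activeB: number): number {
  if (totalB <= 0 || activeB < 0 || activeB > totalB) {
    throw new RangeError('expected totalB > 0 and 0 <= activeB <= totalB');
  }
  return activeB / totalB;
}

const fraction = activeFraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B);
console.log(`${(fraction * 100).toFixed(1)}% of parameters active per token`);
```

Note that the full 753B weights still have to be stored and loaded; the saving applies to compute per token, not to memory.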

IndexShare Efficiency

The model introduces IndexShare, an architectural change that reuses indexers across sparse attention layers instead of running a separate one for each layer. This cuts per-token floating-point operations by a factor of 2.9 at the full 1 million token context length. The saving is what makes the context window practical for large-scale projects rather than just a theoretical limit.
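The page does not describe how IndexShare works internally, so the following is only a toy illustration of the general idea of sharing a sparse-attention selection between layers. It assumes an indexer that scores past tokens and keeps the top k positions, and a hypothetical scheme in which each group of `groupSize` consecutive layers reuses the selection made by the first layer in the group. The names `topKIndices` and `selectWithSharing` and the grouping rule are assumptions for illustration; the 2.9x figure above is not derived from this sketch.

```typescript
// Toy indexer: return the positions of the k highest-scoring past tokens,
// sorted by position.
function topKIndices(scores: number[], k: number): number[] {
  return scores
    .map((score, position) => ({ score, position }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((entry) => entry.position)
    .sort((a, b) => a - b);
}

// Hypothetical sharing scheme: only the first layer of each group runs the
// indexer; the remaining layers in the group reuse its selection.
function selectWithSharing(
  scoresPerLayer: number[][],
  k: number,
  groupSize: number,
): { selections: number[][]; indexerRuns: number } {
  const selections: number[][] = [];
  let indexerRuns = 0;
  let current: number[] = [];
  scoresPerLayer.forEach((scores, layer) => {
    if (layer % groupSize === 0) {
      current = topKIndices(scores, k);
      indexerRuns += 1;
    }
    selections.push(current);
  });
  return { selections, indexerRuns };
}

const result = selectWithSharing(
  [
    [0.1, 0.9, 0.3, 0.8], // layer 0: runs the indexer
    [0.9, 0.1, 0.2, 0.3], // layer 1: reuses layer 0's selection
    [0.5, 0.4, 0.9, 0.1], // layer 2: runs the indexer
    [0.2, 0.7, 0.1, 0.6], // layer 3: reuses layer 2's selection
  ],
  2,
  2,
);
console.log(result.indexerRuns, JSON.stringify(result.selections));
```

In this toy setup four layers need only two indexer passes; the trade-off is that the reusing layers attend to positions chosen from another layer's scores.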

Specialized Agentic Training

What distinguishes GLM-5.2 from alternatives is its focus on long-horizon coding trajectories. It was specifically trained on complex debugging and implementation tasks that span entire codebases. Developers can toggle between High and Max thinking effort levels, which lets the model spend more compute on internal reasoning for systems optimization and advanced mathematical problem solving.
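Choosing between the two effort levels can be wrapped in a small request builder. This sketch reuses the `thinking` and `reasoning_effort` fields and the `'max'` value from the quick-start snippet on this page; the `'high'` string is an assumed spelling of the High level, and `buildRequest` and `pickEffort` are hypothetical helpers.

```typescript
// 'max' appears in the page's quick-start snippet; 'high' is an assumed value.
type Effort = 'high' | 'max';

interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface GlmChatRequest {
  model: string;
  messages: ChatMessage[];
  thinking: { type: 'enabled' };
  reasoning_effort: Effort;
}

// Hypothetical policy: reserve Max for tasks flagged as hard, because it
// spends more tokens and time on internal reasoning.
function pickEffort(isHardTask: boolean): Effort {
  return isHardTask ? 'max' : 'high';
}

// Build a request body with thinking enabled at the chosen effort level.
function buildRequest(prompt: string, effort: Effort): GlmChatRequest {
  return {
    model: 'glm-5.2',
    messages: [{ role: 'user', content: prompt }],
    thinking: { type: 'enabled' },
    reasoning_effort: effort,
  };
}

const request = buildRequest('Find the race condition in this scheduler.', pickEffort(true));
console.log(JSON.stringify(request, null, 2));
```

Keeping the policy in one function makes it easy to default to the cheaper level and escalate only when a task needs it.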

Use Cases

Discover the different ways you can use GLM-5.2 to achieve great results.

Agentic Software Engineering

Deploy the model within autonomous frameworks to handle development tasks from requirements gathering to final deployment.

Large Scale Code Refactoring

Analyze and rewrite multi-file software projects by loading the entire codebase into the 1M token context window.
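Before loading a codebase into the context window it helps to check that it fits. The sketch below is a rough budget check, assuming about four characters per token; that ratio is a common rule of thumb and not the GLM tokenizer, and `packFiles` and `estimateTokens` are hypothetical helpers.

```typescript
interface SourceFile {
  path: string;
  content: string;
}

const CONTEXT_BUDGET_TOKENS = 1000000; // context size quoted on this page
const CHARS_PER_TOKEN = 4; // rough heuristic, NOT the real tokenizer

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Greedily include files in the given order until the budget is used up.
// `reservedTokens` leaves room for the instructions and the model's reply.
function packFiles(files: SourceFile[], budget: number, reservedTokens = 0) {
  const included: string[] = [];
  const skipped: string[] = [];
  let usedTokens = 0;
  const limit = budget - reservedTokens;
  for (const file of files) {
    // Each file is prefixed with a path comment so the model can cite it.
    const cost = estimateTokens(`// ${file.path}\n${file.content}\n`);
    if (usedTokens + cost <= limit) {
      included.push(file.path);
      usedTokens += cost;
    } else {
      skipped.push(file.path);
    }
  }
  return { included, skipped, usedTokens };
}

const demo = packFiles(
  [
    { path: 'a.ts', content: 'x'.repeat(40) },
    { path: 'b.ts', content: 'y'.repeat(400) },
  ],
  50,
);
console.log(demo);
```

For a real project the same call would use `CONTEXT_BUDGET_TOKENS` as the budget, with files ordered so the most relevant ones are packed first.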

Automated Document Review

Process massive legal or technical documentation sets to identify inconsistencies or extract structured data with high reasoning accuracy.

3D Scene Generation

Utilize the specialized strength in WebGL and HTML5 to generate complex interactive 3D visualizations from text prompts.

Business Logic Automation

Plug the model into agent operating systems to manage shared memory and execute scheduled multi-hour workflows without oversight.

Local Privacy First Development

Run the open weight model on private hardware clusters to ensure full data sovereignty for sensitive corporate engineering projects.

Strengths

Exceptional Coding Intelligence: The model ranks #3 on FrontierSWE with a 74.4% score, proving its capability for multi-hour engineering projects.
Disruptive Price/Performance: At $1.40/$4.40 per million tokens, it offers frontier level intelligence at roughly 1/6th the cost of proprietary competitors.
Truly Usable 1M Context: It is optimized for long horizon messy coding trajectories where previous models often failed to maintain coherence.
Full Sovereignty and Privacy: The MIT licensed open weights allow developers to run the model locally, avoiding external API risks and data leaks.

Limitations

High Token Verbosity: The model tends to generate roughly 2 times more tokens than its predecessor to achieve results, increasing latency.
Massive Hardware Requirements: With a 753B parameter footprint, local deployment is out of reach for most individual developers without significant quantization.
Slower Wall-Clock Response: Response times can be up to 3 times longer than Western models due to the extended internal reasoning cycles.
Design Creativity Plateaus: While technically proficient in frontend coding, it can be less creative in aesthetic design than Claude Opus.
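The pricing and verbosity points interact: output tokens cost more than input tokens, so doubling the output raises the bill even when the prompt is unchanged. A minimal cost estimate using the listed prices follows; the token counts in the example are illustrative and `estimateCostUsd` is a hypothetical helper.

```typescript
// Prices listed on this page, in USD per one million tokens.
const INPUT_USD_PER_MILLION = 1.4;
const OUTPUT_USD_PER_MILLION = 4.4;

// Estimated cost of one request given its input and output token counts.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1e6) * INPUT_USD_PER_MILLION +
    (outputTokens / 1e6) * OUTPUT_USD_PER_MILLION
  );
}

// Illustrative request: a 200K-token prompt, with the output doubled.
const baselineCost = estimateCostUsd(200000, 15000);
const verboseCost = estimateCostUsd(200000, 30000);
console.log(baselineCost.toFixed(3), verboseCost.toFixed(3));
```

With these example numbers the request goes from about $0.35 to about $0.41, so for long prompts the input side still dominates the cost while the extra output mostly shows up as latency.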

API Quick Start

zhipu/glm-5.2

zhipu SDK
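import OpenAI from 'openai';

// The standard OpenAI client is pointed at the Z.ai endpoint via baseURL.
const client = new OpenAI({
  apiKey: 'YOUR_Z_AI_API_KEY',
  baseURL: 'https://api.z.ai/api/paas/v4/',
});

async function main() {
  const completion = await client.chat.completions.create({
    model: 'glm-5.2',
    messages: [{ role: 'user', content: 'Design a WebGL 3D city scene.' }],
    // The two parameters below are Z.ai-specific and may not match the OpenAI
    // SDK types. @ts-ignore only covers the next line, so each gets its own.
    // @ts-ignore
    thinking: { type: 'enabled' },
    // @ts-ignore
    reasoning_effort: 'max',
  });

  // Print the final answer text.
  console.log(completion.choices[0].message.content);
}

// Report network or authentication failures instead of leaving an
// unhandled promise rejection.
main().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});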

Install the SDK and start making API calls in minutes.
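Streaming is listed among the model's capabilities. The sketch below shows how streamed text could be assembled, assuming the endpoint emits chunks in the OpenAI-compatible shape (`choices[0].delta.content`); that shape and the helper `appendDelta` are assumptions, and the chunks here are hand-written so the example runs without a network call.

```typescript
// Minimal slice of an OpenAI-style streaming chunk (assumed shape).
interface StreamChunk {
  choices: Array<{ delta: { content?: string | null } }>;
}

// Append the text carried by one chunk; chunks without text add nothing.
function appendDelta(soFar: string, chunk: StreamChunk): string {
  return soFar + (chunk.choices[0]?.delta?.content ?? '');
}

// Hand-written chunks standing in for a live stream.
const sampleChunks: StreamChunk[] = [
  { choices: [{ delta: { content: 'Hello' } }] },
  { choices: [{ delta: {} }] },
  { choices: [{ delta: { content: ', world' } }] },
  { choices: [] },
];

const streamedText = sampleChunks.reduce(appendDelta, '');
console.log(streamedText);
```

With the OpenAI SDK, passing `stream: true` to `client.chat.completions.create` returns an async iterable, so the same reducer can be applied inside a `for await (const chunk of stream)` loop.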

Community Feedback

See what the community thinks about GLM-5.2

"I've been saying for months that open source AI models are 6 months behind frontier. They caught up. GLM 5.2 is as good as Opus 4.8." (Alex Finn, twitter)

"The jump between 5.1 and 5.2 is pretty huge... it really likes long chains of thought here and is beating out proprietary models." (Sam Witteveen, youtube)

"The 2-bit model retains ~82% accuracy after we shrunk it from 1.51TB to 238GB. GLM-5.2 is the strongest open model to date." (Unsloth AI, twitter)

"It leads open-weight models and has claimed the top spot on Design Arena, surpassing the now-unavailable Claude Fable 5." (Brian Roemmele, twitter)

"The 1 million token context window is lossless, which is impressive for an open weight model." (DevGuru, reddit)

"Benchmark numbers are one thing, but in actual agent workflows, it feels very robust." (TechInnovator, hackernews)

Related Videos

Watch tutorials, reviews, and discussions about GLM-5.2

I really don't see the point in using models like Sonnet or Gemini Flash if this thing can replace it for much cheaper.

It’s clearly targeted at developers who need local control over their reasoning engines.

It's the first open-weight model to get over 80 in Terminal Bench and is up there with GPT 5.5.

You went from 15,000 tokens to 30,000. This is token abuse... you're going to be waiting twice as long.

Local testing shows it handles complex file structures better than DeepSeek v4.

The reasoning effort Max really pushes the hardware, but the logic is sound.

MIT license means you can use this for basically anything without worrying about terms.

I've seen some crazy benchmarks scoring higher than Fable on design bench and it's getting buzz.

I asked GLM 5.2 to redesign this app... no failed edits. Really quite clean to be honest.

The frontend capabilities are a major standout for this version.

It feels more like a tool for building other tools than just a chatbot.

The ability to inspect thinking tokens is a developer's dream for debugging logic.

Pro Tips

Expert tips to help you get the most out of GLM-5.2 and achieve better results.

Enable Max Reasoning for Logic

Activate the Max reasoning effort for complex coding or math tasks where accuracy is more critical than generation speed.

Load Entire Projects

Use the 1M context window to provide the model with entire project documentation and style guides to ensure consistent code output.

Optimize with Quantization

Utilize FP8 or 2-bit quantization for local deployments to fit the massive 753B parameter footprint onto high end hardware.
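A quick estimate shows why quantization matters at this scale. The sketch below computes weight storage only, assuming every parameter is stored at the same bit width; it ignores the KV cache, activations, and runtime overhead, and `weightBytes` is a hypothetical helper.

```typescript
const TOTAL_PARAMS = 753e9; // parameter count quoted on this page

// Bytes needed to store the weights at a uniform bit width.
function weightBytes(params: number, bitsPerWeight: number): number {
  return (params * bitsPerWeight) / 8;
}

// Decimal gigabytes (1 GB = 1e9 bytes).
function toGB(bytes: number): number {
  return bytes / 1e9;
}

const bf16GB = toGB(weightBytes(TOTAL_PARAMS, 16));
const fp8GB = toGB(weightBytes(TOTAL_PARAMS, 8));
const twoBitGB = toGB(weightBytes(TOTAL_PARAMS, 2));
console.log({ bf16GB, fp8GB, twoBitGB });
```

The 16-bit result of about 1.5 TB lines up with the 1.51TB figure quoted in the community feedback on this page. A uniform 2-bit estimate comes out lower than the 238GB quoted there, presumably because practical low-bit quantizations keep some layers at higher precision.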

Inspect Thinking Tokens

Leverage native support for thinking tokens to inspect internal logic before the final answer to catch potential errors early.
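Inspecting the reasoning separately from the answer could look like the sketch below. It assumes the response message carries the thinking text in a `reasoning_content` field next to `content`; that field name is an assumption and should be checked against the API documentation. `splitReasoning` is a hypothetical helper.

```typescript
// Assumed message shape: `reasoning_content` is NOT confirmed by this page.
interface AssistantMessage {
  content: string | null;
  reasoning_content?: string | null;
}

// Separate the model's reasoning from its final answer so each can be
// logged or reviewed independently.
function splitReasoning(message: AssistantMessage): { reasoning: string; answer: string } {
  return {
    reasoning: message.reasoning_content ?? '',
    answer: message.content ?? '',
  };
}

const parsed = splitReasoning({
  content: 'Use a mutex around the counter.',
  reasoning_content: 'Two workers increment the counter without synchronization...',
});
console.log('Reasoning:', parsed.reasoning);
console.log('Answer:', parsed.answer);
```

Logging the two parts separately makes it easier to spot a sound answer reached through flawed reasoning, or the reverse.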

Related AI Models

Qwen3.5-Omni (Alibaba)
Qwen3.5-Omni is a natively omnimodal AI by Alibaba Cloud, offering seamless audio-visual reasoning, real-time voice chat, and 256k context for low-latency apps.
256K context, $0.40/$4.80 per 1M tokens (input/output)

GPT-5.4 (OpenAI)
GPT-5.4 is OpenAI's frontier model featuring a 1.05M context window and Extreme Reasoning. It excels at autonomous UI interaction and long-form data analysis.
1M context, $2.50/$15.00 per 1M tokens (input/output)

Kimi K2 Thinking (Moonshot)
Kimi K2 Thinking is Moonshot AI's trillion-parameter reasoning model. It outperforms GPT-5 on HLE and supports 300 sequential tool calls autonomously for...
256K context, $0.60/$2.50 per 1M tokens (input/output)

GPT-5.3 Codex (OpenAI)
GPT-5.3 Codex is OpenAI's 2026 frontier coding agent, featuring a 400K context window, 77.3% Terminal-Bench score, and superior logic for complex software...
400K context, $1.75/$14.00 per 1M tokens (input/output)

GPT-5.2 (OpenAI)
GPT-5.2 is OpenAI's flagship model for professional tasks, featuring a 400K context window, elite coding, and deep multi-step reasoning capabilities.
400K context, $1.75/$14.00 per 1M tokens (input/output)

Qwen3.6-Max-Preview (Alibaba)
Qwen3.6-Max-Preview is Alibaba's flagship MoE model featuring 1M context, a native thinking mode, and SOTA scores in agentic coding and reasoning.
1M context, $1.25/$10.00 per 1M tokens (input/output)

GLM-5 (Zhipu)
GLM-5 is Zhipu AI's 744B parameter open-weight powerhouse, excelling in long-horizon agentic tasks, coding, and factual accuracy with a 200k context window.
200K context, $1.00/$3.20 per 1M tokens (input/output)

GLM-5.1 (Zhipu)
GLM-5.1 is Zhipu AI's flagship reasoning model, featuring a 202K context window and an autonomous 8-hour execution loop for complex agentic engineering.
203K context, $1.40/$4.40 per 1M tokens (input/output)

Frequently Asked Questions

Find answers to common questions about GLM-5.2