
GLM-5

GLM-5 is Zhipu AI's 744B parameter open-weight powerhouse, excelling in long-horizon agentic tasks, coding, and factual accuracy with a 200k context window.

Open Weights · Agentic Engineering · MoE · Zhipu AI · Coding AI
Zhipu · GLM · February 11, 2026
Context: 200K tokens
Max Output: 128K tokens
Input Price: $1.00 / 1M tokens
Output Price: $3.20 / 1M tokens
Modality: Text
Capabilities: Tools, Streaming, Reasoning
Benchmarks
GPQA
86%
GPQA: Graduate-Level Science Q&A. A rigorous benchmark with 448 multiple-choice questions in biology, physics, and chemistry created by domain experts. PhD experts only achieve 65-74% accuracy, while non-experts score just 34% even with unlimited web access (hence 'Google-proof'). GLM-5 scored 86% on this benchmark.
HLE
30.5%
HLE: Humanity's Last Exam. A frontier benchmark of expert-written questions spanning dozens of specialized academic domains, designed to remain difficult as models saturate older tests. It requires professional-level knowledge and deep reasoning. GLM-5 scored 30.5% on this benchmark.
MMLU
88%
MMLU: Massive Multitask Language Understanding. A comprehensive benchmark with 16,000 multiple-choice questions across 57 academic subjects including math, philosophy, law, and medicine. Tests broad knowledge and reasoning capabilities. GLM-5 scored 88% on this benchmark.
AIME 2025
84%
AIME 2025: American Invitational Math Exam. Competition-level mathematics problems from the prestigious AIME exam designed for talented high school students. Tests advanced mathematical problem-solving requiring abstract reasoning, not just pattern matching. GLM-5 scored 84% on this benchmark.
MATH
97.4%
MATH: Mathematical Problem Solving. A comprehensive math benchmark testing problem-solving across algebra, geometry, calculus, and other mathematical domains. Requires multi-step reasoning and formal mathematical knowledge. GLM-5 scored 97.4% on this benchmark.
SWE-Bench
77.8%
SWE-Bench: Software Engineering Benchmark. AI models attempt to resolve real GitHub issues in open-source Python projects with human verification. Tests practical software engineering skills on production codebases. Top models went from 4.4% in 2023 to over 70% in 2024. GLM-5 scored 77.8% on this benchmark.
HumanEval
97.0%
HumanEval: Python Programming Problems. 164 hand-written programming problems where models must generate correct Python function implementations. Each solution is verified against unit tests. Top models now achieve 90%+ accuracy. GLM-5 scored 97.0% on this benchmark.
Terminal-Bench
56.2%
Terminal-Bench: Terminal/CLI Tasks. Tests the ability to perform command-line operations, write shell scripts, and navigate terminal environments. Measures practical system administration and development workflow skills. GLM-5 scored 56.2% on this benchmark.

About GLM-5

Learn about GLM-5's capabilities, features, and how it can help you achieve better results.

GLM-5 is Zhipu AI's flagship foundation model designed for autonomous agentic workflows and complex systems engineering. It utilizes a massive 744 billion parameter Mixture-of-Experts (MoE) architecture, with 40 billion parameters active during inference to balance performance and speed. The model is the first open-weight system to demonstrate parity with proprietary frontier models in software engineering tasks, scoring 77.8% on SWE-bench Verified.

The model was trained on 28.5 trillion tokens using a domestic cluster of 100,000 Huawei Ascend chips. It integrates specialized mechanisms like Multi-head Latent Attention (MLA) and DeepSeek Sparse Attention (DSA) to maintain logical consistency across its 200,000 token context window. This technical stack allows GLM-5 to handle long-horizon planning and resource management without the high latency typical of dense models of this size.

Zhipu AI released GLM-5 under the MIT license, enabling enterprise users to deploy the weights locally for sensitive data processing. With an input cost of just $1.00 per million tokens, it offers a 6x price advantage over rival models like Claude 4.5. The model includes a dedicated Thinking Mode that reduces hallucination rates significantly compared to its predecessors.
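The Thinking Mode mentioned above is toggled per request through the OpenAI-compatible API. A minimal sketch in TypeScript; the `thinking` field name and shape here are an assumption modeled on Zhipu's earlier GLM APIs, so verify against the current documentation:

```typescript
// Builds a chat-completion payload with reasoning enabled.
// NOTE: the `thinking` field is an assumption based on Zhipu's
// GLM-4.x API; confirm the exact name in the official docs.
function buildThinkingRequest(prompt: string) {
  return {
    model: "glm-5",
    messages: [{ role: "user" as const, content: prompt }],
    thinking: { type: "enabled" as const },
  };
}

// Usage with an OpenAI-compatible client (see the API Quick Start):
// const res = await client.chat.completions.create(buildThinkingRequest("..."));
```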

GLM-5

Use Cases

Discover the different ways you can use GLM-5 to achieve great results.

Autonomous Software Engineering

Solving complex GitHub issues and performing repo-wide refactors, backed by its 77.8% score on SWE-bench Verified.

Enterprise Tool Orchestration

Executing multi-step agentic workflows across internal APIs to handle back-office automation in finance and legal sectors.

Long-Context Repository Analysis

Using the 200,000 token window to ingest and analyze entire documentation sets or multi-file codebases in a single pass.
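One way to exploit the window is to pack the files of interest into a single prompt with clear path headers. A hedged sketch; the 4-characters-per-token estimate is a rough heuristic, not Zhipu's actual tokenizer:

```typescript
// Packs several source files into one long-context prompt.
interface RepoFile {
  path: string;
  content: string;
}

const CONTEXT_BUDGET_TOKENS = 200_000; // GLM-5's advertised window

function buildRepoPrompt(files: RepoFile[], question: string): string {
  const body = files
    .map((f) => `### File: ${f.path}\n${f.content}`)
    .join("\n\n");
  const prompt = `${body}\n\nQuestion: ${question}`;
  // Rough size check: ~4 characters per token for English/source text.
  const approxTokens = Math.ceil(prompt.length / 4);
  if (approxTokens > CONTEXT_BUDGET_TOKENS) {
    throw new Error(`Prompt is ~${approxTokens} tokens; exceeds the 200K window`);
  }
  return prompt;
}
```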

Personal AI Coworkers

Powering open-source agents like OpenClaw to manage emails, calendars, and background tasks 24/7 with high reliability.

On-Premise Private Intelligence

Deploying the open-weight model locally under its MIT license to ensure full data privacy for sensitive corporate operations.

Cost-Efficient Agent Scaling

Running high-volume agentic sessions at 6-8x lower costs compared to proprietary frontier models without sacrificing reasoning depth.
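The 6-8x figure follows from simple arithmetic on the list prices shown on this page ($1.00/$3.20 per 1M tokens for GLM-5 versus $5.00/$25.00 for Claude Opus 4.5 in the Related Models section); a worked example:

```typescript
// Monthly API cost for a given traffic mix, in USD.
function monthlyCostUSD(
  inputM: number,   // millions of input tokens per month
  outputM: number,  // millions of output tokens per month
  inPrice: number,  // $ per 1M input tokens
  outPrice: number, // $ per 1M output tokens
): number {
  return inputM * inPrice + outputM * outPrice;
}

const glm5 = monthlyCostUSD(500, 100, 1.0, 3.2);  // 500 + 320 = $820
const opus = monthlyCostUSD(500, 100, 5.0, 25.0); // 2500 + 2500 = $5,000
// At this input/output mix the gap is ~6.1x, consistent with the 6-8x claim.
```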

Strengths

Limitations

Elite Coding Performance: Achieves a 77.8% score on SWE-bench Verified, matching proprietary giants like Claude Opus for autonomous software engineering.
No Native Vision: The model cannot process images or other visual inputs directly, which limits its use in modern multimodal UI/UX workflows.
6x Price Advantage: Offers frontier-level reasoning at just $1.00 per 1M input tokens, making high-scale agentic deployments economically viable.
Terminal Task Lag: Performance on Terminal-Bench 2.0 sits at 56.2%, trailing slightly behind the absolute top-tier proprietary competitors.
MIT Licensed Weights: Full open-weight availability on Hugging Face allows for private local deployment on Huawei Ascend or NVIDIA hardware.
Hallucination Frequency: Early benchmarks show hallucination rates near 30% for specific complex reasoning tasks compared to lower rates in top rivals.
Massive Context Capacity: The 200K token window coupled with 128K output tokens is ideal for repository-wide analysis and long-form generations.
Hardware Variances: Training on Huawei Ascend hardware may lead to minor performance variances when deployed on standard NVIDIA-only software stacks.

API Quick Start

zai/glm-5

View Documentation
zhipu SDK (OpenAI-compatible)
import OpenAI from "openai";

// Zhipu's endpoint is OpenAI-compatible, so the standard SDK works.
const client = new OpenAI({
  apiKey: process.env.ZHIPU_API_KEY,
  baseURL: "https://open.bigmodel.cn/api/paas/v4/",
});

const response = await client.chat.completions.create({
  model: "glm-5",
  messages: [{ role: "user", content: "Analyze this repo structure and refactor to GraphQL." }],
  stream: true, // stream tokens as they are generated
});

// Print each streamed chunk as it arrives.
for await (const chunk of response) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}

Install the SDK and start making API calls in minutes.

Community Feedback

See what the community thinks about GLM-5

GLM-5 is an open-source 744B parameter model that performs near Claude Opus level on coding... but the price difference matters.
Odd-Coconut-2067
reddit
The 200,000 token window changes your workflow: Analyze 20+ files for a single refactor or review complex PR diffs in one pass.
AskCodi
reddit
I went from spending ~$90/month on Claude API calls to under $15 with GLM-5 and didn't notice a meaningful drop in quality.
IulianHI
reddit
Its hallucination rate is in the 30% range versus I don't know Gemini 3 Pro at 88%.
Sid
youtube
GLM-5 dropped before I could finish testing 4.7, and the reasoning jump is actually noticeable in everyday coding.
able_wong
twitter
Zhipu releasing this under MIT is a massive move for the local LLM community.
dev_tester
twitter

Related Videos

Watch tutorials, reviews, and discussions about GLM-5

It's neck and neck with models like 5.2 Codex and Opus 4.5.

It is the first open-weight model I've successfully run an hour-plus job on without issues.


The reasoning density is significantly higher than GLM-4.

It basically replaces Claude 3.5 Sonnet for my internal coding tasks.

They almost doubled the number of parameters... all the way up to 744B.

Even though it's a lot larger, it runs pretty much as fast as, if not faster than, the older model.


The sparse attention mechanism keeps memory usage low for such a big model.

Open-weight availability makes this the new champion for local hosting.

They created their own RL engine called Slime.

A 200,000 context window changes what enterprise AI even means.

It hits 77.8 on SWE-bench verified, beating Gemini 3 Pro at 76.2.

Zhipu AI is proving domestic hardware can train world-class models.

Agentic engineering is the key focus here, not just simple chat.

More than just prompts

Supercharge your workflow with AI Automation

Automatio combines the power of AI agents, web automation, and smart integrations to help you accomplish more in less time.

AI Agents
Web Automation
Smart Workflows

Pro Tips

Expert tips to help you get the most out of GLM-5 and achieve better results.

Activate Agentic Mode

Define multi-step plans in your prompts as GLM-5 is optimized for autonomous engineering rather than simple chat responses.
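One way to apply this tip is to seed the request with an explicit numbered plan. A sketch; the plan format is illustrative, not a Zhipu requirement:

```typescript
// Builds a messages array that frames GLM-5 as an agent with a plan.
function buildAgentMessages(goal: string, steps: string[]) {
  const plan = steps.map((s, i) => `${i + 1}. ${s}`).join("\n");
  return [
    {
      role: "system" as const,
      content:
        "You are an autonomous engineering agent. Follow the plan step by step and report progress after each step.",
    },
    { role: "user" as const, content: `Goal: ${goal}\n\nPlan:\n${plan}` },
  ];
}
```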

Local Hardware Allocation

Ensure significant VRAM or native Huawei Ascend hardware with the MindSpore framework is available for optimal throughput.

Implement Fallback Chains

Configure GLM-5 as your primary reasoning model with GLM-4.7-Flash as a cost-effective fallback for simpler instructions.
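A fallback chain can be written generically over any completion function; a minimal sketch (the `glm-4.7-flash` model id is taken from the tip above and should be verified against the docs):

```typescript
// Any function that completes a prompt against a named model.
type Completer = (model: string, prompt: string) => Promise<string>;

// Tries each model in order, returning the first success.
async function completeWithFallback(
  complete: Completer,
  prompt: string,
  models: string[] = ["glm-5", "glm-4.7-flash"],
): Promise<string> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await complete(model, prompt);
    } catch (err) {
      lastError = err; // primary failed; try the next, cheaper model
    }
  }
  throw lastError;
}
```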

Use Structured Output

GLM-5 excels at generating precise .docx and .xlsx formats when given clear schema requirements for deliverables.
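For JSON deliverables (which can then be converted to .docx or .xlsx downstream), the OpenAI-compatible `response_format` field with a JSON schema is the usual mechanism; whether GLM-5 honors `json_schema` exactly as OpenAI does is an assumption to verify against Zhipu's docs:

```typescript
// Builds a request that constrains output to a simple report-row schema.
// The schema itself is illustrative.
function buildStructuredRequest(prompt: string) {
  return {
    model: "glm-5",
    messages: [{ role: "user" as const, content: prompt }],
    response_format: {
      type: "json_schema" as const,
      json_schema: {
        name: "report_row",
        schema: {
          type: "object",
          properties: {
            title: { type: "string" },
            amount: { type: "number" },
          },
          required: ["title", "amount"],
        },
      },
    },
  };
}
```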

Testimonials

What Our Users Say

Join thousands of satisfied users who have transformed their workflow

Jonathan Kogan

Co-Founder/CEO, rpatools.io

Automatio is one of the most used for RPA Tools both internally and externally. It saves us countless hours of work and we realized this could do the same for other startups and so we choose Automatio for most of our automation needs.

Mohammed Ibrahim

CEO, qannas.pro

I have used many tools over the past 5 years, Automatio is the Jack of All trades.. !! it could be your scraping bot in the morning and then it becomes your VA by the noon and in the evening it does your automations.. its amazing!

Ben Bressington

CTO, AiChatSolutions

Automatio is fantastic and simple to use to extract data from any website. This allowed me to replace a developer and do tasks myself as they only take a few minutes to setup and forget about it. Automatio is a game changer!

Sarah Chen

Head of Growth, ScaleUp Labs

We've tried dozens of automation tools, but Automatio stands out for its flexibility and ease of use. Our team productivity increased by 40% within the first month of adoption.

David Park

Founder, DataDriven.io

The AI-powered features in Automatio are incredible. It understands context and adapts to changes in websites automatically. No more broken scrapers!

Emily Rodriguez

Marketing Director, GrowthMetrics

Automatio transformed our lead generation process. What used to take our team days now happens automatically in minutes. The ROI is incredible.


Related AI Models


GPT-5.2

OpenAI

GPT-5.2 is OpenAI's flagship model for professional tasks, featuring a 400K context window, elite coding, and deep multi-step reasoning capabilities.

400K context
$1.75/$14.00/1M

Gemini 3.1 Flash-Lite

Google

Gemini 3.1 Flash-Lite is Google's fastest, most cost-efficient model. Features 1M context, native multimodality, and 363 tokens/sec speed for scale.

1M context
$0.25/$1.50/1M

Claude Opus 4.5

Anthropic

Claude Opus 4.5 is Anthropic's most powerful frontier model, delivering record-breaking 80.9% SWE-bench performance and advanced autonomous agency for coding.

200K context
$5.00/$25.00/1M

Kimi K2 Thinking

Moonshot

Kimi K2 Thinking is Moonshot AI's trillion-parameter reasoning model. It outperforms GPT-5 on HLE and supports 300 sequential tool calls autonomously for...

256K context
$0.60/$2.50/1M

Grok-4

xAI

Grok-4 by xAI is a frontier model featuring a 2M token context window, real-time X platform integration, and world-record reasoning capabilities.

2M context
$3.00/$15.00/1M

Kimi K2.5

Moonshot

Discover Moonshot AI's Kimi K2.5, a 1T-parameter open-source agentic model featuring native multimodal capabilities, a 262K context window, and SOTA reasoning.

256K context
$0.60/$3.00/1M

GPT-5.4

OpenAI

GPT-5.4 is OpenAI's frontier model featuring a 1.05M context window and Extreme Reasoning. It excels at autonomous UI interaction and long-form data analysis.

1M context
$2.50/$15.00/1M

GPT-5.1

OpenAI

GPT-5.1 is OpenAI’s advanced reasoning flagship featuring adaptive thinking, native multimodality, and state-of-the-art performance in math and technical...

400K context
$1.25/$10.00/1M

Frequently Asked Questions

Find answers to common questions about GLM-5