Z.AI releases GLM-5.2: a 753B open-weight model with 1M context and 2.9x FLOPs reduction via IndexShare

Published Jun 16, 2026

Z.AI has released GLM-5.2, its latest open-weight flagship model for long-horizon tasks. The 753B-parameter model is available under an MIT license with no regional restrictions and ships with a stable 1M-token context window, marking a significant step forward from GLM-5.1.

What's new

GLM-5.2 introduces several technical advances over its predecessors:

IndexShare architecture: A new attention mechanism that shares a single indexer across each group of four sparse attention layers, cutting per-token FLOPs by 2.9× at a 1M context length, a meaningful efficiency gain for inference at scale.
1M-token context: The model sustains long-horizon work across the full 1M window, not just in theory but in stable operation.
Multiple thinking effort levels: Configurable reasoning depth lets operators trade off between peak performance and lower latency, depending on the task.
MIT license, no regional limits: Available without geographic access restrictions, positioning it as a globally accessible open alternative to proprietary frontier models.
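
The indexer sharing described for IndexShare can be pictured with a small sketch. It is illustrative only and assumes consecutive sparse layers are grouped in fours, with the first layer of each group computing the index that the other three reuse. The function names and the eight-layer count are hypothetical and are not taken from Z.AI's code or paper. The sketch counts indexer invocations only and does not attempt to reproduce the reported 2.9× figure, which depends on how much of the total per-token compute the indexer accounts for.

```python
from math import ceil


def indexer_owner(layer: int, group_size: int = 4) -> int:
    """Return the layer whose indexer output `layer` reuses.

    Assumption: layers are grouped consecutively and the first
    layer of each group runs the indexer for the whole group.
    """
    return (layer // group_size) * group_size


def indexer_runs(num_sparse_layers: int, group_size: int = 4) -> int:
    """Number of indexer invocations per token when sharing by group."""
    return ceil(num_sparse_layers / group_size)


# Hypothetical stack of 8 sparse attention layers.
owners = [indexer_owner(i) for i in range(8)]
print(owners)           # [0, 0, 0, 0, 4, 4, 4, 4]
print(indexer_runs(8))  # 2 indexer runs instead of 8
```

Under these assumptions, the number of indexer runs per token falls from one per sparse layer to one per group of four.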

Benchmark results are competitive with frontier-class models:

AIME 2026: 99.2%
GPQA-Diamond: 91.2%
SWE-bench Pro: 62.1%
Terminal Bench 2.1: 82.7%
MCP-Atlas (agentic): 76.8%

The model is available on Hugging Face under the zai-org/GLM-5.2 namespace and on Z.AI's API platform.

Context

GLM-5.2 is the third major iteration in Z.AI's GLM-5 family. GLM-5 launched as Z.AI's initial open-weight offering; GLM-5.1 added long-horizon task capability and was benchmarked against Claude Opus 4.6 and GPT on SWE-bench Pro. GLM-5.2 extends that trajectory with the IndexShare efficiency improvement and a solidified 1M context.

The GLM series has positioned itself as China's most capable domestically developed open-weight model family, with Z.AI now publishing the underlying technical work (arXiv:2602.15763) alongside the weights release.

OpenRouter's model registry picked up the GLM-5.2 release today, making the model immediately accessible through third-party inference providers in addition to Z.AI's own platform.

Why it matters

The 2.9× FLOPs reduction via IndexShare is the headline technical contribution. Running a 753B model at 1M context is computationally expensive; an architecture that cuts per-token compute by nearly 3× at that context length changes the economic picture for inference providers and self-hosters alike.
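
To make "nearly 3×" concrete, the arithmetic can be worked through in a few lines. This is a back-of-the-envelope sketch and not a cost model: it treats the 2.9× figure as a uniform per-token reduction at 1M context, and the helper name is made up for illustration.

```python
def remaining_fraction(reduction_factor: float) -> float:
    """Fraction of baseline per-token FLOPs left after an N-fold reduction."""
    return 1.0 / reduction_factor


frac = remaining_fraction(2.9)
print(f"{frac:.1%} of baseline FLOPs remain, {1 - frac:.1%} saved")
# 34.5% of baseline FLOPs remain, 65.5% saved
```

Actual serving cost also depends on memory bandwidth, KV-cache size, and batching, none of which this calculation covers.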

The MIT license without regional restrictions is a deliberate positioning move. Several competing open models carry usage restrictions or regional carve-outs that limit enterprise adoption. Z.AI is explicitly removing those friction points.

At 62.1% on SWE-bench Pro, GLM-5.2 is competitive with leading proprietary models on software engineering tasks, a benchmark category closely tied to practical developer utility. For teams that want frontier-class coding performance from a model they can deploy locally or through their own infrastructure, GLM-5.2 is now a credible option.

Corroborating sources

Huggingface.co
https://huggingface.co/zai-org/GLM-5.2
“A solid 1M-token context that stably sustains long-horizon work”
