GLM-5.2Flagship

CurrentVerified · Jun 27, 2026

Current flagship reasoning/coding model with a usable 1M-token context, MIT open-weight (HF zai-org/GLM-5.2, ~753B params). docs.z.ai/guides/llm/glm-5.2 model id verbatim: "model": "glm-5.2" and overview: 'GLM-5.2 is a flagship model built for the era of long-horizon tasks. With truly usable 1M-toke

profile normalized against the 70-model field

Context window· 1M of 10M10%

Max output· 128K of 384K33%

Output speed—

Affordability· $4.40 / Mtok out98%

Capability breadth· 2 of 1118%

Capability switches · 2 of 11

Reasoning mode

Tool / function use

Streaming

JSON mode

Structured outputs

Prompt caching

Fine-tuning

Web search

Code execution

Vision input

Audio input

Specifications

Every value carries a primary source and a verification date.

Capacity

Context window

Parameters

753B

Max output

128K

Pricing

Input $/Mtok

$1.40 / 1M input tokens USD per 1M tokens

Cached input $/Mtok

$0.26 / Mtok USD/Mtok

Output $/Mtok

$4.40 / 1M output tokens USD per 1M tokens

Capabilities

Reasoning mode

Yes

Tool / function use

Yes

Vision input

API

API model ID

glm-5.2

General

Release date

June 24, 2026

Open weights

Yes

License

MIT

Benchmarks

Sourced evaluation scores, each verified against its primary source.

GPQA Diamond

| GPQA-Diamond | 91.2 | 86.2 | 90.0 | 93.0 | 90.1 | 93.6 | 93.6 | 94.3 |

91.2 %Verified

AIME 2026

| AIME 2026 | 99.2 | 95.3 | 97.0 | - | 94.6 | 95.7 | 98.3 | 98.2 |

99.2 %Verified

HMMT Nov. 2025

| HMMT Nov. 2025 | 94.4 | 94.0 | 95.0 | 84.4 | 94.4 | 96.5 | 96.5 | 94.8 |

94.4 %Verified

HMMT Feb. 2026

| HMMT Feb. 2026 | 92.5 | 82.6 | 97.1 | 84.4 | 95.2 | 96.7 | 96.7 | 87.3 |

92.5 %Verified

HLE (Humanity's Last Exam)

| HLE | 40.5 | 31.0 | 41.4 | 37.0 | 37.7 | 49.8* | 41.4* | 45.0 |

40.5 %Verified

HLE (w/ Tools)

| HLE w/ Tools | 54.7 | 52.3 | 53.5 | - | 48.2 | 57.9* | 52.2* | 51.4* |

54.7 %Verified

CritPt

| CritPt | 20.9 | 4.6 | 13.4 | 3.7 | 12.9 | 20.9 | 27.1 | 17.7 |

20.9 %Verified

IMOAnswerBench

| IMOAnswerBench | 91.0 | 83.8 | 90.0 | - | 89.8 | 83.5 | - | 81.0 |

91 %Verified

SWE-bench Pro

| SWE-bench Pro | 62.1 | 58.4 | 60.6 | 59.0 | 55.4 | 69.2 | 58.6 | 54.2 |

62.1 %Verified

Terminal-Bench 2.1 (Terminus-2)

| Terminal Bench 2.1 Terminus-2 | 81.0 | 63.5 | 75.0 | 65.0 | 64.0 | 85.0 | 84.0 | 74.0 |

81 %Verified

FrontierSWE (Dominance as of 26/6/16)

| FrontierSWE Dominance as of 26/6/16 | 74.4 | 30.5 | - | - | 29.0 | 75.1 | 72.6 | 39.6 |

74.4 %Verified

NL2Repo

| NL2Repo | 48.9 | 42.7 | 47.2 | 42.1 | 35.5 | 69.7 | 50.7 | 33.4 |

48.9 %Verified

DeepSWE

| DeepSWE | 46.2 | 18.0 | 18.0 | 20.0 | 8.0 | 58.0 | 70.0 | 10.0 |

46.2 %Verified

ProgramBench

| ProgramBench | 63.7 | 50.9 | - | - | 47.8 | 71.9 | 70.8 | 39.5 |

63.7 %Verified

MCP-Atlas (Public Set)

| MCP-Atlas Public Set | 76.8 | 71.8 | 76.4 | 74.2 | 73.6 | 77.8 | 75.3 | 69.2 |

76.8 %Verified

Loading…

GLM-5.2

CurrentVerified · Jun 27, 2026

Specifications

Every value carries a primary source and a verification date.

Capacity

Context window

Parameters

753B

Max output

128K

Pricing

Input $/Mtok

$1.40 / 1M input tokens USD per 1M tokens

Cached input $/Mtok

$0.26 / Mtok USD/Mtok

Output $/Mtok

$4.40 / 1M output tokens USD per 1M tokens

Capabilities

Reasoning mode

Yes

Tool / function use

Yes

Vision input

API

API model ID

glm-5.2

General

Release date

June 24, 2026

Open weights

Yes

License

MIT

Benchmarks

Sourced evaluation scores, each verified against its primary source.

GPQA Diamond

| GPQA-Diamond | 91.2 | 86.2 | 90.0 | 93.0 | 90.1 | 93.6 | 93.6 | 94.3 |

91.2 %Verified

AIME 2026

| AIME 2026 | 99.2 | 95.3 | 97.0 | - | 94.6 | 95.7 | 98.3 | 98.2 |

99.2 %Verified

HMMT Nov. 2025

| HMMT Nov. 2025 | 94.4 | 94.0 | 95.0 | 84.4 | 94.4 | 96.5 | 96.5 | 94.8 |

94.4 %Verified

HMMT Feb. 2026

| HMMT Feb. 2026 | 92.5 | 82.6 | 97.1 | 84.4 | 95.2 | 96.7 | 96.7 | 87.3 |

92.5 %Verified

HLE (Humanity's Last Exam)

| HLE | 40.5 | 31.0 | 41.4 | 37.0 | 37.7 | 49.8* | 41.4* | 45.0 |

40.5 %Verified

HLE (w/ Tools)

| HLE w/ Tools | 54.7 | 52.3 | 53.5 | - | 48.2 | 57.9* | 52.2* | 51.4* |

54.7 %Verified

CritPt

| CritPt | 20.9 | 4.6 | 13.4 | 3.7 | 12.9 | 20.9 | 27.1 | 17.7 |

20.9 %Verified

IMOAnswerBench

| IMOAnswerBench | 91.0 | 83.8 | 90.0 | - | 89.8 | 83.5 | - | 81.0 |

91 %Verified

SWE-bench Pro

| SWE-bench Pro | 62.1 | 58.4 | 60.6 | 59.0 | 55.4 | 69.2 | 58.6 | 54.2 |

62.1 %Verified

Terminal-Bench 2.1 (Terminus-2)

| Terminal Bench 2.1 Terminus-2 | 81.0 | 63.5 | 75.0 | 65.0 | 64.0 | 85.0 | 84.0 | 74.0 |

81 %Verified

FrontierSWE (Dominance as of 26/6/16)

| FrontierSWE Dominance as of 26/6/16 | 74.4 | 30.5 | - | - | 29.0 | 75.1 | 72.6 | 39.6 |

74.4 %Verified

NL2Repo

| NL2Repo | 48.9 | 42.7 | 47.2 | 42.1 | 35.5 | 69.7 | 50.7 | 33.4 |

48.9 %Verified

DeepSWE

| DeepSWE | 46.2 | 18.0 | 18.0 | 20.0 | 8.0 | 58.0 | 70.0 | 10.0 |

46.2 %Verified

ProgramBench

| ProgramBench | 63.7 | 50.9 | - | - | 47.8 | 71.9 | 70.8 | 39.5 |

63.7 %Verified

MCP-Atlas (Public Set)

| MCP-Atlas Public Set | 76.8 | 71.8 | 76.4 | 74.2 | 73.6 | 77.8 | 75.3 | 69.2 |

76.8 %Verified