Z.AI’s new GLM-5.2 is here, and it’s turning heads for performance, price, and the chips it didn’t use.

What happened
- On June 16, Beijing-based Z.AI released GLM-5.2, a major upgrade over GLM-5.1. The announcement, coming after the recent U.S. ban on Anthropic’s Fable family, helped send Z.AI’s stock up roughly 90% to a fresh all-time high.
- Z.AI has been on the U.S. Entity List since January 2025, which makes GLM-5.2’s arrival politically and commercially notable as an example of an alternative AI supply chain.

How it stacks up
- FrontierSWE (dominance rate, large-scale technical projects): GLM-5.2 scored 74.4, against 75.1 for Claude Opus 4.8 and 72.6 for GPT-5.5.
- SWE-bench Pro (real-world GitHub issue resolution, pass rate): GLM-5.2 scored 62.1, ahead of GPT-5.5’s 58.6 and GLM-5.1’s 58.4.
- Artificial Analysis Intelligence Index (9 aggregated scores): GLM-5.2 becomes the best open-source model to date. OpenRouter benchmarks place it in the same performance class as the now-banned Claude Fable 5.
- One remaining gap: on SWE-Marathon (sustained, hardest engineering tasks), GLM-5.2 scored 13.0 versus Opus 4.8’s 26.0. The closed-source frontier still leads on the longest, heaviest workloads.

Why the hardware angle matters
- GLM-5.2 was trained on Huawei’s Ascend family (Ascend Atlas servers). There are no Nvidia chips in its pipeline, a meaningful point amid export controls and U.S.-China tech tensions.
- Stability AI founder Emad Mostaque estimated total training costs at roughly $25 million, with about 80% of that (roughly $20 million) in post-training, implying substantially lower training costs than many peers.

Model specs that developers care about
- Size and architecture: 744-billion-parameter mixture-of-experts model.
- Context window: a genuine 1,000,000-token window (up from GLM-5.1’s 200K). This enables single-call workflows for whole-repo navigation, multi-file refactors, and long agentic pipelines that previously required chunking.
- License: weights released under the MIT license, which allows broad reuse and makes the model resistant to vendor or government “access switches.”

Pricing and developer access
- API pricing: $1.40 per million input tokens and $4.40 per million output tokens. For comparison, Claude Opus 4.8 lists at $5 per million input tokens and $25 per million output tokens.
- GLM Coding Plan starts at about $18/month and integrates with Claude Code, Cline, Kilo Code, and other popular agentic environments.
- Free testing: GLM-5.2 is available for limited free trials on Z.AI. The full open-source weights and the quantized variants are hosted on Hugging Face under the MIT license; Coding Plan users can call the model with the string GLM-5.2.

Local deployment and quantization
- Unsloth AI produced a 2-bit GGUF quantization that compresses the 1.51 TB model to ~238 GB while retaining ~82% of its original accuracy.
- Local running remains demanding: even the quantized build needs around 256 GB of unified memory, or a RAM/VRAM combination capable of offloading MoE layers (examples include a maxed M4 Ultra Mac Studio or a workstation with ~256 GB of system RAM and a mid-range GPU).
- That makes self-hosting feasible for some teams and labs, but it is still an expensive setup for casual users.

Real-world feel
- In a quick zero-shot test building a small game combining typing mechanics and shooter elements, GLM-5.2 produced the most varied gameplay states we’ve seen in this class of model: more scenario diversity and shifting enemy/boss behavior, though UI polish lagged behind some rivals. That strength in diversity maps to where GLM-5.2 is economically compelling: multi-shot generation workflows and agentic pipelines where varied outputs beat single-output polish.
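
To put the multi-shot argument in numbers, here is a minimal sketch that applies the list prices quoted above to a sampling workload. The workload itself (a 200K-token prompt, 4K output tokens, 8 samples) is a hypothetical chosen for illustration, and the sketch assumes every sample re-sends the full prompt at list price, with no caching or volume discounts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """List price in USD per one million tokens."""
    input_per_million: float
    output_per_million: float


# List prices as quoted in this article.
GLM_5_2 = Pricing(input_per_million=1.40, output_per_million=4.40)
CLAUDE_OPUS_4_8 = Pricing(input_per_million=5.00, output_per_million=25.00)


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int,
                 samples: int = 1) -> float:
    """USD cost of `samples` independent generations from the same prompt.

    Assumption: each sample is billed for the full prompt plus its own output.
    """
    per_call = (input_tokens * pricing.input_per_million
                + output_tokens * pricing.output_per_million) / 1_000_000
    return per_call * samples


# Hypothetical workload: 200K-token prompt, 4K-token output, 8 samples.
glm_cost = request_cost(GLM_5_2, 200_000, 4_000, samples=8)
opus_cost = request_cost(CLAUDE_OPUS_4_8, 200_000, 4_000, samples=8)

print(f"GLM-5.2:         ${glm_cost:.2f}")
print(f"Claude Opus 4.8: ${opus_cost:.2f}")
print(f"Ratio:           {opus_cost / glm_cost:.2f}x")
```

Under these assumptions the eight-sample run costs about $2.38 on GLM-5.2 versus $8.80 on Claude Opus 4.8, roughly a 3.7x difference. The gap widens as the share of output tokens grows, since the output price differs by more than the input price.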

What it means for crypto and dev communities
- Lower API costs and an MIT license make GLM-5.2 attractive to open-source projects, tooling developers, and teams building agentic or programmatic AI workflows, including tooling for smart contracts, audits, bot infrastructure, and developer automation.
- The use of non-U.S. hardware and the model’s licensing raise strategic points about supply-chain diversification and resilience amid export controls and geopolitical friction.

Bottom line
GLM-5.2 is a leap forward for open-source large models: strong benchmark results, a huge context window, cheap-ish API pricing, and MIT-licensed weights. It doesn’t fully close the gap to the closed-source leaders on the longest engineering marathons, but it’s an important and commercially disruptive step, especially given that it was built without Nvidia chips. The weights (and compressed variants) are on Hugging Face, available now for developers and teams to test and deploy.