GLM-4.7
Description
GLM-4.7 is a powerful open-weight language model from Z.ai, optimized for advanced coding, agentic workflows, and creative UI generation. It sets new benchmarks with 73.8% on SWE-bench Verified and 87.4% on τ²-Bench for tool use, and introduces thinking modes (Interleaved, Preserved, and Turn-level) for stronger reasoning. It is well suited to developers building coding agents, multilingual teams, and budget-conscious users seeking high-accuracy performance with local deployment flexibility.
Key capabilities
- Exceptional coding performance (SWE-bench Verified 73.8%, Terminal Bench 41%)
- Strong tool use and agentic reasoning (τ²-Bench 87.4%)
- Advanced thinking modes (Interleaved, Preserved, Turn-level)
- Long context length up to 200K tokens
- Multilingual coding support
Core use cases
- Building coding agents and terminal tools
- Generating UIs, webpages, and slides
- Creating interactive WebGL/3D content
- Multilingual software engineering
- Complex multi-turn agent workflows
- Visual content such as posters and portfolios
Is GLM-4.7 Right for You?
Best for
- Developers building coding agents
- Multilingual coding teams
- Budget-conscious users with local hosting
- Teams needing stable multi-turn reasoning
Not ideal for
- Applications that require low-latency responses
- High-volume repetitive tasks
- Workloads sensitive to speed or per-token cost
Standout features
- Interleaved Thinking for better instruction following
- Preserved Thinking for multi-turn stability
- Turn-level Thinking for latency trade-offs
- Open weights on Hugging Face for local inference
- API access via Z.ai and OpenRouter
- Support for vLLM and SGLang inference frameworks
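Since the model is reachable through OpenAI-compatible endpoints such as OpenRouter's, a request can be sketched as below. The endpoint URL, model slug, and the shape of the thinking-mode toggle are assumptions, not confirmed API details; check the provider's documentation before use.

```python
import json

# Assumed OpenAI-compatible chat-completions endpoint (e.g. OpenRouter).
API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt: str, thinking: bool = True) -> dict:
    """Assemble a chat-completions request body; actually sending it
    requires an API key in an Authorization header."""
    body = {
        "model": "z-ai/glm-4.7",  # assumed model slug
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 1024,
    }
    if thinking:
        # Hypothetical toggle for the model's thinking modes; the real
        # parameter name and shape may differ per provider.
        body["reasoning"] = {"enabled": True}
    return body

payload = json.dumps(build_request("Write a Python quicksort."))
```

The same body works against any OpenAI-compatible server, including a locally hosted one (vLLM and SGLang both expose this interface), with only `API_URL` and the model name changed.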
Reviews
User Feedback Highlights
Most Praised
- Top-tier coding benchmark results
- Reliable tool calls and multi-step reasoning
- Cost-effective open-source deployment
- Improved UI generation and creative writing
- Gains in multilingual and terminal tasks
Common Complaints
- Flash variant weaker on complex prompts
- Higher latency and costs for full model
- Inconsistencies in long-horizon tasks
- Bugs in reasoning token handling
- Verbose output increases token usage
- Occasional tool-calling issues