Z.ai: GLM 4.6
Z.ai
Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has been expanded from 128K to 200K tokens, enabling the model to handle more complex...
Builder of GLM models for agentic coding and long-context work. Z.ai’s coding models on Kilo Code include GLM 4.6, GLM 4.7, GLM 4.7 Flash, and GLM 5. Use them across VS Code, JetBrains IDEs, Cursor, Windsurf, Trae, and the Kilo CLI — with pay-as-you-go pricing and no markup over the underlying provider rates.
Sorted by coding-index where published. Click any model for the full review with benchmarks, real-world Kilo usage, and provider-specific pricing.
Z.ai
Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has been expanded from 128K to 200K tokens, enabling the model to handle more complex...
Z.ai
GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution. It demonstrates significant improvements in executing complex agent tasks while...
Z.ai
As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning,...
Z.ai
GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows. Built for expert developers, it delivers production-grade performance on large-scale programming tasks, rivaling leading...
Z.ai
GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around minute-level interactions, GLM-5.1 can work independently and continuously on...
Z.ai
GLM 5.2 is a large-scale reasoning model from Z.ai. It supports text input and output with a 1M-token context window, and is suited for long-horizon agent workflows, project-level software engineering,...
Z.ai, formerly associated with the GLM and Zhipu ecosystem, focuses on bilingual reasoning models with strong tool-use and agentic coding behavior. Its GLM line tends to emphasize practical developer workflows: long context, low-latency variants, and open-weight options that compete with larger closed models on coding tasks.
Pay-as-you-go, no markup over the underlying provider rates. Cheapest first.
| Model | Input / 1M | Output / 1M | Context | Coding index |
|---|---|---|---|---|
| Z.ai: GLM 4.7 Flash | $0.060 | $0.400 | 203K | — |
| Z.ai: GLM 4.7 | $0.400 | $1.75 | 203K | — |
| Z.ai: GLM 4.6 | $0.430 | $1.74 | 203K | — |
| Z.ai: GLM 5 | $1.00 | $3.20 | 203K | — |
| Z.ai: GLM 5.1 | $1.40 | $4.40 | 203K | — |
| Z.ai: GLM 5.2 (new) | $1.40 | $4.40 | 1049K | — |
Three ways: hosted in Kilo, locally on your hardware, or through your own provider keys.
The fastest path: install Kilo Code, sign in, pick z.ai from the model picker. No API keys, no markup. Works in VS Code, JetBrains, Cursor, Windsurf, Trae, and the Kilo CLI.
See live model leaderboard →Already have an account with Z.ai, OpenRouter, AWS Bedrock, Google Vertex, Together AI, or another compatible provider? Plug your key into Kilo Code and keep your existing billing relationship.
BYOK setup guide →Download Z.ai open weights from Hugging Face and serve them with Ollama, LM Studio, vLLM, or SGLang. Connect Kilo Code to your local OpenAI-compatible endpoint and keep all prompts on hardware you control.
Local setup guide →See coding-model lineups from Z.ai’s closest competitors.
Z.ai’s flagship is among the strongest coding models on the Kilo Code leaderboard, ranked by Code, Plan, Ask, Debug, and Review usage.
VS Code, Cursor, Windsurf, Trae, JetBrains IDEs (IntelliJ, PyCharm, WebStorm, GoLand, RubyMine, Android Studio), and the Kilo CLI / terminal.
Switch between Z.ai and 500+ other models with one click. Pay only for what you use, at the underlying provider rate.
The newest GLM model with the highest coding index is usually the best Z.ai choice in Kilo Code. Use the lab table to compare GLM variants by coding score, context, and price.
Z.ai models work across VS Code, Cursor, Windsurf, Trae, JetBrains IDEs, and the Kilo CLI.
Z.ai models in Kilo Code are billed at provider rates with no markup. Check the pricing table for the current input and output token rates for each GLM variant.
Mixed. Z.ai publishes open-weight GLM models for self-hosting while also offering hosted API variants through providers such as OpenRouter.
The largest GLM releases target deeper reasoning and agentic coding, while flash or lightweight variants prioritize lower latency and cheaper iteration loops.
Z.ai models work across VS Code, Cursor, Windsurf, Trae, JetBrains IDEs, and the Kilo CLI.
Install Kilo Code and get instant access to Z.ai: GLM 4.6 and 5 other Z.ai models, plus 500+ frontier and open-source options. Free to start, no credit card required.