Model leaderboard

AI Coding Models by Lab

Compare complete model lineups from the labs building today's strongest coding AI: frontier APIs, open-weight models, local-friendly families, and free hosted options in Kilo Code.

Labs: 20 tracked in Kilo Code

Models: 75 active catalog entries

Open Access: 12 open or mixed labs

Browse: 500+ models supported

Focused model guides

Browse labs here, or use a dedicated guide for open-weight rankings, free hosted models, local hardware, release freshness, and terminology.

Best open-weight coding models

Evidence-backed coding ranking and use-case picks.

Best free AI coding models

Live zero-price hosted models and tested free winners.

Best local LLM for coding

Hardware-first local model recommendations.

New open-weight models

Chronological verified release tracker.

Open source vs open weight

Definitions, licensing, and procurement checklist.


Models by Anthropic

Anthropic is an AI safety company founded in 2021 by former OpenAI researchers. Its Claude family is the dominant frontier alternative to GPT for software engineering, with Claude Opus and Claude Sonnet leading SWE-Bench Verified and Terminal-Bench 2.0 across multiple releases. Anthropic does not publish open weights; all access is via API.

Closed (API only)

Top listed model: Anthropic: Claude Opus 4.8

Models by OpenAI

OpenAI is the lab behind the GPT family and the Codex coding-specialist line. GPT-5.5 is its most capable coding model in 2026; GPT-5.3-Codex and GPT-5.2-Codex are dedicated coding variants tuned for terminal and CLI work. OpenAI does not publish open weights for any GPT-class model; access is API-only.

Closed (API only)

Top listed model: OpenAI: GPT-5.6 Sol

Models by Google

Google’s AI work spans two product families on Kilo Code: the closed-source Gemini line for frontier coding capability and the open-weight Gemma line for self-hosted use. Gemini 3.1 Pro competes directly with Claude Opus and GPT-5.5 on SWE-Bench and Terminal-Bench. Gemma 4 (26B and 31B) ships under Apache 2.0 and runs on consumer hardware.

Top listed model: Google: Gemma 4 26B A4B

Models by Tencent

Tencent’s Hunyuan models come from a product-scale AI organization with deep experience in cloud, gaming, social, and enterprise software. For developers, Hunyuan is most interesting as a broad hosted model family that can cover coding support alongside document, product, and workflow automation tasks.

Top listed model: Tencent: Hy3 (free)

Models by Xiaomi

Xiaomi is best known for consumer electronics, but its AI division also ships the MiMo family of open-weight language models. MiMo-V2.5-Pro is Xiaomi's current flagship for complex software engineering and long-horizon agentic tasks. MiMo-V2.5 offers Pro-level agentic performance at lower inference cost, while MiMo-V2-Pro and MiMo-V2-Flash round out the long-context and fast-response tiers.

Top listed model: Xiaomi: MiMo-V2.5

Models by Qwen

Qwen is Alibaba Cloud’s model family, known for fast iteration across dense, MoE, coder, math, and multimodal releases. The coding lineup usually pairs specialist Qwen Coder models with larger general Qwen releases, giving developers a wide spread of open-weight and hosted choices for cost-sensitive agent workflows.

Top listed model: Qwen: Qwen3.7 Plus (20% off)

Models by Mistral AI

Mistral AI is known for compact, efficient models and a pragmatic open-model strategy. Its Devstral line is explicitly aimed at software engineering agents, while Medium and larger hosted models provide stronger general reasoning for teams that want European infrastructure and flexible deployment choices.

Top listed model: Mistral: Mistral Medium 3.5

Models by Z.ai

Z.ai, formerly known as Zhipu AI, focuses on bilingual reasoning models with strong tool-use and agentic coding behavior. Its GLM line tends to emphasize practical developer workflows: long context, low-latency variants, and open-weight options that compete with larger closed models on coding tasks.

Top listed model: Z.ai: GLM 5.2

Models by DeepSeek

DeepSeek is the Chinese AI lab whose V4 series leads every published coding benchmark among open-weight models in 2026. DeepSeek V4-Pro is a 1.6T-parameter MoE model with a true 1M-token context window and the highest published scores on LiveCodeBench (93.5) and Codeforces (3206) of any model, open or closed. All weights ship under the MIT license.

Top listed model: DeepSeek: DeepSeek V3.1 Terminus

Models by xAI

xAI builds the Grok family, which tends to prioritize fast interactive reasoning and a direct style that works well for product-building loops. Grok Code Fast is the recognizable specialist release for coding agents, while larger Grok models cover broader reasoning and implementation tasks.

Closed (API only)

Top listed model: xAI: Grok 4.3

Models by inclusionAI

inclusionAI is Ant Group’s AI lab, with Ling and Ring models designed around efficient long-context reasoning, multilingual use, and high-throughput production workloads. Its coding models are interesting when you want frontier-scale context and fast hosted inference without defaulting to the largest Western labs.

Top listed model: inclusionAI: Ling-2.6-1T (free)

Models by MiniMax

MiniMax builds high-throughput foundation models that often punch above their price for agentic development work. The M-series emphasizes practical coding loops: fast responses, broad instruction following, and enough reasoning depth for multi-file edits without frontier-model pricing.

Closed (API only)

Top listed model: MiniMax: MiniMax M3

Models by MoonshotAI

MoonshotAI is best known for Kimi, a model family built around long-context reasoning and document-heavy workflows. Its Kimi coding releases are especially recognizable for combining very large context windows with pragmatic software-engineering ability, making them useful for repository-wide analysis and agent tasks.

Top listed model: MoonshotAI: Kimi K2.7 Code

Models by NVIDIA

NVIDIA’s Nemotron line reflects the company’s hardware-plus-model approach: models are optimized for fast inference on GPU infrastructure and often released with permissive options for enterprise deployment. For coding, Nemotron is useful when throughput, cost, and deployability matter as much as benchmark rank.

Top listed model: NVIDIA: Nemotron 3 Super (free)

Models by Poolside

Poolside is unusual because it is focused almost entirely on software engineering rather than general chat. Its approach centers on code execution, repository tasks, and developer workflows, making Laguna models easy to recognize as coding-native systems rather than general-purpose models adapted to code.

Closed (API only)

Top listed model: Poolside: Laguna M.1

Models by Kwaipilot

Kwaipilot is a coding-agent line associated with Kwai/Kuaishou’s AI work. The KAT-Coder releases are recognizable because they are aimed directly at repository automation and programming tasks rather than general chat, making them useful to test against other coding-specialist models.

Closed (API only)

Top listed model: Kwaipilot: KAT-Coder-Pro V1

Models by Baidu

Baidu’s model work is rooted in the ERNIE ecosystem and large-scale search, cloud, and developer tooling. The CoBuddy-style coding releases are aimed at practical programming assistance, making Baidu recognizable for pairing foundation-model research with IDE and assistant-product workflows.

Top listed model: baidu/cobuddy:free

Models by Nex AGI

Nex AGI is an emerging provider in the Kilo catalog with models positioned around agent-style coding help and free hosted access. Its current appeal is experimentation: developers can try a new agentic model family without committing to a premium frontier provider.

Closed (API only)

Top listed model: Nex AGI: Nex-N2-Pro (free)

Models by OpenRouter

OpenRouter is primarily a model routing and provider marketplace rather than a single research lab. On Kilo Code, OpenRouter-branded entries usually represent experimental, community, or platform-hosted models that are easy to try alongside major lab releases through the same model selector.

Top listed model: Pony Alpha (free)

Models by StepFun

StepFun builds the Step model family with an emphasis on responsive hosted assistants and multimodal product use. In Kilo Code, Step flash-style models are best treated as fast iteration engines for lightweight coding edits, review passes, and sub-agent calls.

Closed (API only)

Top listed model: StepFun: Step 3.5 Flash

Looking for open-weight models?

DeepSeek, Google Gemma, Xiaomi MiMo, and other open-weight lineups can also be compared by license and local deployment fit.