Top Open Source Models
The best open-source models from around the world, all available in Kilo Code.
Mistral: Devstral Small 2512 (free)
mistralai
Devstral Small 2512 is a high-performance code generation and agentic model developed by Mistral AI. It is provided free of charge in Kilo Code for a limited time. **Note:** prompts and completions may be logged by Mistral during the free period and used to improve the model.
DeepSeek: DeepSeek V3.1 Terminus
DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that preserves the model's original capabilities while addressing user-reported issues, including language consistency and agent behavior, and further optimizes performance in coding and search agents. It is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes. It extends the DeepSeek-V3 base with a two-phase long-context training process, reaching up to 128K tokens, and uses FP8 microscaling for efficient inference. Users can control the reasoning behavior with the `reasoning.enabled` boolean; [learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config). The model improves tool use, code generation, and reasoning efficiency, achieving performance comparable to DeepSeek-R1 on difficult benchmarks while responding more quickly. It supports structured tool calling, code agents, and search agents, making it suitable for research, coding, and agentic workflows.
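As a rough sketch of how the thinking/non-thinking toggle works, the request body below follows the shape described in OpenRouter's reasoning-tokens docs; the endpoint URL and model slug are assumptions here, so check your provider's docs before relying on them.

```python
import json

# Sketch of an OpenRouter-style chat completion request that toggles
# DeepSeek-V3.1 Terminus between thinking and non-thinking modes.
# The `reasoning.enabled` field follows OpenRouter's reasoning-tokens docs;
# the endpoint URL and model slug are illustrative assumptions.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt: str, thinking: bool) -> dict:
    """Return the JSON body for a request with reasoning on or off."""
    return {
        "model": "deepseek/deepseek-v3.1-terminus",
        "messages": [{"role": "user", "content": prompt}],
        # Hybrid-reasoning toggle: True -> thinking mode, False -> direct answer.
        "reasoning": {"enabled": thinking},
    }

if __name__ == "__main__":
    body = build_request("Refactor this function to be tail-recursive.", thinking=True)
    print(json.dumps(body, indent=2))
```

POST this body (with your API key in the `Authorization` header) to switch modes per request rather than per conversation.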
Kwaipilot: KAT-Coder-Pro V1
kwaipilot
KAT-Coder-Pro V1 is KwaiKAT's most advanced agentic coding model in the KAT-Coder series. Designed specifically for agentic coding tasks, it excels in real-world software engineering scenarios, achieving a 73.4% solve rate on the SWE-Bench Verified benchmark. The model has been optimized for tool use, multi-turn interaction, instruction following, and generalization through a multi-stage training process including mid-training, supervised fine-tuning (SFT), reinforcement fine-tuning (RFT), and scalable agentic RL.
MiniMax: MiniMax M2.1
minimax
MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world capability while maintaining exceptional latency, scalability, and cost efficiency. Compared to its predecessor, M2.1 delivers cleaner, more concise outputs and faster perceived response times. It shows leading multilingual coding performance across major systems and application languages, achieving 49.4% on Multi-SWE-Bench and 72.5% on SWE-Bench Multilingual, and serves as a versatile agent “brain” for IDEs, coding tools, and general-purpose assistance. To avoid degrading this model's performance, MiniMax highly recommends preserving reasoning between turns. Learn more about using `reasoning_details` to pass back reasoning in our [docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks).
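A minimal sketch of what "preserving reasoning between turns" looks like in practice: per OpenRouter's reasoning-tokens docs, the `reasoning_details` field from a prior assistant response is echoed back on that assistant message in the next request. The exact field contents vary by provider, so treat the shapes below as illustrative.

```python
# Sketch of carrying MiniMax M2.1 reasoning across turns: copy the
# `reasoning_details` array from the previous assistant response onto the
# assistant message in the next request, so the model keeps its chain of
# thought. Message/field shapes are illustrative assumptions.

def append_turn(history: list, assistant_msg: dict, next_user_prompt: str) -> list:
    """Extend the conversation, keeping the model's reasoning blocks intact."""
    turn = {"role": "assistant", "content": assistant_msg["content"]}
    # Pass reasoning back unchanged instead of stripping it out.
    if "reasoning_details" in assistant_msg:
        turn["reasoning_details"] = assistant_msg["reasoning_details"]
    return history + [turn, {"role": "user", "content": next_user_prompt}]
```

The key point is simply that the reasoning payload is round-tripped verbatim; dropping it between turns is what degrades performance.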
Mistral: Devstral 2 2512 (free)
mistralai
Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding. It is a 123B-parameter dense transformer model supporting a 256K context window. Devstral 2 supports exploring codebases and orchestrating changes across multiple files while maintaining architecture-level context. It tracks framework dependencies, detects failures, and retries with corrections—solving challenges like bug fixing and modernizing legacy systems. The model can be fine-tuned to prioritize specific languages or optimize for large enterprise codebases. It is available under a modified MIT license.
Qwen: Qwen3 Coder 480B A35B
qwen
Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, tool use, and long-context reasoning over repositories. The model features 480 billion total parameters, with 35 billion active per forward pass (8 of 160 experts). Pricing for the Alibaba endpoints varies by context length: requests exceeding 128K input tokens are billed at the higher rate.
Open Source Never Sleeps
Innovation isn't limited to Silicon Valley. Open-source AI models come from labs and researchers across the globe, democratizing access to cutting-edge technology.
From DeepSeek and MiniMax in China to Falcon in the UAE, Singapore's SEA-LION to India's Sarvam - the open-source AI revolution is truly global. The Kilo community takes advantage of the best models from every corner of the world. It's about performance, transparency, speed, and cost: OSS models tend to be free or low-cost.
Open Source vs Open Weight: What's the Difference?
Understanding the terminology helps you make informed decisions about which models to use in your development workflow.
Open Source
Fully transparent - includes model weights, training code, datasets, and documentation. You can inspect, modify, and understand exactly how the model was built. Examples include smaller research models and community projects.
Open Weight
Model weights available - you can download and run the model, but training code and datasets may not be public. You still have freedom to use, modify, and deploy without restrictions. Most commercial "open" models fall into this category.
The Bottom Line: Both open-source and open-weight models give you freedom from vendor lock-in. You can run them locally, inspect their behavior, and use them without ongoing subscription fees. Kilo Code supports both types, giving you maximum flexibility.
Use Open Source Models Everywhere with Kilo Code
Kilo works where you work. Build solo or with your engineering team.
Why more developers are switching to OSS Models
Open-source and open-weight models are getting serious. Here's why developers are choosing them.
Run Locally
Use Ollama or LM Studio to run models on your own hardware. No internet required, no API costs, complete privacy. Perfect for sensitive codebases.
No Vendor Lock-In
Your code never depends on a single provider. Switch between local and hosted, or between different models, without changing your workflow.
Rapidly Improving
Open-weight models are reaching the "convergence threshold" - performing comparably to closed models on real coding tasks. The gap is closing fast.
Community Driven
Benefit from community fine-tunes, optimizations, and improvements. Open models get better through collective effort.
Cost Effective
Run unlimited queries locally for free, or use hosted versions at competitive rates. No surprise bills, no usage caps.
Full Transparency
Understand model behavior, inspect outputs, and debug issues. No black boxes, no mystery changes to model behavior.
xAI: Open Sourcing the Future
xAI has made a commitment to open source their models when new versions are released, democratizing access to cutting-edge AI technology.
Kilo Code supports numerous Grok models from xAI, giving you access to powerful reasoning and coding capabilities. When xAI releases a new model, they open source the previous version, ensuring the community benefits from their research.
Plus, Kilo Code includes access to Grok Code Fast 1 - while not open source, it's completely free and delivers exceptional speed and power for coding tasks. It's one of the fastest coding models available.
Why This Matters: xAI's commitment to open sourcing models means you get access to state-of-the-art AI without vendor lock-in. Use Grok models locally or through Kilo Code's hosted service.
Chinese Open-Source Models Are Closing the Performance Gap
Open-weight models are no longer just "good enough" - they're genuinely competitive with closed models for real coding work.
By the Numbers
According to Stanford's 2025 AI Index Report
The gap between top and 10th-ranked models fell from 11.9% to just 5.4% in one year. The frontier is increasingly competitive.
The top two models are now separated by just 0.7%. Chinese open-weight models like DeepSeek and Qwen are competing at the highest level.
Source: Stanford HAI AI Index 2025
Real-World Evidence
From our testing in Kilo Code
GLM 4.7 (z.AI)
Handles multi-phase implementation tasks, understands context, and produces working code. Strong agentic coding capabilities.
MiniMax M2.1
Demonstrates competitive performance on practical coding benchmarks. Reliable for production use cases.
DeepSeek V3
One of the most cost-effective models available, with performance rivaling closed-source alternatives at a fraction of the cost.
The Bottom Line: You can now use open-weight models for production coding work without compromising on quality. The convergence threshold has been crossed.
How to Use Open Source Models in Kilo Code
Three ways to run open-source models: locally, hosted, or bring your own keys.
Run Locally
Install Ollama or LM Studio, download your preferred model, and connect it to Kilo Code. Completely free, completely private.
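Once a model is pulled, Ollama exposes it over a local HTTP API (port 11434 by default). The sketch below talks to that API directly; the endpoint paths follow Ollama's REST API, while the model tag is an illustrative assumption - pull whichever model you prefer first.

```python
import json
import urllib.request

# Minimal sketch of chatting with a locally running Ollama server.
# Default port is 11434; the model tag below is an example - pull any
# model from the Ollama library first (e.g. `ollama pull <model>`).
OLLAMA_BASE = "http://localhost:11434"

def chat_body(model: str, prompt: str) -> bytes:
    """Build the JSON body for Ollama's /api/chat endpoint."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # one complete response instead of streamed chunks
    }).encode()

def local_chat(model: str, prompt: str) -> str:
    """Send a single-turn chat request to the local Ollama server."""
    req = urllib.request.Request(
        f"{OLLAMA_BASE}/api/chat",
        data=chat_body(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

if __name__ == "__main__":
    # Requires Ollama running locally with the model already pulled.
    print(local_chat("qwen2.5-coder:7b", "Write a one-line Python hello world."))
```

Kilo Code's Ollama provider talks to this same local endpoint, so nothing ever leaves your machine.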
Learn More →
Use Hosted
Access 400+ models including all major open-weight models through Kilo Code's hosted service. Pay only for what you use, no markup.
All Models →
Bring Your Own Keys
Connect your own API keys from providers like OpenRouter, Together AI, or direct model providers. Full control, full flexibility.
Setup Guides →