New • DeepSeek-V4 is live in Kilo Code

Use DeepSeek V4 in Kilo Code

Open-source SOTA agentic coding with 1M context by default. Choose V4-Pro (1.6T total / 49B active) for frontier performance, or V4-Flash (284B / 13B active) for fast, cost-effective agent work.

How to use DeepSeek V4 in Kilo Code

Set up in minutes with your own API key

1

Get DeepSeek API Key

Sign up at DeepSeek Platform and get your API key.

2

Install Kilo Code

Add Kilo Code extension to VS Code, JetBrains IDE, or use the CLI.

3

Open Kilo Code Settings

Open Command Palette and search "Kilo Code" to access settings.

4

Add DeepSeek API Key

Go to BYOK providers and add your DeepSeek API key.

5

Select deepseek-v4-pro or deepseek-v4-flash

Pick the V4 model in the model selector and start coding with 1M context.

Two models, one API

Pick the right trade-off between raw capability and speed/cost

Flagship

DeepSeek-V4-Pro

1.6T total params49B active1M context
  • Open-source SOTA on agentic coding benchmarks
  • Leads all open models on world knowledge (trails only Gemini-3.1-Pro)
  • Rivals top closed-source models on Math / STEM / Coding
  • Thinking and Non-Thinking modes
Fast & economical

DeepSeek-V4-Flash

284B total params13B active1M context
  • Reasoning closely approaches V4-Pro
  • On par with V4-Pro on simple agent tasks
  • Faster response times, highly cost-effective API pricing
  • Same 1M context, same dual-mode support

Heads up: deepseek-chat and deepseek-reasoner will be fully retired after Jul 24, 2026 (they currently route to V4-Flash). Switch to deepseek-v4-pro or deepseek-v4-flash in Kilo Code today.

Why DeepSeek V4 in Kilo Code?

A structural leap in attention, context, and agentic capability

1M Context by Default

1M context is now the standard across all official DeepSeek services. Point Kilo Code at a full monorepo, spec, and dependency tree at once.

Sparse Attention (DSA)

Token-wise compression plus DeepSeek Sparse Attention delivers world-leading long context with drastically reduced compute and memory cost.

Agentic-Coding SOTA

V4-Pro is open-source SOTA on agentic coding benchmarks and already powers DeepSeek's in-house coding agent — purpose-built for tool-using workflows.

Thinking + Non-Thinking

Both V4-Pro and V4-Flash support dual modes. Turn on thinking for hard reasoning; turn it off for fast, cheap tool calls.

Open Weights • 101k+ Stars

V4-Pro and V4-Flash weights are published on HuggingFace, and DeepSeek has 101k+ GitHub stars. Self-host, fine-tune, or audit — no vendor lock-in.

Cost-Effective Frontier

V4-Flash gives you reasoning close to V4-Pro at a fraction of the price — and Kilo Code's BYOK means you pay DeepSeek's API rates with no markup.

Coming from the previous generation? Compare with DeepSeek V3.1 Terminus.

Trusted by developers at the world's most innovative companies

Related and alternative models in Kilo Code

Not just DeepSeek — explore 500+ models including other free and affordable options

Popular Frontier Models

Most-used flagship models on the Kilo leaderboard this week

AnthropicFlagship

Claude Opus 4.7

Anthropic's flagship for deep reasoning and complex refactors. A go-to on Kilo for code, plan, and debug modes.

OpenAIFrontier

GPT-5.4

OpenAI's latest frontier model. Strong at long-horizon planning and tool use, and a popular pick on Kilo for hard problems.

GoogleMultimodal

Gemini 3.1 Pro Preview

Google's newest Gemini Pro preview. Excellent at multimodal tasks, long-horizon agentic coding, and structured planning.

QwenOpen weight

Qwen3.6 Plus

Alibaba's newest open-weight flagship — a popular Kilo choice for teams who want frontier-level coding without vendor lock-in.

Affordable High-Performers

Frontier-level quality at a fraction of the cost

xAIFormer free tier

Grok Code Fast 1

The model that powered the free frontier era. Kilo Coders have been using over 700B tokens per month. Now at a remarkably low price point.

$0.20/M input
MiniMaxHigh performer

MiniMax M2.1

Competitive performance on practical coding benchmarks. Reliable for production use cases at a fraction of frontier model costs.

$0.27/M input
Z AIOpen weight

GLM 4-7

Open-weight model with strong agentic coding capabilities. Handles multi-phase implementation tasks with excellent context understanding.

$0.40/M input
Google4x cheaper than Pro

Gemini 3 Flash Preview

Outperforms Gemini 3 Pro on many benchmarks at 1/4 the cost. Excellent for high-volume coding tasks.

$0.50/M input
Anthropic3x cheaper than Sonnet

Claude Haiku 4.5

Similar coding performance to Claude Sonnet 4 at one-third the cost and more than twice the speed.

$1.00/M input

Agentic Engineering

Glide through your workflow with a mode for every step

Ask mode

A knowledgeable technical assistant focused on answering questions without changing your codebase

Frequently Asked Questions

Everything you need to know about using DeepSeek V4 with Kilo Code

Ready to code with DeepSeek V4?

Join 2.3M+ developers using Kilo Code. Open-source SOTA agentic coding with 1M context, available today.

1M context defaultV4-Pro & V4-FlashOpen weights on HuggingFace