Context Condensing
Overview
When working on complex tasks, conversations with Kilo Code can grow long and consume a significant portion of the AI model's context window. Context Condensing is a feature that intelligently summarizes your conversation history, reducing token usage while preserving the essential information needed to continue your work effectively.
The Problem: Context Window Limits
Every AI model has a maximum context window - a limit on how much text it can process at once. As your conversation grows with code snippets, file contents, and back-and-forth discussions, you may approach this limit. When this happens, you might experience:
- Slower responses as the model processes more tokens
- Higher API costs due to increased token usage
- Eventually hitting the context limit and being unable to continue
The Solution: Auto-Compaction
The new platform uses a Compaction system to manage context automatically. When your conversation approaches the token limit, compaction kicks in and produces a structured summary that captures:
- The overall goal of the session
- Key discoveries made along the way
- What has been accomplished so far
- Files that were modified
This summary replaces the earlier conversation history, freeing up context window space while maintaining continuity in your work.
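As a rough illustration, the four parts of the summary can be pictured as a small record. The sketch below is not Kilo Code's internal format; the class name `CompactionSummary`, its field names, and the `render` layout are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class CompactionSummary:
    """Hypothetical container for the four things a summary captures."""

    goal: str
    discoveries: list[str] = field(default_factory=list)
    accomplished: list[str] = field(default_factory=list)
    modified_files: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the summary as text that could stand in for older history."""

        def section(title: str, items: list[str]) -> list[str]:
            # An empty section is shown as "(none)" so it stays visible.
            return [f"{title}:"] + [f"- {item}" for item in (items or ["(none)"])]

        lines = [f"Goal: {self.goal}"]
        lines += section("Discoveries", self.discoveries)
        lines += section("Accomplished", self.accomplished)
        lines += section("Modified files", self.modified_files)
        return "\n".join(lines)
```

Calling `CompactionSummary(goal="Fix login bug", modified_files=["auth.py"]).render()` yields a short text block with one section per item in the list above.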
How Compaction Works
Automatic Compaction
Compaction triggers automatically when the conversation reaches the usableWindow token threshold. The full conversation history is sent to a dedicated compaction agent, which produces a structured summary. This happens in the background without interrupting your workflow.
Context Pruning
In addition to compaction, the system can prune old tool outputs to reclaim context space incrementally. Tool results that fall outside the most recent 40,000 tokens of the conversation (the recency window) are replaced with "[Old tool result content cleared]". This is a lighter-weight mechanism that runs alongside full compaction.
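The pruning rule can be sketched in a few lines. This is a simplified model, not the actual implementation: the `Message` type, the precomputed `tokens` field, the `"tool"` role name, and the placeholder's token estimate are assumptions for illustration. Only the 40,000-token window and the placeholder text come from the description above.

```python
from dataclasses import dataclass

PLACEHOLDER = "[Old tool result content cleared]"
RECENCY_WINDOW_TOKENS = 40_000
PLACEHOLDER_TOKENS = 8  # rough estimate, assumed for this sketch


@dataclass
class Message:
    role: str      # e.g. "user", "assistant", or "tool" (hypothetical role names)
    content: str
    tokens: int    # token count, assumed to be computed elsewhere


def prune_old_tool_results(messages, window=RECENCY_WINDOW_TOKENS):
    """Return a new list in which tool results outside the recency window are cleared."""
    pruned = []
    tokens_seen = 0  # tokens counted from the newest message backwards
    for msg in reversed(messages):
        if msg.role == "tool" and tokens_seen >= window:
            # Older than the recency window: keep the slot, drop the content.
            pruned.append(Message(msg.role, PLACEHOLDER, PLACEHOLDER_TOKENS))
        else:
            pruned.append(msg)
        tokens_seen += msg.tokens
    pruned.reverse()
    return pruned
```

Only tool results are touched; user and assistant messages are kept regardless of age, and the input list is not modified.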
Manual Compaction
You can also trigger compaction manually:
- CLI TUI: Press <leader>c to compact the current session
- Extension Webview: Send a CompactRequest message to trigger compaction
There is no /condense chat command on the new platform. Use the keybinding or message-based invocation instead.
The Compaction Process
When compaction is triggered:
- Threshold Check: The system detects that context usage has reached the usableWindow limit
- Agent Summarization: The full conversation history is sent to a dedicated compaction agent
- Structured Summary: The agent produces a summary covering the goal, discoveries, accomplishments, and modified files
- Replacement: The detailed history is replaced with the compacted summary
- Continuation: You continue working with the freed-up context space
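The steps above can be sketched as a single function. This is a minimal model under stated assumptions: `maybe_compact` and its parameters are hypothetical names, the history is reduced to a list of strings, and the compaction agent is represented by a `summarize` callback supplied by the caller.

```python
from typing import Callable


def maybe_compact(
    history: list[str],
    used_tokens: int,
    usable_window: int,
    summarize: Callable[[list[str]], str],
) -> list[str]:
    """Replace the history with a summary once usage reaches the threshold."""
    # Threshold check: below the limit, nothing changes.
    if used_tokens < usable_window:
        return history
    # Agent summarization: the full history goes to the summarizer.
    summary = summarize(history)
    # Replacement: the summary stands in for the detailed history.
    return [summary]


def fake_agent(messages: list[str]) -> str:
    """Stand-in for the compaction agent, used only for this example."""
    return f"Summary of {len(messages)} messages"
```

With `usable_window=100`, a call with `used_tokens=50` returns the history unchanged, while `used_tokens=100` returns a one-element list holding the summary.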
Configuration Options
Compaction is configured in your kilo.jsonc file:
{
  "compaction": {
    "auto": true, // Enable or disable automatic compaction
    "reserved": 4096, // Number of tokens to reserve (keep free) after compaction
    "prune": true, // Enable pruning of old tool outputs beyond the recency window
  },
}
| Option | Type | Description |
|---|---|---|
| compaction.auto | boolean | Enable or disable automatic compaction when the context threshold is hit |
| compaction.reserved | number | Number of tokens to reserve after compaction |
| compaction.prune | boolean | Enable pruning of old tool outputs outside the 40K token recency window |
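If you generate or edit this configuration from a script, a quick type check against the table can catch mistakes early. The helper below is a hypothetical sketch, not part of Kilo Code: it assumes the config has already been parsed into a dictionary, and it treats the `number` type as a whole number of tokens.

```python
def validate_compaction(config: dict) -> list[str]:
    """Return a list of type errors for the documented compaction options."""
    expected = {"auto": bool, "reserved": int, "prune": bool}
    errors = []
    section = config.get("compaction", {})
    for key, expected_type in expected.items():
        if key not in section:
            continue  # options may be omitted; this sketch does not guess defaults
        value = section[key]
        # bool is a subclass of int in Python, so reject it explicitly for numbers.
        is_valid = isinstance(value, expected_type) and not (
            expected_type is int and isinstance(value, bool)
        )
        if not is_valid:
            errors.append(
                f"compaction.{key}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
    return errors
```

An empty list means the three documented options have the expected types; options not listed in the table are ignored.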
Best Practices
When to Condense
- Long sessions: If you've been working for an extended period on a complex task
- Before major transitions: When switching to a different aspect of your project
- When prompted: When Kilo Code suggests condensing or compaction due to context limits
Maintaining Context Quality
- Be specific in your initial task: A clear task description helps create better summaries
- Use AGENTS.md: Combine with AGENTS.md for persistent project context that doesn't need to be condensed
- Review the summary: After condensing or compaction, the summary is visible in your chat history
Related Features
- AGENTS.md - Persistent context storage across sessions
- Large Projects - Managing context for large codebases
- Codebase Indexing - Efficient code search and retrieval