# Context Condensing

## Overview
When working on complex tasks, conversations with Kilo Code can grow long and consume a significant portion of the AI model's context window. Context Condensing is a feature that intelligently summarizes your conversation history, reducing token usage while preserving the essential information needed to continue your work effectively.
## The Problem: Context Window Limits
Every AI model has a maximum context window — a limit on how much text it can process at once. As your conversation grows with code snippets, file contents, and back-and-forth discussions, you may approach this limit. When this happens, you might experience:
- Slower responses as the model processes more tokens
- Higher API costs due to increased token usage
- Eventually hitting the context limit and being unable to continue
## The Solution: Auto-Compaction
Kilo Code uses a Compaction system to manage context automatically. When your conversation approaches the token limit, compaction kicks in and produces an anchored summary that captures:
- The overall goal of the session
- Constraints and preferences you gave along the way
- Progress, key decisions, and next steps
- Critical context needed to continue
- Relevant files and directories
This summary replaces older conversation history while Kilo keeps the most recent turns verbatim when they fit. If a session has already been compacted, Kilo updates the previous summary instead of starting over, preserving still-relevant details and removing stale ones.
## How Compaction Triggers

### Automatic trigger
Kilo tracks the session's total token count (input, output, and cached reads and writes) and compares it to the model's context window. Compaction runs when the total reaches the window size minus a reserved buffer of headroom kept free for the next turn.
How the buffer is chosen depends on what the model declares. When the model advertises a separate input limit, the buffer defaults to 20,000 tokens (or the model's maximum output size, whichever is smaller). When the model only declares a single context window, Kilo instead reserves the model's full output cap — up to 32,000 tokens.
Custom models that do not declare a context window are not tracked, and auto-compaction does not run for them.
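A minimal sketch of this trigger logic in TypeScript; the type and function names (`ModelLimits`, `shouldCompact`) are illustrative assumptions, not Kilo Code's actual internals:

```typescript
// Illustrative sketch of the automatic trigger described above.
interface ModelLimits {
  contextWindow?: number;   // total context window, if the model declares one
  inputLimit?: number;      // separate input limit, if the model advertises one
  maxOutputTokens: number;  // the model's output cap
}

const DEFAULT_RESERVE = 20_000;
const OUTPUT_TOKEN_MAX = 32_000; // default ceiling; see environment overrides below

function shouldCompact(totalTokens: number, model: ModelLimits): boolean {
  // Custom models without a declared context window are not tracked.
  if (model.contextWindow === undefined) return false;
  const reserved = model.inputLimit !== undefined
    ? Math.min(DEFAULT_RESERVE, model.maxOutputTokens)   // separate input limit
    : Math.min(model.maxOutputTokens, OUTPUT_TOKEN_MAX); // single context window
  return totalTokens >= model.contextWindow - reserved;
}
```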
### Context Pruning
Between turns, Kilo also runs a lighter prune pass. It walks through completed tool outputs that fall outside a 40,000-token recency window and replaces their content with `[Old tool result content cleared]`. Pruning runs incrementally, so large tool outputs don't consume space forever, even before full compaction is needed.
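A rough sketch of the prune pass, assuming a simplified message shape (`ToolResult` and `prune` are hypothetical names, not Kilo Code's internals):

```typescript
// Hypothetical sketch of the prune pass described above.
const RECENCY_WINDOW = 40_000; // tokens kept untouched at the tail
const CLEARED = "[Old tool result content cleared]";

interface ToolResult { content: string; tokens: number; completed: boolean }

function prune(results: ToolResult[]): void {
  // Walk newest-to-oldest, accumulating tokens; completed tool outputs
  // that fall outside the recency window have their content cleared.
  let recentTokens = 0;
  for (let i = results.length - 1; i >= 0; i--) {
    recentTokens += results[i].tokens;
    if (recentTokens > RECENCY_WINDOW && results[i].completed) {
      results[i].content = CLEARED;
    }
  }
}
```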
### Manual Compaction
You can trigger compaction at any time:
- Slash command: type `/compact` in chat (also findable by typing `smol` or `condense`)
- Task header button: click the compact icon in the active task header
- Settings: toggle auto-compaction in Settings → Context
## Defaults
| Setting | Default | Effect |
|---|---|---|
| `compaction.auto` | `true` | Automatically compact when the usable window is reached |
| `compaction.prune` | `true` | Clear old tool outputs beyond the 40K recency window |
| `compaction.tail_turns` | `2` | Keep the most recent user turns and their responses verbatim when possible |
| `compaction.preserve_recent_tokens` | 25% of usable context, clamped between 2,000 and 8,000 tokens | Token budget for the verbatim recent tail |
| `compaction.reserved` | `min(20000, model_max_output_tokens)` | Token headroom kept free for the next turn; also defines the compaction trigger point |
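As a worked example of the `preserve_recent_tokens` default, the 25% rule clamps like this (illustrative code, not Kilo's implementation):

```typescript
// Default recent-tail budget: 25% of usable context, clamped to [2000, 8000].
function defaultPreserveRecentTokens(usableContext: number): number {
  return Math.min(8_000, Math.max(2_000, Math.floor(usableContext * 0.25)));
}

defaultPreserveRecentTokens(180_000); // => 8000 (clamped at the ceiling)
defaultPreserveRecentTokens(20_000);  // => 5000 (25% lands inside the clamp)
defaultPreserveRecentTokens(6_000);   // => 2000 (clamped at the floor)
```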
## Configuration

Compaction is configured in your `kilo.jsonc` file:
```jsonc
{
  "compaction": {
    "auto": true, // Enable or disable automatic compaction
    "prune": true, // Enable pruning of old tool outputs beyond the recency window
    "tail_turns": 2, // Recent user turns to keep verbatim during compaction
    "preserve_recent_tokens": 8000, // Maximum token budget for the recent tail
    "reserved": 20000, // Token buffer kept free; smaller = later trigger, larger = earlier trigger
  },
}
```
| Option | Type | Default | Description |
|---|---|---|---|
| `compaction.auto` | boolean | `true` | Enable or disable automatic compaction when the usable window is reached |
| `compaction.prune` | boolean | `true` | Enable pruning of old tool outputs outside the 40K-token recency window |
| `compaction.tail_turns` | number | `2` | Number of recent user turns, including the assistant and tool responses that follow them, to keep verbatim during compaction |
| `compaction.preserve_recent_tokens` | number | 25% of usable context, clamped between 2,000 and 8,000 tokens | Maximum token budget for recent turns kept verbatim after compaction |
| `compaction.reserved` | number | `min(20000, model_max_output)` | Token headroom reserved for the next turn. Applies only to models that advertise a separate input limit; models with a single context window use their full output cap as the reserve instead. |
### Use a different model for compaction
Summarization can use a cheaper or larger-context model than your main agent. Configure a dedicated compaction agent:
```jsonc
{
  "agent": {
    "compaction": {
      "model": "anthropic/claude-haiku-4-5",
    },
  },
}
```
If no compaction agent is set, the current session's model is used.
### Environment overrides
| Variable | Effect |
|---|---|
| `KILO_DISABLE_AUTOCOMPACT=1` | Forces `compaction.auto = false` |
| `KILO_DISABLE_PRUNE=1` | Forces `compaction.prune = false` |
| `KILO_EXPERIMENTAL_OUTPUT_TOKEN_MAX` | Overrides the 32,000 default output-token ceiling |
## Best Practices

### When to Compact
- Long sessions: If you've been working for an extended period on a complex task
- Before major transitions: When switching to a different aspect of your project
- When approaching limits: Run `/compact` manually before hitting the automatic trigger if you want control over when the summary is produced
### Tuning `compaction.reserved`
On models that advertise a separate input limit, the reserved value is a trade-off:
- Lower value (e.g. `10000`) → compaction triggers later, you get more turns out of the raw window, but you risk a mid-turn context overflow if a single response is larger than the buffer.
- Higher value (e.g. `40000`) → compaction triggers earlier, fewer overflow errors, but shorter effective conversations between summaries.
The default of ~20K is tuned to leave room for a full-size assistant response plus tool output. The setting has no effect on models with a single context window, which always reserve their full output cap instead.
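To make the trade-off concrete, here is how different `reserved` values shift the trigger point on a hypothetical model with a 200,000-token window and a separate input limit:

```typescript
// Illustrative numbers only; actual windows depend on your model.
const contextWindow = 200_000;
for (const reserved of [10_000, 20_000, 40_000]) {
  console.log(`reserved=${reserved}: compaction triggers at ${contextWindow - reserved} tokens`);
}
// reserved=10000: triggers at 190000 tokens (later, more overflow risk)
// reserved=20000: triggers at 180000 tokens (the default)
// reserved=40000: triggers at 160000 tokens (earlier, safer)
```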
### Maintaining Context Quality
- Be specific in your initial task: A clear task description helps create better summaries
- Use AGENTS.md: Combine with AGENTS.md for persistent project context that doesn't need to be compacted
- Review the summary: After compaction, the summary is visible in your chat history
## Related Features
- AGENTS.md - Persistent context storage across sessions
- Large Projects - Managing context for large codebases
- Codebase Indexing - Efficient code search and retrieval