Auto Efficient delivers 71% of published frontier completion at 72% lower cost on KiloBench.
Official KiloBench results for Auto Efficient and published frontier model evaluations.
72% cheaper than the published frontier average
Cost per attempt: $19.60 (Auto Efficient) vs $70.40 (published frontier average)
Completion rate: 46.7% (Auto Efficient) vs 65.6% (published frontier average)
KiloBench
Four one-shot app prompts, compared across Auto Efficient and frontier models.
Prompt: "Create an animation of the earth spinning in space"
Prompt: "Create a 3D visualizer of a sportscar"
Prompt: "Create a physics simulator that allows you to drag and stack 3D blocks onto one another, and they tumble if they aren't balanced."
Prompt: "Create a game that lets you shoot basketballs into a hoop"
Official KiloBench metrics side by side.
Bottom line: Auto Efficient solves 208 of 445 tasks (46.7%) at $0.22 per task. Per attempt, Auto Efficient averages $19.60, against a published frontier average of $70.40.
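The headline percentages follow from the figures quoted on this page. A minimal Python sketch of that arithmetic is below; the variable names are illustrative and not part of KiloBench, and 65.6% is the published frontier average completion rate quoted earlier on this page.

```python
# Figures quoted on this page; variable names are illustrative only.
solved, total = 208, 445
efficient_cost = 19.60       # Auto Efficient, USD per attempt
frontier_cost = 70.40        # published frontier average, USD per attempt
frontier_completion = 0.656  # published frontier average completion rate

completion = solved / total                             # Auto Efficient completion rate
relative_completion = completion / frontier_completion  # share of frontier completion
savings = 1 - efficient_cost / frontier_cost            # relative cost reduction

print(f"completion rate: {completion:.1%}")
print(f"share of frontier completion: {relative_completion:.0%}")
print(f"cost reduction: {savings:.0%}")
```

Run as written, this prints 46.7%, 71%, and 72%, which matches the headline claim.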
Methodology
Every published number comes from running the model through the same Kilo agent harness, not a generic scaffold. Costs include reasoning tokens, accumulated context re-sends, and all agent loop overhead.
Completion rate: higher is better. It measures the fraction of benchmark tasks the model completed end-to-end through Kilo's harness, not a synthetic scaffold.
Cost per attempt: sticker per-token pricing tells you almost nothing. These costs include reasoning tokens, cumulative context re-sends, and all agent loop overhead from the actual Kilo pipeline.
Cost per solved task: cost per attempt divided by completion rate. A model that is cheap but rarely completes tasks can cost more per solved task than one with a higher attempt price.
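The division described above can be written as a short helper. This is a sketch in Python, not Kilo's implementation: the function name `cost_per_solved_task` and both example models are hypothetical, chosen only to show how a cheap model with a low completion rate can end up costing more per solved task.

```python
def cost_per_solved_task(cost_per_attempt: float, completion_rate: float) -> float:
    """Cost per attempt divided by completion rate (a fraction in (0, 1])."""
    if not 0 < completion_rate <= 1:
        raise ValueError("completion_rate must be in (0, 1]")
    return cost_per_attempt / completion_rate

# Hypothetical models, not KiloBench results.
cheap_but_unreliable = cost_per_solved_task(2.00, 0.25)  # low price, 25% completion
pricier_but_reliable = cost_per_solved_task(3.00, 0.50)  # higher price, 50% completion

print(cheap_but_unreliable, pricier_but_reliable)
print(cheap_but_unreliable > pricier_but_reliable)
```

In this made-up case the model with the lower attempt price works out to 8.00 per solved task against 6.00 for the pricier one, which is why attempt price alone is a poor guide.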
This comparison only shows promoted KiloBench results. Models without an official result remain visible, but their cells stay marked pending instead of using estimates.
The case for cost-efficient routing when frontier spend is not always justified.
Auto Efficient keeps routine work affordable while reserving expensive frontier models for tasks that actually need them.
Auto Efficient uses live session classification to route each request to the model that fits the work — not just the most expensive one available.
For exploratory work, refactoring, documentation, and straightforward coding tasks, the lower-cost tradeoff is often the right one.
Auto Efficient is not a single frozen model — it routes to benchmark-proven models that match the session type, so quality tracks the work rather than the price tag.
Auto Efficient
Let Kilo's Auto Efficient tier route your tasks intelligently. Session-aware routing picks the right model for the work — so you spend less without manually managing model selection.
Explore all four Auto Model tiers (Efficient, Frontier, Balanced, and Free) and how Kilo's session-aware routing works.
Browse officially promoted KiloBench scores: completion rates, cost per attempt, and performance data for AI coding models.
See which models developers actually reach for across coding, planning, debugging, and agent workflows in real time.
Add extra free inference every month so Auto Efficient has more room to route across the right models for each session.