Google: Gemini 3.5 Flash
Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimized for coding proficiency and parallel agentic execution...
Maker of Gemini and the Gemma open-weight family. Google’s coding models on Kilo Code include Gemini 3.5 Flash, Gemini 2.5 Flash, Gemini 3 Flash Preview, and Gemini 3 Pro Preview. Use them across VS Code, JetBrains IDEs, Cursor, Windsurf, Trae, and the Kilo CLI — with pay-as-you-go pricing and no markup over the underlying provider rates.
Sorted by coding-index where published. Click any model for the full review with benchmarks, real-world Kilo usage, and provider-specific pricing.
Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimized for coding proficiency and parallel agentic execution...
Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater...
Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and coding assistance. It delivers near Pro level reasoning and tool...
Gemini 3 Pro is Google’s flagship frontier model for high-precision multimodal reasoning, combining strong performance across text, image, video, audio, and code with a 1M-token context window. Reasoning Details must be preserved when using multi-turn tool calling, see our docs here: https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks. It delivers state-of-the-art benchmark results in general reasoning, STEM problem solving, factual QA, and multimodal understanding, including leading scores on LMArena, GPQA Diamond, MathArena Apex, MMMU-Pro, and Video-MMMU. Interactions emphasize depth and interpretability: the model is designed to infer intent with minimal prompting and produce direct, insight-focused responses. Built for advanced development and agentic workflows, Gemini 3 Pro provides robust tool-calling, long-horizon planning stability, and strong zero-shot generation for complex UI, visualization, and coding tasks. It excels at agentic coding (SWE-Bench Verified, Terminal-Bench 2.0), multimodal analysis, and structured long-form tasks such as research synthesis, planning, and interactive learning experiences. Suitable applications include autonomous agents, coding assistants, multimodal analytics, scientific reasoning, and high-context information processing.
Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...
Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at...
Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at...
Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, native function...
Google’s AI work spans two product families on Kilo Code: the closed-source Gemini line for frontier coding capability and the open-weight Gemma line for self-hosted use. Gemini 3.1 Pro competes directly with Claude Opus and GPT-5.5 on SWE-Bench and Terminal-Bench. Gemma 4 (26B and 31B) ships under Apache 2.0 and runs on consumer hardware.
Pay-as-you-go, no markup over the underlying provider rates. Cheapest first.
| Model | Input / 1M | Output / 1M | Context | Coding index |
|---|---|---|---|---|
| Google: Gemma 4 26B A4B (free) | Free | Free | 262K | — |
| Google: Gemma 4 26B A4B | $0.060 | $0.330 | 262K | — |
| Google: Gemma 4 31B | $0.120 | $0.350 | 262K | — |
| Google: Gemini 2.5 Flash | $0.300 | $2.50 | 1049K | — |
| Google: Gemini 3 Flash Preview | $0.500 | $3.00 | 1049K | — |
| Google: Gemini 3.5 Flash | $1.50 | $9.00 | 1049K | 70.1 |
| Google: Gemini 3 Pro Preview | $2.00 | $12.00 | 1049K | — |
| Google: Gemini 3.1 Pro Preview | $2.00 | $12.00 | 1049K | — |
Three ways: hosted in Kilo, locally on your hardware, or through your own provider keys.
The fastest path: install Kilo Code, sign in, pick google from the model picker. No API keys, no markup. Works in VS Code, JetBrains, Cursor, Windsurf, Trae, and the Kilo CLI.
See live model leaderboard →Already have an account with Google, OpenRouter, AWS Bedrock, Google Vertex, Together AI, or another compatible provider? Plug your key into Kilo Code and keep your existing billing relationship.
BYOK setup guide →Download Google open weights from Hugging Face and serve them with Ollama, LM Studio, vLLM, or SGLang. Connect Kilo Code to your local OpenAI-compatible endpoint and keep all prompts on hardware you control.
Local setup guide →See coding-model lineups from Google’s closest competitors.
Google’s flagship is among the strongest coding models on the Kilo Code leaderboard, ranked by Code, Plan, Ask, Debug, and Review usage.
VS Code, Cursor, Windsurf, Trae, JetBrains IDEs (IntelliJ, PyCharm, WebStorm, GoLand, RubyMine, Android Studio), and the Kilo CLI / terminal.
Switch between Google and 500+ other models with one click. Pay only for what you use, at the underlying provider rate.
Gemini 3.1 Pro is the strongest Google model for coding on Kilo Code, with the deepest reasoning and the largest context window. Gemini 3 Flash is the fast/cheap option for high-volume work. Gemma 4 31B is the best open-weight option you can run locally.
Every Google model works in all Kilo-supported surfaces: VS Code, Cursor, Windsurf, Trae, JetBrains IDEs, and the Kilo CLI. Gemma additionally runs on local hardware through any OpenAI-compatible runtime.
Gemini 3.1 Pro is $2.00 per million input and $12.00 per million output. Gemini 3 Flash is $0.50 / $3.00. Gemini 2.5 Flash is $0.30 / $2.50. Gemma 4 26B is free hosted; the paid variant is $0.06 / $0.33. Gemma weights are downloadable for free.
Mixed. Gemini is closed-source and API-only. Gemma is fully open-weight under Apache 2.0; you can download the 26B and 31B variants from Hugging Face and run them locally with Ollama, LM Studio, vLLM, or SGLang, or use them through Kilo Code’s hosted gateway.
Gemini 3.1 Pro is the flagship reasoning model. Gemini 3 Pro is the previous-generation flagship. Gemini 3 Flash is a smaller, faster, cheaper variant. Gemma 4 is a separate open-weight family aimed at local and edge deployment, with 26B and 31B sizes available.
Every Google model works in all Kilo-supported surfaces: VS Code, Cursor, Windsurf, Trae, JetBrains IDEs, and the Kilo CLI. Gemma additionally runs on local hardware through any OpenAI-compatible runtime.
Install Kilo Code and get instant access to Google: Gemini 3.5 Flash and 7 other Google models, plus 500+ frontier and open-source options. Free to start, no credit card required.