Skip to main content

Using Cerebras With Kilo Code

Cerebras is known for their ultra-fast AI inference powered by the Cerebras CS-3 chip, the world's largest and fastest AI accelerator. Their platform delivers exceptional inference speeds for large language models, making them ideal for interactive development workflows.

Website: https://cerebras.ai/

Getting an API Key

  1. Sign Up/Sign In: Go to the Cerebras Cloud Platform. Create an account or sign in.
  2. Navigate to API Keys: Access the API Keys section in your account dashboard.
  3. Create a Key: Click to generate a new API key. Give it a descriptive name (e.g., "Kilo Code").
  4. Copy the Key: Important: Copy the API key immediately. Store it securely.

Supported Models

Kilo Code supports the following Cerebras models:

  • gpt-oss-120b (Default) – High-performance open-source model optimized for fast inference
  • zai-glm-4.6 – Advanced GLM model with enhanced reasoning capabilities

Refer to the Cerebras documentation for detailed information on model capabilities and performance characteristics.

Configuration in Kilo Code

  1. Open Kilo Code Settings: Click the gear icon () in the Kilo Code panel.
  2. Select Provider: Choose "Cerebras" from the "API Provider" dropdown.
  3. Enter API Key: Paste your Cerebras API key into the "Cerebras API Key" field.
  4. Select Model: Choose your desired model from the "Model" dropdown.

Tips and Notes

  • Inference Speed: Cerebras models deliver some of the fastest inference speeds available, reducing wait times during development.
  • Open Source Models: Many Cerebras models are based on open-source architectures, optimized for their custom hardware.
  • Cost Efficiency: Fast inference can lead to better cost efficiency for interactive use cases.
  • Pricing: Refer to the Cerebras platform for current pricing information and available plans.