
Right Sizer

Stop guessing CPU and memory limits — let actual usage data decide

Analyzes your pod resource usage over time and recommends properly sized requests and limits. Finds the pods running at 2% CPU with 4 cores requested, and the ones getting OOMKilled because limits are too low.

House Recipe · Work · 2 min

INGREDIENTS

GitHub, Slack

PROMPT

Create a skill called "Right Sizer". Analyze Kubernetes pod resource usage and recommend proper sizing:

1. Run `kubectl top pods -n <namespace>` for current usage
2. Get current requests/limits from deployment specs
3. If Prometheus is available, query historical CPU and memory usage (p50, p95, p99)
4. Compare requested vs. actual for each pod/deployment

For each deployment, recommend:

- CPU request: p95 actual usage + 20% headroom
- CPU limit: p99 actual usage + 50% headroom (or no limit for non-critical)
- Memory request: p95 actual usage + 20% headroom
- Memory limit: p99 actual usage + 30% headroom (memory is less elastic than CPU)

Flag:

- Over-provisioned: request > 2x actual p95 usage
- Under-provisioned: actual p95 > 80% of limit
- No limits set: resource contention risk

Generate updated deployment YAMLs with the recommended values. Estimate the cost savings from right-sizing.
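
The comparison step boils down to lining up usage against requests. A minimal sketch against sample data in the shape of `kubectl top pods` output — all pod names and numbers below are made up:

```shell
#!/bin/sh
# Illustrative data: current CPU usage (from `kubectl top pods -n <namespace>`)
# paired with the CPU requests pulled from the deployment specs.
cat > /tmp/usage.txt <<'EOF'
pod            cpu_used_m  cpu_requested_m
api-7f9c       45          2000
worker-5d2a    380         500
cache-9b1e     120         0
EOF

# Flag over-provisioning (request > 2x actual) and missing requests.
awk 'NR > 1 {
  if ($3 == 0)           print $1 ": no request set (contention risk)"
  else if ($3 > 2 * $2)  print $1 ": over-provisioned (" $2 "m used vs " $3 "m requested)"
  else                   print $1 ": looks reasonable"
}' /tmp/usage.txt
```

With real data, the over-provisioned check would use p95 usage from Prometheus rather than a single `kubectl top` snapshot.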

How It Works

Most Kubernetes pods are over-provisioned by 2-5x because nobody wants to risk an OOMKill. This skill analyzes actual usage and recommends limits that are safe but not wasteful.
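
The sizing rules themselves (p95 + 20% headroom for requests, p99 + 50%/30% for limits) are simple arithmetic. A sketch with made-up percentile numbers for one deployment:

```shell
#!/bin/sh
# Hypothetical observed usage, as it might come back from Prometheus:
CPU_P95=200   # millicores
CPU_P99=300   # millicores
MEM_P95=400   # MiB
MEM_P99=450   # MiB

# Recommendations per the headroom rules (integer arithmetic, rounded down):
echo "cpu request:  $(( CPU_P95 * 120 / 100 ))m"   # p95 + 20% headroom
echo "cpu limit:    $(( CPU_P99 * 150 / 100 ))m"   # p99 + 50% headroom
echo "mem request:  $(( MEM_P95 * 120 / 100 ))Mi"  # p95 + 20% headroom
echo "mem limit:    $(( MEM_P99 * 130 / 100 ))Mi"  # p99 + 30% headroom
```

For these inputs that yields a 240m/450m CPU request/limit and a 480Mi/585Mi memory request/limit.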

What You Get

  • Requested vs. actual resource usage for all pods in a namespace
  • Over-provisioned pods (wasting money) with recommended lower limits
  • Under-provisioned pods (risk of OOMKill/throttling) with recommended higher limits
  • Pods with no limits set (resource contention risk)
  • Total cluster waste estimate in dollars
  • Updated deployment YAMLs with recommended values
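
The waste estimate is a rough multiplication, not a bill. A sketch assuming made-up namespace totals and a placeholder per-core monthly price (actual pricing varies widely by provider):

```shell
#!/bin/sh
# Illustrative totals across a namespace: what's requested vs. what
# right-sizing would recommend. All numbers here are assumptions.
REQUESTED_CORES=40
RECOMMENDED_CORES=12
PRICE_PER_CORE_MONTH=25   # USD -- placeholder, check your provider's rates

SAVED_CORES=$(( REQUESTED_CORES - RECOMMENDED_CORES ))
echo "Estimated CPU savings: \$$(( SAVED_CORES * PRICE_PER_CORE_MONTH ))/month"
```

A fuller estimate would do the same for memory and account for node bin-packing, since savings only materialize when freed capacity lets nodes scale down.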

Setup Steps

  1. Ensure metrics-server is running in your cluster
  2. Tell your Claw which namespace(s) to analyze
  3. Review the recommendations
  4. Apply changes gradually (start with the most over-provisioned)

Tips

  • Metrics-server gives real-time snapshots; Prometheus gives historical data (better for sizing)
  • Always set requests (scheduling guarantee) even if you're flexible on limits
  • Watch for burstable workloads — they need higher limits than average usage suggests
  • Apply changes during low-traffic periods and monitor for a few days
  • Consider VPA (Vertical Pod Autoscaler) for automatic ongoing adjustment
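
If you go the VPA route from the last tip, a minimal manifest looks roughly like this (deployment name and namespace are placeholders, and the VPA components must already be installed in the cluster):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa            # placeholder name
  namespace: default       # placeholder namespace
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web              # placeholder deployment
  updatePolicy:
    updateMode: "Off"      # recommendation-only; review before applying
```

`updateMode: "Off"` keeps VPA in recommendation-only mode, which pairs well with this skill's review-before-apply workflow.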
Tags: #kubernetes #cost-optimization #resources #devops