Pod Doctor

Diagnose CrashLoopBackOff and friends without the kubectl marathon

Give it a namespace, and optionally a pod name, and it runs the full kubectl diagnostic sequence for you (events, logs, describe, resource usage), then explains what's actually wrong in plain English. No more chaining five commands to find out you're missing an environment variable.

House Recipe · Work · 2 min

INGREDIENTS

🐙 GitHub · 💬 Slack

PROMPT

Create a skill called "Pod Doctor". When I give you a Kubernetes namespace and optionally a pod name, run this diagnostic sequence:

1. `kubectl get events --sort-by=.lastTimestamp -n <namespace>`
2. `kubectl get pods -n <namespace>` (or specific pod)
3. `kubectl describe pod <pod> -n <namespace>` for any unhealthy pods
4. `kubectl logs <pod> -n <namespace>` (and `--previous` if the pod restarted)
5. `kubectl top pod -n <namespace>` if metrics-server is available

Correlate the findings and explain in plain English:

- What's wrong (root cause)
- Why it's happening (the mechanism)
- How to fix it (exact YAML change or command)

Common patterns to check: OOMKilled, CrashLoopBackOff, ImagePullBackOff, CreateContainerConfigError, pending pods (scheduling failures), probe failures, and resource quota exhaustion.

How It Works

Pod Doctor runs the diagnostic sequence that experienced K8s engineers do from muscle memory: get events, describe pod, check logs (current and previous), inspect resource usage, and verify configs. Then it correlates everything and tells you the problem.
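For reference, the sequence described above could be scripted by hand roughly as follows. This is a minimal sketch, not what the skill actually executes: `pod_doctor` is a hypothetical function name, and it assumes kubectl is on your PATH and already pointed at the right cluster.

```shell
# Hypothetical helper: run the diagnostic sequence for a namespace and optional pod.
pod_doctor() {
  local ns="${1:-}" pod="${2:-}"
  if [ -z "$ns" ]; then
    echo "usage: pod_doctor <namespace> [pod]" >&2
    return 2
  fi

  # 1. Namespace events, sorted by last timestamp.
  kubectl get events --sort-by=.lastTimestamp -n "$ns"

  if [ -z "$pod" ]; then
    # 2. No pod given: list every pod so unhealthy ones can be picked out.
    kubectl get pods -n "$ns"
  else
    # 2-4. Status, describe output, and logs for the named pod.
    kubectl get pod "$pod" -n "$ns"
    kubectl describe pod "$pod" -n "$ns"
    kubectl logs "$pod" -n "$ns"
    # Previous-container logs only exist after a restart, so tolerate failure.
    kubectl logs "$pod" -n "$ns" --previous || true
  fi

  # 5. Resource usage; this fails when metrics-server is not installed.
  kubectl top pod -n "$ns" || echo "metrics unavailable (is metrics-server installed?)" >&2
}
```

Calling `pod_doctor my-namespace my-pod` prints the raw output of each step in order; correlating that output into a diagnosis is the part the skill adds on top.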

What You Get

  • Automated diagnostic sequence: events → describe → logs → resources → configs
  • Plain-English explanation of CrashLoopBackOff, OOMKilled, ImagePullBackOff, and other common failures
  • Root cause identification (missing secret, bad env var, resource limits, image not found, probe misconfiguration)
  • Suggested fixes with the exact YAML or command to run
  • Resource usage analysis (is the pod hitting its limits?)
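To illustrate the kind of mapping behind those explanations, here is a small sketch that turns a container status reason into a one-line hint. The function name `explain_reason` and the hint wording are assumptions made for this example; the real skill reasons over the full events, describe output, and logs rather than a fixed lookup.

```shell
# Hypothetical lookup from a container status reason to a first-pass hint.
explain_reason() {
  case "${1:-}" in
    OOMKilled)
      echo "Container exceeded its memory limit; raise resources.limits.memory or reduce memory use." ;;
    CrashLoopBackOff)
      echo "Container keeps exiting after start; check the previous container's logs for the crash." ;;
    ImagePullBackOff|ErrImagePull)
      echo "Image could not be pulled; verify the image name, tag, and image pull secret." ;;
    CreateContainerConfigError)
      echo "Container config is invalid; check that referenced ConfigMaps and Secrets exist." ;;
    *)
      echo "No canned hint for '${1:-}'; read the describe output and events." ;;
  esac
}

# Example of feeding it a live reason (needs cluster access, so left commented out):
# explain_reason "$(kubectl get pod <pod> -n <namespace> \
#   -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}')"
```

Note that OOMKilled usually appears under `lastState.terminated.reason` rather than `state.waiting.reason`, so a fuller version would check both fields.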

Setup Steps

  1. Make sure kubectl is configured and has access to your cluster
  2. Tell your Claw the namespace and pod name (or just the namespace to scan all pods)
  3. Review the diagnosis and apply the suggested fix

Tips

  • Works best when your Claw has kubectl access to the cluster
  • Can scan an entire namespace for unhealthy pods in one pass
  • For intermittent issues, ask it to watch events over a time window
  • Pairs well with the K8s YAML Generator for producing corrected manifests
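
For the intermittent-issue tip, watching events over a time window might look like the sketch below if done by hand. `watch_events` is a hypothetical helper name, and the 60-second default is an arbitrary choice for this example; it streams only Warning events and stops when the window elapses.

```shell
# Hypothetical helper: stream Warning events in a namespace for a fixed window.
watch_events() {
  local ns="${1:-}" seconds="${2:-60}"
  if [ -z "$ns" ]; then
    echo "usage: watch_events <namespace> [seconds]" >&2
    return 2
  fi

  # Stream matching events in the background as they arrive.
  kubectl get events -n "$ns" --field-selector type=Warning --watch &
  local pid=$!

  # Keep the watch open for the requested window, then stop it.
  sleep "$seconds"
  kill "$pid" 2>/dev/null || true
  wait "$pid" 2>/dev/null || true
}
```

For example, `watch_events my-namespace 300` keeps the watch open for five minutes, which can be enough to catch a probe failure or restart that only happens occasionally.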
Tags: #kubernetes #debugging #devops #troubleshooting