Back to Cookbook
KiloClaw

CI Failure Triage Bot

Separate flakes from regressions and route the fix

Turn CI failures into categorized, actionable tickets by detecting flaky behavior, extracting failure fingerprints, and recommending next actions with ownership.

CommunitySubmitted by CommunityWork10 min

INGREDIENTS

🐙GitHub

PROMPT

Create a skill called "CI Failure Triage Bot". Given a CI job link or logs, the skill must: - Detect whether this failure is likely flaky (use rerun outcomes if available) - Extract a stable failure fingerprint (error lines, stack trace hash) - Classify into: flaky test, infra/network, dependency resolution, deterministic regression - Recommend next actions and propose an owner/team - Output a GitHub issue-ready summary (title, repro, evidence, next steps) Keep it concise but evidence-driven.

How It Works

This recipe classifies failures into: flaky tests, infra/network, dependency resolution,

or deterministic code regressions, then proposes fixes and ownership.

Triggers

  • CI fails intermittently and reruns sometimes pass
  • Teams routinely "just rerun CI"
  • CI failures block PR merges for hours or days

Steps

  1. Collect the failing job, logs, and previous run history.
  2. Determine "flake likelihood":
  • outcome differs after rerun without code change,
  • failure signature matches known flaky patterns.
  1. Categorize:
  • test flake,
  • infra/network,
  • dependency resolution,
  • true regression.
  1. Produce an action plan:
  • quarantine + root cause for flakes,
  • caching/fetch retry for dependencies,
  • code fix + regression test for true failures.
  1. Open (or update) a tracking issue with the failure fingerprint and ownership.

Expected Outcome

  • CI failures stop being "mystery events" and become routable work.
  • Reduced wasted time rerunning pipelines and reading logs manually.

Example Inputs

  • "This PR fails on CI only sometimes; here's the job URL."
  • "Nightly main pipeline shows 10 flaky tests."
  • "Dependency download times out randomly."

Tips

  • Rerun is a signal, not a solution: capture the delta and classify the cause.
Tags:#ci-cd#flaky-tests#build-failures#developer-productivity