CI Failure Triage Bot
Separate flakes from regressions and route the fix
Turn CI failures into categorized, actionable tickets by detecting flaky behavior, extracting failure fingerprints, and recommending next actions with ownership.
PROMPT
Create a skill called "CI Failure Triage Bot". Given a CI job link or logs, the skill must:
- Detect whether this failure is likely flaky (use rerun outcomes if available)
- Extract a stable failure fingerprint (error lines, stack trace hash)
- Classify into: flaky test, infra/network, dependency resolution, deterministic regression
- Recommend next actions and propose an owner/team
- Output a GitHub issue-ready summary (title, repro, evidence, next steps)
Keep it concise but evidence-driven.
How It Works
This recipe classifies each failure as a flaky test, an infra/network issue, a dependency
resolution problem, or a deterministic code regression, then proposes a fix and an owner.
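The four-way classification can be sketched as a small pattern matcher over the log text. The pattern set and category strings below are illustrative assumptions, not a prescribed rule set; a real bot would maintain a tuned, repo-specific list:

```python
import re

# Illustrative patterns per category (assumption: not from the source recipe).
# "deterministic regression" is the fallback when nothing else matches.
CATEGORY_PATTERNS = {
    "infra/network": [
        r"connection (reset|refused|timed out)",
        r"503 service unavailable",
    ],
    "dependency resolution": [
        r"could not resolve dependency",
        r"checksum mismatch",
    ],
    "flaky test": [
        r"timed? ?out waiting for",
        r"port already in use",
    ],
}

def classify(log_text: str) -> str:
    """Return the first category whose pattern matches; default to regression."""
    lowered = log_text.lower()
    for category, patterns in CATEGORY_PATTERNS.items():
        if any(re.search(p, lowered) for p in patterns):
            return category
    return "deterministic regression"

print(classify("ERROR: connection timed out while fetching artifact"))
```

Ordering matters: infra patterns are checked before flake patterns so a network timeout is not misread as a slow test.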
Triggers
- CI fails intermittently and reruns sometimes pass
- Teams routinely "just rerun CI"
- CI failures block PR merges for hours or days
Steps
- Collect the failing job, logs, and previous run history.
- Determine "flake likelihood":
- the outcome differs after a rerun with no code change,
- the failure signature matches known flaky patterns.
- Categorize:
- test flake,
- infra/network,
- dependency resolution,
- true regression.
- Produce an action plan:
- quarantine + root cause for flakes,
- caching/fetch retry for dependencies,
- code fix + regression test for true failures.
- Open (or update) a tracking issue with the failure fingerprint and ownership.
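The fingerprint and issue-summary steps above can be sketched as follows. The helper names and the normalization rules (masking timestamps, addresses, and line numbers so reruns of the same failure hash identically) are assumptions for illustration, not a prescribed format:

```python
import hashlib
import re

def fingerprint(log_text: str) -> str:
    """Hash error lines after stripping run-specific noise, so the same
    underlying failure yields the same fingerprint across reruns."""
    error_lines = [l for l in log_text.splitlines()
                   if re.search(r"error|exception|failed", l, re.I)]
    normalized = []
    for line in error_lines:
        line = re.sub(r"0x[0-9a-f]+", "0xADDR", line)                    # memory addresses
        line = re.sub(r"\d{4}-\d{2}-\d{2}[T ][\d:.]+Z?", "TIME", line)   # timestamps
        line = re.sub(r":\d+", ":LINE", line)                            # source line numbers
        normalized.append(line.strip())
    return hashlib.sha256("\n".join(normalized).encode()).hexdigest()[:12]

def issue_summary(category: str, fp: str, evidence: list[str]) -> str:
    """Render a GitHub-issue-ready body: title line, evidence, next steps."""
    lines = [f"[CI triage] {category} (fingerprint {fp})", "", "Evidence:"]
    lines += [f"- {e}" for e in evidence]
    lines += ["", "Next steps: follow the action plan for this category."]
    return "\n".join(lines)
```

Because the fingerprint is stable, "open or update" becomes a lookup: search existing issues for the fingerprint string and append evidence instead of filing a duplicate.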
Expected Outcome
- CI failures stop being "mystery events" and become routable work.
- Reduced wasted time rerunning pipelines and reading logs manually.
Example Inputs
- "This PR fails on CI only sometimes; here's the job URL."
- "Nightly main pipeline shows 10 flaky tests."
- "Dependency download times out randomly."
Tips
- Rerun is a signal, not a solution: capture the delta and classify the cause.
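The rerun delta from this tip can be captured mechanically. A minimal sketch, assuming run history is available as (commit_sha, passed) pairs in chronological order (that input shape is an assumption, not from the source):

```python
def flake_likelihood(runs: list[tuple[str, bool]]) -> float:
    """Fraction of same-commit rerun pairs whose outcome flipped.

    A flip between two runs of the identical commit means the result changed
    with no code change: the core flake signal this recipe relies on.
    """
    same_commit_pairs = flips = 0
    for (sha_a, passed_a), (sha_b, passed_b) in zip(runs, runs[1:]):
        if sha_a == sha_b:              # rerun without a code change
            same_commit_pairs += 1
            if passed_a != passed_b:    # outcome flipped between reruns
                flips += 1
    return flips / same_commit_pairs if same_commit_pairs else 0.0

history = [("abc", False), ("abc", True), ("abc", True), ("def", False)]
print(flake_likelihood(history))  # 0.5: one flip across two same-commit reruns
```

A score near 0.0 with consistent failures on one commit points at a deterministic regression; anything well above zero is evidence for quarantine plus root-cause work.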