Flaky Root Cause Hunter

Find the real source of nondeterminism, not just the symptom

Diagnose flaky tests by categorizing nondeterminism (timing, order-dependence, shared state, network, resource contention) and applying targeted fixes.

Submitted by Community · Work · 12 min

INGREDIENTS

🐙 GitHub

PROMPT

Create a skill called "Flaky Root Cause Hunter".

Given:
- A flaky test name/file and recent failure logs
- The test type (unit/integration/e2e) and environment (CI/local)

Output:
- A likely root-cause category and how to confirm it
- Concrete fix patterns for that category
- A verification plan (stress reruns, seed capture, isolation)

How It Works

This recipe structures flake debugging so teams stop inflating timeouts and start removing nondeterminism from tests and environments.

Triggers

Quarantined flaky tests accumulate
Retries/timeouts are used as the primary fix
Failures are hard to reproduce locally

Steps

1. Re-run the single test repeatedly and record the failure rate and the failure modes.
2. Categorize the flake:
   - race/timing
   - order dependence
   - shared global state
   - network/IO instability
   - resource contention
3. Apply the fix that matches the category:
   - hermetic test data
   - explicit waits and deterministic clocks
   - isolated shared state
   - no external network dependencies, or safe mocks in their place
4. Add instrumentation to tests (timestamps, retry counts, random seeds).
5. Confirm the fix with stress reruns, then remove the quarantine tag.

Expected Outcome

A lower flake rate and restored trust in CI.
Less time spent rerunning, more time shipping.

Example Inputs

"This test fails 1/20 runs only on CI."
"E2E flake occurs when CI is slow."
"Order-dependent failures in a shared DB test suite."

Tips

If you can't explain why the test failed, you haven't fixed the flake yet.
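
A timing flake becomes much easier to explain once the test controls time itself. As one possible illustration of the "deterministic clocks" fix pattern, the sketch below injects the clock into the code under test instead of sleeping. `FakeClock` and `TokenCache` are hypothetical names invented for this example; the only real API assumed is `time.monotonic` as the production default.

```python
import time


class FakeClock:
    """Manually advanced clock; a test stand-in for time.monotonic."""

    def __init__(self, start=0):
        self._now = start

    def now(self):
        return self._now

    def advance(self, seconds):
        self._now += seconds


class TokenCache:
    """Hypothetical code under test: a value that expires after `ttl` seconds."""

    def __init__(self, ttl, now=time.monotonic):
        self._ttl = ttl
        self._now = now  # injected clock; production uses time.monotonic
        self._value = None
        self._stored_at = None

    def put(self, value):
        self._value = value
        self._stored_at = self._now()

    def get(self):
        if self._stored_at is None:
            return None
        if self._now() - self._stored_at >= self._ttl:
            return None  # expired
        return self._value


def test_token_expires_after_ttl():
    clock = FakeClock()
    cache = TokenCache(ttl=30, now=clock.now)
    cache.put("abc")

    clock.advance(29)  # just before expiry: no sleep, no dependence on CI speed
    assert cache.get() == "abc"

    clock.advance(1)   # exactly at the TTL boundary
    assert cache.get() is None
```

Because the test never sleeps, its result does not change on a slow CI runner, and the expiry boundary is checked exactly rather than approximately.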

Tags: #flaky-tests #testing #debugging #ci-cd
