
API Plumber

Build reliable data extraction scripts from any API

Give it an API's documentation URL and describe what data you need. Get back a reliable extraction script with pagination, rate limiting, retry logic, authentication, and incremental loading — the boilerplate that takes hours to get right.

Submitted by Community · Work · 5 min setup

PROMPT

Create a skill called "API Plumber". When I give you an API (either a documentation URL or a description of the endpoints), generate a production-grade Python extraction script. The script must handle: (1) Authentication (API key, OAuth2 with token refresh, basic auth). (2) Pagination (detect and implement the correct scheme — cursor, offset, or link-header based). (3) Rate limiting (respect rate limit headers, implement exponential backoff with jitter). (4) Error handling (retry transient errors, save progress on failure, resume from checkpoint). (5) Incremental loading (track the last extracted timestamp/ID, only fetch new data on subsequent runs). (6) Output (configurable: CSV, JSON, Parquet, or direct INSERT to a database). Include logging, progress reporting, and a dry-run mode. Make the script configurable via environment variables for credentials.

How It Works

Building API extraction is tedious plumbing: pagination schemes differ, rate limits require exponential backoff, auth tokens expire mid-pull, and every API has its quirks. This skill generates extraction scripts that handle all the edge cases from the start.
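To make "exponential backoff" concrete, here is a minimal sketch of the retry loop such a script typically contains. The `fetch` callable and the `(status, body)` return shape are assumptions for illustration, not a specific API's contract; full jitter (a random delay within the doubling window) is one common strategy.

```python
import random
import time

def retry_with_backoff(fetch, max_retries=5, base=1.0, cap=60.0,
                       transient=(429, 500, 502, 503, 504)):
    """Call fetch() until it returns a non-transient status.

    Between attempts, sleep for a random delay drawn from
    [0, min(cap, base * 2**attempt)] -- exponential backoff with full
    jitter, so concurrent clients don't retry in lockstep.
    """
    for attempt in range(max_retries):
        status, body = fetch()
        if status not in transient:
            return status, body
        delay = random.uniform(0, min(cap, base * 2 ** attempt))
        time.sleep(delay)
    return status, body  # last result after exhausting retries
```

A production version would also honor the server's `Retry-After` header when present instead of relying on the computed delay alone.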

What You Get

  • A complete Python extraction script from an API description
  • Proper pagination handling (cursor-based, offset-based, link-header)
  • Rate limiting with exponential backoff and retry logic
  • Authentication token refresh during long-running pulls
  • Incremental loading (only fetch new/updated records)
  • Output to CSV, JSON, Parquet, or direct database loading
  • Error handling with partial progress save (resume from where it left off)
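The pagination handling in the list above usually reduces to a loop like the following. This is a hedged sketch of the cursor-based case: the `{"data": [...], "next_cursor": ...}` response shape and the `get_page` callable are hypothetical stand-ins for whatever the real API returns.

```python
def paginate(get_page, page_size=100):
    """Drain a cursor-paginated endpoint, yielding one record at a time.

    get_page(cursor, limit) is assumed to return a dict shaped like
    {"data": [...], "next_cursor": "..."}; a missing/None cursor marks
    the final page. Offset- and link-header schemes follow the same
    loop with a different 'where do I go next' rule.
    """
    cursor = None
    while True:
        page = get_page(cursor, page_size)
        yield from page["data"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break
```

Generating records lazily like this keeps memory flat even on multi-million-row pulls, since output writing and checkpointing can happen per page rather than after the whole extraction.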

Setup Steps

  1. Ask your Claw to create an "API Plumber" skill with the prompt below
  2. Provide the API documentation URL or describe the endpoints
  3. Specify what data you need and where to put it
  4. Get back a production-ready extraction script
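The generated script reads credentials and options from environment variables, per the prompt. As a sketch of what that configuration layer might look like (the variable names here are illustrative, not ones the skill is guaranteed to emit):

```python
import os

def load_config():
    """Build runtime config from environment variables.

    API_KEY is required; the rest fall back to sensible defaults.
    Keeping secrets out of the script itself makes it safe to
    commit and to schedule.
    """
    api_key = os.environ.get("API_KEY")
    if not api_key:
        raise SystemExit("API_KEY environment variable is required")
    return {
        "api_key": api_key,
        "base_url": os.environ.get("API_BASE_URL", "https://api.example.com"),
        "output_format": os.environ.get("OUTPUT_FORMAT", "csv"),
        "dry_run": os.environ.get("DRY_RUN", "0") == "1",
    }
```

With `DRY_RUN=1` the script would log what it plans to fetch and write without touching the destination, which is worth exercising before the first full pull.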

Tips

  • Always request incremental loading — for most APIs, re-extracting the full dataset on every run wastes time and rate-limit budget
  • The resume-on-failure feature is critical for large extractions (millions of records)
  • Facebook/Meta Ads, Google Analytics, and Salesforce APIs have notoriously tricky pagination — the skill handles them
  • Schedule the script with your Claw for regular data pulls
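The resume-on-failure and incremental-loading tips both come down to persisting a watermark (last extracted ID or timestamp) between runs. A minimal checkpoint sketch, assuming a simple JSON file — the file name and `last_id` key are illustrative:

```python
import json
import os

def load_checkpoint(path="checkpoint.json"):
    """Return the last extracted ID, or None if no checkpoint exists
    (i.e., this is the first run and a full pull is needed)."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f).get("last_id")
    return None

def save_checkpoint(last_id, path="checkpoint.json"):
    """Persist progress atomically: write to a temp file, then rename,
    so a crash mid-write can't leave a corrupt checkpoint behind."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"last_id": last_id}, f)
    os.replace(tmp, path)
```

Saving the checkpoint after each page (rather than once at the end) is what lets a multi-million-record extraction resume from where it left off instead of starting over.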
Tags: #api #data-extraction #etl #automation