Introduction

Welcome to the Scraping Pros API documentation.

Features

  • Universal scraping — Retrieve data from any website, including those with CAPTCHAs, dynamic JavaScript, or blocking mechanisms.
  • Synchronous and asynchronous modes — Scrape synchronously (immediate result) or asynchronously (URL collections processed in the background).
  • Browser or plain HTTP — Use a headless browser for dynamic sites, or direct HTTP for maximum speed. Two browser modes are supported: a heavy browser (Camoufox, with anti-detection) and a lightweight browser (pure Playwright).
  • Markdown output — format=markdown returns clean text without scripts, styles, or navigation. Ideal for AI/LLM consumption and RAG pipelines.
  • Automatic retries — Retry system with proxy rotation on failures.
  • Retry on block — retry_on_block=true automatically retries up to 3 times with a different IP/fingerprint when a CAPTCHA or 403 is detected. Credits are charged only for the successful attempt.
  • Early CAPTCHA detection — If the site presents a CAPTCHA or block, the response is returned in ~5 seconds (instead of 60-85s). Applies to all plans automatically.
  • Smart proxies — Automatic proxy rotation with per-country support (200+ countries).
  • Webhooks — Set callback_url on async collections to receive a signed POST notification (HMAC-SHA256) when a run completes. The notification includes run_id, status, counters, and job_ids.
  • Browser actions — Interact with the page: clicks, inputs, selects, key presses, waits, conditional loops, and JavaScript execution (evaluate).
  • Data extraction — Extract specific data with CSS/XPath selectors directly from the API, without needing to parse HTML.
  • JavaScript execution (evaluate) — Execute arbitrary JS code in the page context to access data, manipulate the DOM, or trigger AJAX forms.
  • Network capture (network_capture) — Capture XHR/fetch requests made by the page to discover internal APIs and data endpoints.
  • File download — Download PDFs, images, and other files, returning their content in base64.
  • Screenshots — Capture screenshots of scraped pages.
  • Custom headers and cookies — Send custom HTTP headers and cookies with requests.
  • Block detection — The potentiallyBlockedByCaptcha field indicates whether the response appears to be a CAPTCHA or block page.
  • Feasibility test — Analyze URLs before scraping them to determine the recommended scraping strategy.
  • Credit system — 1 simple request = 1 credit, 1 browser request = 5 credits. Anti-bot and proxy included. No credits are charged for infrastructure errors.
  • Timings always present — Every response includes timings (even on errors) for performance diagnostics.
  • Metrics and billing — Per-client usage metrics and monthly billing endpoints with breakdown by domain and credits.
  • Plans — GET /v1/plans (no auth) shows all plans with pricing, credits, and features.
  • MCP Server — Model Context Protocol server for AI agents (Claude, GPT, Cursor) with 6 tools and anti-injection protection.
  • Health check — Monitoring endpoint that verifies the status of all API components (no authentication required).
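
The webhook feature above mentions an HMAC-SHA256 signature on the POST notification. The following is a minimal sketch of how a receiver might check such a signature using only the Python standard library. It assumes the signature is a hex digest computed over the raw request body with a shared secret; the encoding, the header that carries the signature, and the secret itself are not specified in this section, so treat those details as assumptions to confirm against the webhook reference.

```python
import hashlib
import hmac


def sign_payload(secret: str, raw_body: bytes) -> str:
    """Return the hex HMAC-SHA256 digest of the raw request body.

    Assumption: the service signs the unmodified body bytes and sends
    the digest as a hex string. Verify this before relying on it.
    """
    return hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()


def is_valid_signature(secret: str, raw_body: bytes, received_signature: str) -> bool:
    """Compare the expected and received signatures in constant time."""
    expected = sign_payload(secret, raw_body)
    return hmac.compare_digest(expected, received_signature)


# Illustrative payload; the field names follow the feature list,
# the values and the secret are made up.
body = b'{"run_id": "run_123", "status": "completed", "job_ids": ["job_1"]}'
signature = sign_payload("my-webhook-secret", body)

print(is_valid_signature("my-webhook-secret", body, signature))          # True
print(is_valid_signature("my-webhook-secret", body + b" ", signature))   # False
```

Verifying against the raw bytes, before any JSON parsing, avoids mismatches caused by re-serialization, and `hmac.compare_digest` avoids leaking information through comparison timing.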

Authentication

All endpoints require an authentication token sent in the Authorization header:

Authorization: Bearer <API-KEY>

The exceptions are GET /v1/health, GET /v1/plans, GET /llms.txt, and GET /, which do not require authentication.
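
As a rough illustration of the rule above, the sketch below builds a request with the standard library and attaches the Bearer token only when the path is not one of the public endpoints. The base URL `https://api.example.com` and the `/v1/scrape` path are placeholders that do not come from this documentation; substitute the real values. The request is only constructed, not sent.

```python
from typing import Optional
from urllib.request import Request

# Placeholder: the real base URL is not stated in this section.
BASE_URL = "https://api.example.com"

# Endpoints listed as not requiring authentication.
PUBLIC_PATHS = {"/", "/v1/health", "/v1/plans", "/llms.txt"}


def build_request(path: str, api_key: Optional[str] = None) -> Request:
    """Build a GET request, adding the Authorization header when required."""
    headers = {}
    if path not in PUBLIC_PATHS:
        if not api_key:
            raise ValueError(f"{path} requires an API key")
        headers["Authorization"] = f"Bearer {api_key}"
    return Request(BASE_URL + path, headers=headers, method="GET")


# "/v1/scrape" is a hypothetical path used only for illustration.
req = build_request("/v1/scrape", api_key="<API-KEY>")
print(req.get_header("Authorization"))

# Public endpoints need no token.
health = build_request("/v1/health")
print(health.get_header("Authorization"))
```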

Demo token (no registration required): demo_6x595maoA6GdOdVb — 5,000 credits/month, 30 req/min. All features enabled except per-country proxies.

Free plan: 1,000 credits/month (200 browser requests or 1,000 simple requests). For production use, contact the Scraping Pros team to get an API key with higher limits; paid plans start with Starter at $29/month.
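
The credit figures quoted in this section follow from the per-request costs in the feature list (1 credit per simple request, 5 per browser request). The sketch below shows that arithmetic; the function names are illustrative, and it assumes no other feature changes the per-request cost, which this section does not confirm.

```python
SIMPLE_CREDITS = 1   # 1 simple (plain HTTP) request = 1 credit
BROWSER_CREDITS = 5  # 1 browser request = 5 credits


def estimate_credits(simple_requests: int, browser_requests: int) -> int:
    """Estimate credits for a mix of successful simple and browser requests."""
    return simple_requests * SIMPLE_CREDITS + browser_requests * BROWSER_CREDITS


def max_browser_requests(monthly_credits: int) -> int:
    """How many browser requests fit in a monthly credit allowance."""
    return monthly_credits // BROWSER_CREDITS


print(estimate_credits(400, 100))   # 900
print(max_browser_requests(1000))   # 200 (free plan)
print(max_browser_requests(5000))   # 1000 (demo token)
```

Because no credits are charged for infrastructure errors, and blocked attempts under retry_on_block are not billed, an estimate like this is an upper bound on planned successful requests rather than on attempts.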