Error Handling

Exception hierarchy

ScrapingProsError                    base for all SDK errors
├── APIError                         HTTP error (has .status_code, .detail)
│   ├── AuthenticationError          401: invalid token
│   ├── CountryProxyNotApproved      403: country proxy not approved
│   ├── URLBlockedError              422: SSRF protection (private URLs)
│   ├── ValidationError              422: invalid parameters
│   ├── RateLimitError               429: rate limit (has .retry_after)
│   └── QuotaExceededError           429: monthly credits exhausted
├── ConnectionError                  network error (DNS, connection refused)
└── TimeoutError                     polling timeout (run_and_wait)
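
Because every error derives from ScrapingProsError, the order of except clauses decides which handler runs: the most specific class has to come first. The sketch below illustrates this with stand-in classes that mirror part of the tree above. The constructor signatures and the classify helper are assumptions made for the illustration, not the SDK's actual definitions.

```python
# Stand-in classes mirroring part of the hierarchy above.
# The real classes come from the SDK; these constructors are assumptions.
class ScrapingProsError(Exception):
    """Base for all SDK errors."""


class APIError(ScrapingProsError):
    def __init__(self, status_code, detail):
        super().__init__(f"{status_code}: {detail}")
        self.status_code = status_code
        self.detail = detail


class RateLimitError(APIError):
    def __init__(self, detail, retry_after):
        super().__init__(429, detail)
        self.retry_after = retry_after


class QuotaExceededError(APIError):
    def __init__(self, detail):
        super().__init__(429, detail)


def classify(exc):
    """Return a label for the first except clause that matches `exc`."""
    try:
        raise exc
    except RateLimitError:
        return "rate_limit"
    except QuotaExceededError:
        return "quota"
    except APIError:
        return "api"
    except ScrapingProsError:
        return "sdk"
```

If the APIError clause were listed before RateLimitError, it would catch both 429 errors and the more specific handlers would never run.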

Catching errors

from scrapingpros import (
    ScrapingPros,
    ScrapingProsError,
    AuthenticationError,
    RateLimitError,
    QuotaExceededError,
    URLBlockedError,
    ConnectionError,  # note: shadows Python's built-in ConnectionError in this module
)

# `client` is an already-configured ScrapingPros instance.
try:
    result = client.scrape("https://example.com")
except AuthenticationError:
    # Invalid token: get one at https://scrapingpros.com
    pass
except RateLimitError as e:
    # Rate limited: the SDK already auto-retried, so all retries are exhausted
    print(f"Retry after {e.retry_after}s")
except QuotaExceededError:
    # Monthly credits exhausted: upgrade your plan
    pass
except URLBlockedError:
    # URL blocked by SSRF protection: only public URLs are allowed
    pass
except ConnectionError:
    # Network error: DNS failure, connection refused, etc.
    pass
except ScrapingProsError:
    # Catch-all for any other SDK error; keep this clause last
    pass

Auto-retry behavior

The SDK automatically retries on:

  • 429 rate limit: respects the Retry-After header, exponential backoff with jitter
  • 5xx server errors: up to max_retries attempts (default: 3)

The SDK never retries on:

  • 429 quota exceeded: raises QuotaExceededError immediately
  • other 4xx client errors: raises immediately (bad params, auth, etc.)
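
The retry rules above can be sketched as two small functions. This is an illustration of exponential backoff with jitter in general, not the SDK's internal code: the names backoff_delay and should_retry, the base and cap values, the "full jitter" variant, and the choice to let Retry-After override the computed delay are all assumptions.

```python
import random


def backoff_delay(attempt, retry_after=None, base=1.0, cap=30.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based).

    Assumption: a server-provided Retry-After value takes precedence;
    otherwise the delay is drawn uniformly from [0, min(cap, base * 2**attempt)).
    """
    if retry_after is not None:
        return float(retry_after)
    ceiling = min(cap, base * (2 ** attempt))
    return ceiling * rng()  # "full jitter": spread clients across the window


def should_retry(status_code, quota_exhausted=False):
    """Mirror the lists above: retry 429 rate limits and 5xx, nothing else."""
    if status_code == 429:
        return not quota_exhausted  # quota exhaustion is never retried
    return 500 <= status_code <= 599
```

Jitter matters when many clients are throttled at once: without it they would all retry at the same instant and hit the limit again.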

Response Guidance

Every scrape response includes a guidance object with structured error analysis:

result = client.scrape("https://hard-site.com")
g = result.guidance

# Always present
print(g.success)          # True if usable content was retrieved
print(g.credits_charged)  # 1 (simple), 5 (browser), or 0 (refunded)

When a scrape fails

g = result.guidance
if not g.success:
    print(g.error_type)        # "captcha", "ssl_error", "timeout", "login_wall", "blocked"
    print(g.error_provider)    # "cloudflare", "datadome", "recaptcha"
    print(g.next_steps)        # ["Try with browser=true", "Try retry_on_block=true"]
    print(g.credits_refunded)  # True if credits were returned

    # A stop_reason means retrying won't help
    if g.stop_reason:
        print("Give up:", g.stop_reason)

    # suggested_request has ready-to-use params
    elif g.suggested_request:
        result = client.scrape(**g.suggested_request)
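
The branch above retries once. A bounded loop that keeps following suggested_request until the scrape succeeds, a stop_reason appears, or the attempts run out might look like the sketch below. The helper scrape_with_guidance and its max_attempts parameter are hypothetical, and fake_scrape is a stand-in for client.scrape so that the example runs without network access.

```python
from types import SimpleNamespace


def scrape_with_guidance(scrape, url, max_attempts=3):
    """Call `scrape(url)`, then follow guidance.suggested_request while it helps.

    `scrape` is any callable shaped like client.scrape (an assumption).
    """
    result = scrape(url)
    for _ in range(max_attempts - 1):
        g = result.guidance
        # Stop on success, on a stop_reason, or when there is nothing to try.
        if g.success or g.stop_reason or not g.suggested_request:
            break
        result = scrape(**g.suggested_request)
    return result


# Stand-in for client.scrape: fails without a browser, succeeds with one.
calls = []


def fake_scrape(url, browser=False):
    calls.append((url, browser))
    if browser:
        guidance = SimpleNamespace(success=True, stop_reason=None, suggested_request=None)
    else:
        guidance = SimpleNamespace(
            success=False,
            stop_reason=None,
            suggested_request={"url": url, "browser": True},
        )
    return SimpleNamespace(guidance=guidance)
```

Capping the attempts keeps a site that always returns a new suggestion from consuming credits indefinitely.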

ScrapeGuidance fields

Field               Type        Description
success             bool        Whether usable content was retrieved
error_type          str         Error category: captcha, ssl_error, timeout, login_wall, blocked
error_provider      str         Protection provider: cloudflare, datadome, recaptcha
credits_charged     int         Credits consumed (0 if refunded)
credits_refunded    bool        Whether credits were returned
next_steps          list[str]   Human-readable suggestions
suggested_request   dict        Params to retry with; pass to client.scrape(**suggested_request)
stop_reason         str         If set, retrying is futile (login wall, geo-blocked)

Credit tracking

After every API call:

print(client.credits_charged)       # 1 or 5 (last call)
print(client.quota_remaining)       # credits left this month
print(client.rate_limit_remaining)  # requests left this minute
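
These counters can drive a simple budget check before a batch job. The sketch below uses the per-call costs stated earlier (1 credit for a simple scrape, 5 for a browser scrape); the helper name affordable_scrapes is hypothetical, and the result is a conservative estimate because refunded calls cost 0.

```python
SIMPLE_COST = 1   # credits per simple scrape, as documented above
BROWSER_COST = 5  # credits per browser scrape, as documented above


def affordable_scrapes(quota_remaining, browser=False):
    """How many more scrapes of one kind the remaining quota covers."""
    cost = BROWSER_COST if browser else SIMPLE_COST
    return max(0, quota_remaining) // cost
```

For example, affordable_scrapes(client.quota_remaining, browser=True) gives the number of browser scrapes left this month, assuming none are refunded.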