Error Handling
Exception hierarchy
ScrapingProsError — base for all SDK errors
├── APIError — HTTP error (has .status_code, .detail)
│ ├── AuthenticationError — 401: invalid token
│ ├── CountryProxyNotApproved — 403: country proxy not approved
│ ├── URLBlockedError — 422: SSRF protection (private URLs)
│ ├── ValidationError — 422: invalid parameters
│ ├── RateLimitError — 429: rate limit (has .retry_after)
│ └── QuotaExceededError — 429: monthly credits exhausted
├── ConnectionError — network error (DNS, connection refused)
└── TimeoutError — polling timeout (run_and_wait)
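To make the tree concrete, here is a minimal sketch of how a hierarchy like this could be declared and inspected in plain Python. The classes below are stand-ins written for illustration, not the SDK's source: the constructor signatures and the `describe` helper are assumptions. The point it shows is that checking for `APIError` also matches every HTTP-level subclass, so more specific checks have to come first.

```python
# Stand-in classes mirroring the documented tree; NOT the SDK's actual source.
class ScrapingProsError(Exception):
    """Base for all SDK errors."""


class APIError(ScrapingProsError):
    """HTTP error carrying status_code and detail (constructor is assumed)."""

    def __init__(self, status_code, detail=""):
        super().__init__(f"{status_code}: {detail}")
        self.status_code = status_code
        self.detail = detail


class AuthenticationError(APIError): ...


class QuotaExceededError(APIError): ...


class RateLimitError(APIError):
    def __init__(self, status_code=429, detail="", retry_after=None):
        super().__init__(status_code, detail)
        self.retry_after = retry_after


def describe(exc):
    """Hypothetical helper: classify from most specific to least specific."""
    if isinstance(exc, RateLimitError):
        return f"rate limited, retry after {exc.retry_after}s"
    if isinstance(exc, APIError):
        return f"HTTP {exc.status_code}: {exc.detail}"
    if isinstance(exc, ScrapingProsError):
        return "non-HTTP SDK error"
    return "not an SDK error"
```

The same ordering rule applies to `except` clauses: list subclasses before `APIError`, and `APIError` before `ScrapingProsError`.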
Catching errors
from scrapingpros import (
    ScrapingPros,
    ScrapingProsError,
    AuthenticationError,
    RateLimitError,
    QuotaExceededError,
    URLBlockedError,
    ConnectionError,  # shadows Python's built-in ConnectionError in this module
)

# Assumes `client` is an already-configured ScrapingPros instance.
try:
    result = client.scrape("https://example.com")
except AuthenticationError:
    # Invalid token: get one at https://scrapingpros.com
    pass
except RateLimitError as e:
    # Rate limited: the SDK auto-retries, so reaching this means all retries were exhausted
    print(f"Retry after {e.retry_after}s")
except QuotaExceededError:
    # Monthly credits exhausted: upgrade plan
    pass
except URLBlockedError:
    # URL blocked by SSRF protection: only public URLs are allowed
    pass
except ConnectionError:
    # Network error: DNS, connection refused, etc.
    pass
except ScrapingProsError:
    # Catch-all for any other SDK error
    pass
Auto-retry behavior
The SDK automatically retries on:
- 429 rate limit: respects the Retry-After header, with exponential backoff and jitter
- 5xx server errors: up to max_retries attempts (default: 3)
The SDK never retries on:
- 429 quota exceeded: raises QuotaExceededError immediately
- 4xx client errors: raises immediately (bad params, auth, etc.)
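The retry rules above can be sketched as two small functions. This is an illustration under stated assumptions, not the SDK's implementation: the exact backoff formula is not documented here, so the sketch uses capped exponential backoff with "full jitter", and the `base`, `cap`, and `quota_exhausted` names are hypothetical.

```python
import random


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random):
    """Seconds to wait before retry number `attempt` (0-based).

    If the server sent Retry-After, honor it. Otherwise pick a random delay
    between 0 and min(cap, base * 2**attempt). Defaults are illustrative.
    """
    if retry_after is not None:
        return float(retry_after)
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling)


def should_retry(status_code, quota_exhausted=False):
    """Retry 429 rate limits and 5xx errors; never quota 429s or other 4xx."""
    if status_code == 429:
        return not quota_exhausted
    return 500 <= status_code < 600
```

Jitter spreads retries from many clients over time, so they do not all hit the API again at the same instant after a rate-limit window ends.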
Response Guidance
Every scrape response includes a guidance object with structured error analysis:
result = client.scrape("https://hard-site.com")
g = result.guidance
# Always present
print(g.success) # True if usable content was retrieved
print(g.credits_charged) # 1 (simple), 5 (browser), or 0 (refunded)
When a scrape fails
if not result.guidance.success:
    print(g.error_type)        # "captcha", "ssl_error", "timeout", "login_wall", "blocked"
    print(g.error_provider)    # "cloudflare", "datadome", "recaptcha"
    print(g.next_steps)        # ["Try with browser=true", "Try retry_on_block=true"]
    print(g.credits_refunded)  # True if credits were returned
    # stop_reason means retrying won't help
    if g.stop_reason:
        print("Give up:", g.stop_reason)
    # suggested_request has ready-to-use params
    elif g.suggested_request:
        result = client.scrape(**g.suggested_request)
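The snippet above retries once. A bounded loop that keeps following suggested_request until it succeeds, hits a stop_reason, or runs out of attempts might look like the sketch below. `scrape_with_guidance`, `max_attempts`, and `FakeClient` are hypothetical names introduced for illustration; the fake client stands in for the real one so the example runs without network access, and it assumes suggested_request carries the URL and a `browser` flag.

```python
from types import SimpleNamespace


def scrape_with_guidance(client, url, max_attempts=3):
    """Scrape, then follow guidance.suggested_request while it makes sense."""
    result = client.scrape(url)
    for _ in range(max_attempts - 1):
        g = result.guidance
        # Stop on success, on a stop_reason, or when there is nothing to try.
        if g.success or g.stop_reason or not g.suggested_request:
            break
        result = client.scrape(**g.suggested_request)
    return result


class FakeClient:
    """Hypothetical stand-in: fails until browser=True is passed."""

    def __init__(self):
        self.calls = 0

    def scrape(self, url, browser=False):
        self.calls += 1
        if browser:
            guidance = SimpleNamespace(
                success=True, stop_reason=None, suggested_request=None
            )
        else:
            guidance = SimpleNamespace(
                success=False,
                stop_reason=None,
                suggested_request={"url": url, "browser": True},
            )
        return SimpleNamespace(guidance=guidance)
```

Capping the attempts matters because each retry can consume credits, and a suggestion is not guaranteed to succeed.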
ScrapeGuidance fields
| Field | Type | Description |
|---|---|---|
| success | bool | Whether usable content was retrieved |
| error_type | str | Error category: captcha, ssl_error, timeout, login_wall, blocked |
| error_provider | str | Protection provider: cloudflare, datadome, recaptcha |
| credits_charged | int | Credits consumed (0 if refunded) |
| credits_refunded | bool | Whether credits were returned |
| next_steps | list[str] | Human-readable suggestions |
| suggested_request | dict | Params to retry with; pass to client.scrape(**suggested_request) |
| stop_reason | str | If set, retrying is futile (login wall, geo-blocked) |
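For type-checked code or tests, the fields in the table can be mirrored in a small dataclass. This is a sketch, not the SDK's own class: treating the failure-only fields as optional with `None` defaults is an assumption, and the `retryable` property is a hypothetical convenience that combines the rules described in the table.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScrapeGuidance:
    """Illustrative mirror of the documented fields (defaults are assumed)."""

    success: bool
    credits_charged: int
    credits_refunded: bool = False
    error_type: Optional[str] = None
    error_provider: Optional[str] = None
    next_steps: list = field(default_factory=list)
    suggested_request: Optional[dict] = None
    stop_reason: Optional[str] = None

    @property
    def retryable(self) -> bool:
        """True only for a failure with no stop_reason and a suggestion to try."""
        return (
            (not self.success)
            and self.stop_reason is None
            and bool(self.suggested_request)
        )
```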
Credit tracking
After every API call:
print(client.credits_charged) # 1 or 5 (last call)
print(client.quota_remaining) # credits left this month
print(client.rate_limit_remaining) # requests left this minute
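Since a call costs 1 or 5 credits depending on whether a browser is used, the remaining quota can be turned into a rough budget before starting a batch. The helper below is a hypothetical sketch: `affordable_scrapes` is not part of the SDK, and it assumes every call in the batch is charged the full cost with no refunds.

```python
SIMPLE_COST = 1   # credits for a simple scrape
BROWSER_COST = 5  # credits for a browser scrape


def affordable_scrapes(quota_remaining, browser=False):
    """How many scrapes of one kind the remaining credits would cover."""
    cost = BROWSER_COST if browser else SIMPLE_COST
    return max(quota_remaining, 0) // cost
```

For example, passing the value of client.quota_remaining with `browser=True` gives a conservative upper bound on how many browser scrapes are left this month.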