For anything over ~100 URLs, use the Batch API. scrape_many() is fine for small lists where you want a simple list-in / list-out contract — but it runs sync requests client-side with threads, which eats your sync rate limit and can't tolerate crashes.
Multiple URLs (scrape_many)
Use scrape_many() whenever you have a list of URLs. It runs requests concurrently and is much faster than a sequential loop.
Basic usage
urls = [
"https://example.com/page/1",
"https://example.com/page/2",
"https://example.com/page/3",
]
results = client.scrape_many(urls, format="markdown")
for url, result in zip(urls, results):
if isinstance(result, Exception):
print(f"FAILED: {url} — {result}")
else:
print(f"OK: {url} — {len(result.content or '')} chars")
scrape_many() never raises. Failed requests return the Exception object in the results list. Always check with isinstance(result, Exception).
Concurrency
By default, concurrency is auto-detected from your plan:
| Plan | Max concurrent |
|---|---|
| Free | 5 |
| Starter | 10 |
| Growth | 20 |
| Pro | 50 |
| Scale | 100 |
Override with max_concurrent:
# Force 10 concurrent requests
results = client.scrape_many(urls, max_concurrent=10, browser=True)
With actions, browser + blocking
All scrape() parameters apply to every URL — including actions:
from scrapingpros import WaitForSelectorAction, EvaluateAction, ClickAction
results = client.scrape_many(
urls,
format="markdown",
browser=True,
actions=[
WaitForSelectorAction(selector="css:.content", time=5000),
ClickAction(selector="css:.load-more", optional=True),
EvaluateAction(script="document.querySelectorAll('.item').length"),
],
headers={"Accept-Language": "en-US"},
block_resources=["image", "font", "media"],
block_requests=["google-analytics", "doubleclick"],
retry_on_block=True,
)
Performance
Benchmark on 50 easy/medium URLs with browser=True:
| Mode | Time | Speedup |
|---|---|---|
| Sequential loop | ~25 min | 1x |
scrape_many(max_concurrent=5) | ~5 min | 5x |
scrape_many(max_concurrent=25) | ~1 min | 25x |
scrape_many(max_concurrent=50) | ~30s | 50x |
Actual speedup depends on your plan's rate limit and the target sites' response times.
Async version
from scrapingpros import AsyncScrapingPros
async with AsyncScrapingPros("your_token") as client:
results = await client.scrape_many(urls, format="markdown", browser=True)
Heterogeneous requests
When each URL needs different parameters (different actions, headers, etc.), use requests=:
from scrapingpros import ScrapeRequest
results = client.scrape_many(requests=[
ScrapeRequest(url="https://site1.com", browser=True, actions=[...]),
ScrapeRequest(url="https://site2.com", format="markdown", extract={"title": "css:h1"}),
{"url": "https://site3.com", "browser": True}, # dicts work too
], max_concurrent=5)
When to use scrape_many vs collections
scrape_many() | Collections | |
|---|---|---|
| Best for | Up to ~500 URLs | 1,000+ URLs |
| Execution | Client-side concurrency | Server-side execution |
| Results | Immediate in memory | Poll or webhook |
| Parameters | Same or different per URL | Different params per URL |
| Error handling | Exception in results list | Failed jobs in run status |
For large batches with per-URL parameters, see Batch Processing.