Asynchronous Endpoints
The asynchronous endpoints let you group multiple requests into a collection and execute them in the background. They are best suited to scraping large volumes of URLs.
POST /v1/async/collections
Creates a new request collection.
Request
curl -X POST \
  'https://api.scrapingpros.com/v1/async/collections' \
  -H 'Authorization: Bearer <API-KEY>' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "My collection",
    "requests": [
      {
        "url": "https://example.com",
        "browser": true
      },
      {
        "url": "https://example.org",
        "use_proxy": "any"
      }
    ]
  }'
Body
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Name of the collection |
| requests | array | No | List of requests. Same format as the /v1/sync/scrape body |
Response (201)
{
  "id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9",
  "name": "My collection",
  "message": "Collection created successfully."
}
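When the URL list comes from a file or a database, the request body can be assembled in code rather than by hand. The following is a minimal sketch, not an official client: `build_collection_payload` and `create_collection` are hypothetical helper names, and the sketch assumes every URL shares the same `browser` setting. It uses only fields shown in the request above (`name`, `requests`, `url`, `browser`) and the Python standard library.

```python
import json
import urllib.request

BASE_URL = "https://api.scrapingpros.com"


def build_collection_payload(name, urls, browser=False):
    """Build the JSON body for POST /v1/async/collections (hypothetical helper)."""
    requests_list = []
    for url in urls:
        item = {"url": url}
        if browser:
            item["browser"] = True
        requests_list.append(item)
    return {"name": name, "requests": requests_list}


def create_collection(payload, api_key):
    """Send the payload and return the parsed JSON response (makes a network call)."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/async/collections",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Build the body locally; nothing is sent until create_collection() is called.
payload = build_collection_payload(
    "My collection", ["https://example.com", "https://example.org"], browser=True
)
print(json.dumps(payload, indent=2))
```

Calling `create_collection(payload, "<API-KEY>")` would then submit the body; the `id` in the response is the value the other collection endpoints expect.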
GET /v1/async/collections
Lists all created collections.
Request
curl 'https://api.scrapingpros.com/v1/async/collections' \
-H 'Authorization: Bearer <API-KEY>'
Response (200)
[
  {
    "name": "My collection",
    "id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9"
  },
  {
    "name": "Another collection",
    "id": "11d6f8af-9a54-4b6c-b793-e12b77c86159"
  }
]
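Because the other endpoints take a collection ID rather than a name, a common follow-up is to look up the ID from this list. The sketch below is illustrative only: `find_collection_ids` is a hypothetical helper, and it returns a list because this reference does not say whether collection names are unique.

```python
def find_collection_ids(collections, name):
    """Return the IDs of every collection whose name matches exactly."""
    return [c["id"] for c in collections if c.get("name") == name]


# Sample data copied from the response above
collections = [
    {"name": "My collection", "id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9"},
    {"name": "Another collection", "id": "11d6f8af-9a54-4b6c-b793-e12b77c86159"},
]
ids = find_collection_ids(collections, "Another collection")
print(ids)
```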
GET /v1/async/collections/{collection_id}
Gets a specific collection by its ID.
Request
curl 'https://api.scrapingpros.com/v1/async/collections/c38b0bcf-cb7c-4728-8704-2c2e267dcff9' \
-H 'Authorization: Bearer <API-KEY>'
Response (200)
{
  "id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9",
  "name": "My collection"
}
PUT /v1/async/collections/{collection_id}
Updates an existing collection. Both the name and the request list can be modified. If a new request list is sent, it replaces the previous one.
Request
curl -X PUT \
  'https://api.scrapingpros.com/v1/async/collections/c38b0bcf-cb7c-4728-8704-2c2e267dcff9' \
  -H 'Authorization: Bearer <API-KEY>' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "Updated collection",
    "requests": [
      {
        "url": "https://new-example.com",
        "browser": true
      }
    ]
  }'
Response (200)
{
  "id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9",
  "name": "Updated collection",
  "message": "Collection updated successfully."
}
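Since a new request list replaces the previous one, adding a single URL means sending the complete list again. The sketch below shows one way to prepare that body. It assumes you keep your own copy of the current request list, because the GET response shown earlier returns only `id` and `name`. `build_update_payload` is a hypothetical helper, and deduplicating by URL is a design choice of this example, not documented API behavior.

```python
def build_update_payload(name, current_requests, new_requests):
    """Merge two request lists, keeping the latest entry for each URL."""
    merged = {}
    for item in list(current_requests) + list(new_requests):
        merged[item["url"]] = item  # a later entry overrides an earlier one
    return {"name": name, "requests": list(merged.values())}


# Locally stored copy of what the collection currently contains
current = [
    {"url": "https://example.com", "browser": True},
    {"url": "https://example.org", "use_proxy": "any"},
]
# One new URL plus changed options for an existing one
additions = [
    {"url": "https://new-example.com", "browser": True},
    {"url": "https://example.org", "browser": True},
]
body = build_update_payload("Updated collection", current, additions)
print([r["url"] for r in body["requests"]])
```

The resulting `body` is what would be sent as the PUT request body.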
POST /v1/async/collections/{collection_id}/run
Executes all requests in a collection asynchronously. A collection can be executed multiple times.
Request
curl -X POST \
  'https://api.scrapingpros.com/v1/async/collections/c38b0bcf-cb7c-4728-8704-2c2e267dcff9/run' \
  -H 'Authorization: Bearer <API-KEY>'
No body required.
Response (201)
{
  "run_id": "9b64941a-4545-4c57-9174-c70e781d9192",
  "status": "in_progress",
  "total_requests": 2,
  "success_requests": 0,
  "failed_requests": 0,
  "timeout_requests": 0,
  "collection_id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9"
}
GET /v1/async/collections/{collection_id}/runs/{run_id}
Returns the status and result of an execution. Poll this endpoint periodically until status is completed.
Request
curl 'https://api.scrapingpros.com/v1/async/collections/c38b0bcf-cb7c-4728-8704-2c2e267dcff9/runs/9b64941a-4545-4c57-9174-c70e781d9192' \
-H 'Authorization: Bearer <API-KEY>'
Response -- in progress (200)
{
  "run_id": "9b64941a-4545-4c57-9174-c70e781d9192",
  "status": "in_progress",
  "total_requests": 2,
  "success_requests": 1,
  "failed_requests": 0,
  "timeout_requests": 0,
  "collection_id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9"
}
Response -- completed without errors (200)
{
  "run_id": "9b64941a-4545-4c57-9174-c70e781d9192",
  "status": "completed",
  "total_requests": 2,
  "success_requests": 2,
  "failed_requests": 0,
  "timeout_requests": 0,
  "collection_id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9",
  "failed_jobs": []
}
Response -- completed with errors (200)
{
  "run_id": "9b64941a-4545-4c57-9174-c70e781d9192",
  "status": "completed",
  "total_requests": 3,
  "success_requests": 2,
  "failed_requests": 1,
  "timeout_requests": 0,
  "collection_id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9",
  "failed_jobs": [
    {
      "job_id": "e3a1b2c4-...",
      "url": "https://example.com/page-that-failed",
      "status": "failed",
      "error": "Connection timeout"
    }
  ]
}
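A response like the one above can be turned into a retry collection containing only the URLs that failed. This is a sketch under stated assumptions: `build_retry_payload` is a hypothetical helper, and because `failed_jobs` entries as shown carry only the URL and the error, any per-request options (such as `browser` or `use_proxy`) would have to be re-attached from your own records.

```python
def build_retry_payload(run_status, name):
    """Build a collection body containing only the URLs that failed."""
    failed = run_status.get("failed_jobs", [])
    # dict.fromkeys removes duplicate URLs while preserving their order
    urls = list(dict.fromkeys(job["url"] for job in failed))
    return {"name": name, "requests": [{"url": u} for u in urls]}


# Trimmed copy of the "completed with errors" response above
run_status = {
    "status": "completed",
    "failed_jobs": [
        {
            "url": "https://example.com/page-that-failed",
            "status": "failed",
            "error": "Connection timeout",
        }
    ],
}
retry_body = build_retry_payload(run_status, "My collection (retry)")
print(retry_body)
```

The resulting body has the same shape as the one accepted by POST /v1/async/collections.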
Response Fields
| Field | Type | Description |
|---|---|---|
| run_id | string (UUID) | Unique identifier of the execution |
| status | string | Run status: in_progress or completed |
| total_requests | integer | Total requests in the collection |
| success_requests | integer | Successfully completed requests |
| failed_requests | integer | Failed requests |
| timeout_requests | integer | Requests that timed out |
| collection_id | string (UUID) | ID of the executed collection |
| failed_jobs | array | List of failed or timed-out jobs, with their URL and error reason |
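The counters above can be combined into a short progress summary. The sketch below assumes that `success_requests`, `failed_requests`, and `timeout_requests` count disjoint sets of requests and add up to `total_requests` once the run is completed. The sample responses are consistent with that, but this reference does not state it explicitly. `summarize_run` is a hypothetical helper.

```python
def summarize_run(run_status):
    """Derive progress figures from a run status response."""
    total = run_status["total_requests"]
    finished = (
        run_status["success_requests"]
        + run_status["failed_requests"]
        + run_status["timeout_requests"]
    )
    return {
        "done": run_status["status"] == "completed",
        "pending": max(total - finished, 0),  # assumes the counters are disjoint
        "success_rate": run_status["success_requests"] / total if total else 0.0,
    }


# Values taken from the "in progress" sample response
in_progress = {
    "status": "in_progress",
    "total_requests": 2,
    "success_requests": 1,
    "failed_requests": 0,
    "timeout_requests": 0,
}
summary = summarize_run(in_progress)
print(summary)
```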
Where are the scraping results?
The run endpoint returns the execution status and details of failed jobs, but does not return the HTML of successful requests. Individual results for each URL are not accessible once the run is finished.
Use asynchronous mode when you need to confirm that a set of URLs was processed, and synchronous mode (/v1/sync/scrape) when you need the resulting HTML of each request.
Example: polling until completion
import time

import requests

BASE_URL = "https://api.scrapingpros.com"
HEADERS = {"Authorization": "Bearer <API-KEY>"}
COLLECTION_ID = "c38b0bcf-cb7c-4728-8704-2c2e267dcff9"

# Start the run
response = requests.post(
    f"{BASE_URL}/v1/async/collections/{COLLECTION_ID}/run",
    headers=HEADERS,
)
response.raise_for_status()  # stop early on authentication or other HTTP errors
run_id = response.json()["run_id"]

# Poll every 5 seconds until the run is completed
while True:
    response = requests.get(
        f"{BASE_URL}/v1/async/collections/{COLLECTION_ID}/runs/{run_id}",
        headers=HEADERS,
    )
    response.raise_for_status()
    status = response.json()
    print(f"Status: {status['status']} ({status['success_requests']}/{status['total_requests']} successful)")
    if status["status"] == "completed":
        break
    time.sleep(5)

# Report failed jobs; .get() guards against a response that omits the key
failed_jobs = status.get("failed_jobs", [])
if failed_jobs:
    print("Failed jobs:")
    for job in failed_jobs:
        print(f"  - {job['url']}: {job['error']}")
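The loop above polls indefinitely, which may be undesirable in unattended jobs. The variant below is a sketch that adds an overall deadline. `wait_for_run` and its parameters are hypothetical names, and the 600-second default is an arbitrary choice rather than a documented limit. The status fetcher is passed in as a function so the loop can be exercised without network access; here it is fed canned responses.

```python
import time


def wait_for_run(fetch_status, interval=5.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the run completes or the deadline would pass."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status["status"] == "completed":
            return status
        # Give up if waiting one more interval would overshoot the deadline
        if clock() + interval > deadline:
            raise TimeoutError(f"run did not complete within {timeout} seconds")
        sleep(interval)


# Demonstration with canned responses instead of real HTTP calls
responses = iter([
    {"status": "in_progress", "success_requests": 0, "total_requests": 2},
    {"status": "in_progress", "success_requests": 1, "total_requests": 2},
    {"status": "completed", "success_requests": 2, "total_requests": 2, "failed_jobs": []},
])
final = wait_for_run(lambda: next(responses), sleep=lambda seconds: None)
print(final["status"], final["success_requests"])
```

In real use, `fetch_status` would wrap the GET request to the runs endpoint shown in the example above.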