Asynchronous Endpoints

The asynchronous endpoints let you group multiple requests into a collection and execute them in the background, which makes them ideal for scraping large volumes of URLs.


POST /v1/async/collections

Creates a new request collection.

Request

curl -X POST \
  'https://api.scrapingpros.com/v1/async/collections' \
  -H 'Authorization: Bearer <API-KEY>' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "My collection",
    "requests": [
      {
        "url": "https://example.com",
        "browser": true
      },
      {
        "url": "https://example.org",
        "use_proxy": "any"
      }
    ]
  }'

Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Name of the collection |
| requests | array | No | List of requests. Same format as the /v1/sync/scrape body |

Response (201)

{
  "id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9",
  "name": "My collection",
  "message": "Collection created successfully."
}
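A minimal Python equivalent of the curl call above, using the `requests` library. The helper names and the `browser` default are illustrative, not part of the API:

```python
import requests

BASE_URL = "https://api.scrapingpros.com"

def build_collection_payload(name, urls, browser=False):
    # One request entry per URL; "browser" mirrors the per-request
    # option shown in the curl example above.
    return {
        "name": name,
        "requests": [{"url": u, "browser": browser} for u in urls],
    }

def create_collection(api_key, name, urls):
    # POST /v1/async/collections returns {"id", "name", "message"} on 201.
    resp = requests.post(
        f"{BASE_URL}/v1/async/collections",
        headers={"Authorization": f"Bearer {api_key}"},
        json=build_collection_payload(name, urls),
    )
    resp.raise_for_status()
    return resp.json()
```

Save the `id` from the response; every other endpoint in this section takes it as the `{collection_id}` path parameter.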

GET /v1/async/collections

Lists all created collections.

Request

curl 'https://api.scrapingpros.com/v1/async/collections' \
  -H 'Authorization: Bearer <API-KEY>'

Response (200)

[
  {
    "name": "My collection",
    "id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9"
  },
  {
    "name": "Another collection",
    "id": "11d6f8af-9a54-4b6c-b793-e12b77c86159"
  }
]
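Since the list endpoint returns only names and IDs, a small helper can resolve a collection's ID from its name. `pick_by_name` and `find_collection_id` are illustrative helpers, not part of the API:

```python
import requests

BASE_URL = "https://api.scrapingpros.com"

def pick_by_name(collections, name):
    # Return the id of the first collection with a matching name, or None.
    return next((c["id"] for c in collections if c["name"] == name), None)

def find_collection_id(api_key, name):
    resp = requests.get(
        f"{BASE_URL}/v1/async/collections",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    resp.raise_for_status()
    return pick_by_name(resp.json(), name)
```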

GET /v1/async/collections/{collection_id}

Gets a specific collection by its ID.

Request

curl 'https://api.scrapingpros.com/v1/async/collections/c38b0bcf-cb7c-4728-8704-2c2e267dcff9' \
  -H 'Authorization: Bearer <API-KEY>'

Response (200)

{
  "id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9",
  "name": "My collection"
}

PUT /v1/async/collections/{collection_id}

Updates an existing collection. Both the name and the request list can be modified. If a new request list is sent, it replaces the previous one.

Request

curl -X PUT \
  'https://api.scrapingpros.com/v1/async/collections/c38b0bcf-cb7c-4728-8704-2c2e267dcff9' \
  -H 'Authorization: Bearer <API-KEY>' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "Updated collection",
    "requests": [
      {
        "url": "https://new-example.com",
        "browser": true
      }
    ]
  }'

Response (200)

{
  "id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9",
  "name": "Updated collection",
  "message": "Collection updated successfully."
}
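Because PUT replaces the request list outright, and GET on a collection returns only its id and name, clients should keep the current request list locally and resend it in full when adding an entry. A sketch with illustrative helper names:

```python
import requests

BASE_URL = "https://api.scrapingpros.com"

def with_added_request(local_requests, new_request):
    # PUT replaces the whole list, so append locally and resend everything.
    return local_requests + [new_request]

def update_collection(api_key, collection_id, name, request_list):
    resp = requests.put(
        f"{BASE_URL}/v1/async/collections/{collection_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        json={"name": name, "requests": request_list},
    )
    resp.raise_for_status()
    return resp.json()
```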

POST /v1/async/collections/{collection_id}/run

Executes all requests in a collection asynchronously. A collection can be executed multiple times.

Request

curl -X POST \
  'https://api.scrapingpros.com/v1/async/collections/c38b0bcf-cb7c-4728-8704-2c2e267dcff9/run' \
  -H 'Authorization: Bearer <API-KEY>'

No body required.

Response (201)

{
  "run_id": "9b64941a-4545-4c57-9174-c70e781d9192",
  "status": "in_progress",
  "total_requests": 2,
  "success_requests": 0,
  "failed_requests": 0,
  "timeout_requests": 0,
  "collection_id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9"
}

GET /v1/async/collections/{collection_id}/runs/{run_id}

Queries the status and result of an execution. Call this endpoint periodically until the status is completed.

Request

curl 'https://api.scrapingpros.com/v1/async/collections/c38b0bcf-cb7c-4728-8704-2c2e267dcff9/runs/9b64941a-4545-4c57-9174-c70e781d9192' \
  -H 'Authorization: Bearer <API-KEY>'

Response -- in progress (200)

{
  "run_id": "9b64941a-4545-4c57-9174-c70e781d9192",
  "status": "in_progress",
  "total_requests": 2,
  "success_requests": 1,
  "failed_requests": 0,
  "timeout_requests": 0,
  "collection_id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9"
}

Response -- completed without errors (200)

{
  "run_id": "9b64941a-4545-4c57-9174-c70e781d9192",
  "status": "completed",
  "total_requests": 2,
  "success_requests": 2,
  "failed_requests": 0,
  "timeout_requests": 0,
  "collection_id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9",
  "failed_jobs": []
}

Response -- completed with errors (200)

{
  "run_id": "9b64941a-4545-4c57-9174-c70e781d9192",
  "status": "completed",
  "total_requests": 3,
  "success_requests": 2,
  "failed_requests": 1,
  "timeout_requests": 0,
  "collection_id": "c38b0bcf-cb7c-4728-8704-2c2e267dcff9",
  "failed_jobs": [
    {
      "job_id": "e3a1b2c4-...",
      "url": "https://example.com/page-that-failed",
      "status": "failed",
      "error": "Connection timeout"
    }
  ]
}

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| run_id | string (UUID) | Unique identifier of the execution |
| status | string | Status: in_progress or completed |
| total_requests | integer | Total requests in the collection |
| success_requests | integer | Successfully completed requests |
| failed_requests | integer | Failed requests |
| timeout_requests | integer | Requests that timed out |
| collection_id | string (UUID) | ID of the executed collection |
| failed_jobs | array | List of failed or timed-out jobs, with their URL and error reason |

Where are the scraping results?

The run endpoint returns the execution status and the details of failed jobs, but not the HTML of successful requests; individual per-URL results cannot be retrieved after the run finishes.

Use asynchronous mode when you need to confirm that a set of URLs was processed, and synchronous mode (/v1/sync/scrape) when you need the resulting HTML of each request.

Example: polling until completion

import time

import requests

BASE_URL = "https://api.scrapingpros.com"
HEADERS = {"Authorization": "Bearer <API-KEY>"}
COLLECTION_ID = "c38b0bcf-cb7c-4728-8704-2c2e267dcff9"

# Start the run
run = requests.post(
    f"{BASE_URL}/v1/async/collections/{COLLECTION_ID}/run",
    headers=HEADERS,
).json()

run_id = run["run_id"]

# Poll until completed
while True:
    status = requests.get(
        f"{BASE_URL}/v1/async/collections/{COLLECTION_ID}/runs/{run_id}",
        headers=HEADERS,
    ).json()

    print(
        f"Status: {status['status']} -- "
        f"{status['success_requests']}/{status['total_requests']} successful"
    )

    if status["status"] == "completed":
        break

    time.sleep(5)

# Check failed jobs
if status["failed_jobs"]:
    print("Failed jobs:")
    for job in status["failed_jobs"]:
        print(f"  - {job['url']}: {job['error']}")
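Since a run does not expose the HTML of successful requests, one pattern is to re-scrape any failed or timed-out URLs through the synchronous endpoint, which does return the HTML. A sketch, assuming /v1/sync/scrape is called with POST and accepts the same per-request format noted above; the helper name is illustrative:

```python
import requests

BASE_URL = "https://api.scrapingpros.com"
HEADERS = {"Authorization": "Bearer <API-KEY>"}

def retry_failed_jobs(status):
    # "failed_jobs" comes from the completed run response; each entry
    # carries the URL and error reason. Returns {url: Response}.
    responses = {}
    for job in status.get("failed_jobs", []):
        resp = requests.post(
            f"{BASE_URL}/v1/sync/scrape",
            headers=HEADERS,
            json={"url": job["url"]},
        )
        responses[job["url"]] = resp
    return responses
```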