JavaScript Execution (EvaluateAction)
Run arbitrary JavaScript in the browser page context. Use this to call internal APIs, read JS variables, or extract data that isn't in the DOM.
Basic usage
from scrapingpros import EvaluateAction, WaitForSelectorAction
result = client.scrape(
"https://example.com",
browser=True,
actions=[
WaitForSelectorAction(selector="css:body", time=5000),
EvaluateAction(script="document.title"),
],
)
title = result.evaluate_results[0]
Results are returned in evaluate_results — one item per EvaluateAction, in order.
Call internal APIs
Many sites load data from internal API endpoints. Use EvaluateAction to call them directly:
result = client.scrape(
"https://example.com/products",
browser=True,
actions=[
WaitForSelectorAction(selector="css:.product-list", time=5000),
EvaluateAction(script="fetch('/api/products').then(r => r.json())"),
],
)
products = result.evaluate_results[0] # parsed JSON
The fetch() runs inside the page context with the page's cookies and session. No authentication needed — it uses the same session as the browser.
Read JavaScript variables
Frameworks like Next.js, Nuxt, and others often embed page data in JS variables:
result = client.scrape(
"https://nextjs-app.com/product/123",
browser=True,
actions=[
EvaluateAction(script="window.__NEXT_DATA__"),
],
)
page_data = result.evaluate_results[0]
Multi-line scripts
result = client.scrape(
"https://example.com",
browser=True,
actions=[
EvaluateAction(script="""
const items = document.querySelectorAll('.item');
Array.from(items).map(el => ({
name: el.textContent.trim(),
href: el.querySelector('a')?.href
}))
"""),
],
)
items = result.evaluate_results[0]
Multiple evaluations
You can chain multiple EvaluateAction steps:
result = client.scrape(
"https://example.com",
browser=True,
actions=[
EvaluateAction(script="window.__NEXT_DATA__"),
EvaluateAction(script="document.title"),
EvaluateAction(script="fetch('/api/stats').then(r => r.json())"),
],
)
next_data = result.evaluate_results[0]
title = result.evaluate_results[1]
stats = result.evaluate_results[2]
Timeout
Default timeout is 30 seconds. Override for slow API calls:
EvaluateAction(
script="fetch('/api/heavy-report').then(r => r.json())",
timeout=60000, # 60 seconds
)
Browser automation actions
EvaluateAction is one of several browser actions. The full list:
| Action | Purpose |
|---|---|
ClickAction(selector) | Click an element |
InputAction(selector, text) | Type text into a form field |
SelectAction(selector, value) | Select a dropdown option |
KeyPressAction(key) | Press a keyboard key |
WaitForSelectorAction(selector, time, state=None) | Wait for an element to appear (state="visible" server default; pass "attached" for hidden <script> / non-rendered nodes, "hidden" to wait for disappearance) |
WaitForTimeoutAction(time) | Wait a fixed number of milliseconds |
EvaluateAction(script) | Execute JavaScript |
CollectAction(extract) | Extract data during a loop |
WhileControl(condition, actions) | Loop while a condition is met |
Wait for hidden DOM nodes (<script> tags, etc.)
The default wait state is "visible" — the server only considers the selector matched when the element is rendered (non-zero size, display not none). For <script> tags carrying embedded JSON like __NEXT_DATA__, that wait will time out (script tags are never visible). Pass state="attached":
from scrapingpros import WaitForSelectorAction
result = client.scrape(
"https://example.com/product/123",
browser=True,
actions=[
WaitForSelectorAction(
selector="css:script#__NEXT_DATA__",
time=8000,
state="attached", # match as soon as the node exists in the DOM
),
],
)
Available since v0.5.0. Accepted values: "visible", "attached", "hidden". Leave state=None (default) for visible.
Capture response bodies (auth tokens, GraphQL payloads)
network_capture records request metadata (URL, method, status) for every browser request. To also include the response body for selected URLs, set url_pattern to a glob. Useful for grabbing OAuth/Firebase tokens or GraphQL responses without re-running the request yourself:
from scrapingpros import NetworkCaptureConfig
result = client.scrape(
"https://app.example.com/dashboard",
browser=True,
network_capture=NetworkCaptureConfig(
resource_types=["xhr", "fetch"],
url_pattern="*identitytoolkit.googleapis.com*",
),
)
for entry in result.network_requests or []:
if "body" in entry: # only present when url_pattern matched
token = parse_token(entry["body"])
Bodies are capped at 64 KB server-side; larger responses are flagged with body_truncated: true. If the body cannot be captured (e.g. body fetch took more than 5 s, or the page navigated away), the entry gets a body_error field instead of body — the scrape itself never hangs. Single pattern only; for multiple endpoints, use a broader glob (*api*) and filter client-side.
Available since v0.5.0.
Example: login then scrape
from scrapingpros import ClickAction, InputAction, WaitForSelectorAction
result = client.scrape(
"https://example.com/login",
browser=True,
actions=[
InputAction(selector="css:#email", text="user@example.com"),
InputAction(selector="css:#password", text="secret"),
ClickAction(selector="css:button[type=submit]", wait_for_navigation=True),
WaitForSelectorAction(selector="css:.dashboard", time=5000),
],
)