Functions marked with ⭐ are proprietary to Bright Data Scraper Studio and are not part of any upstream library. Functions listed under Browser-only functions throw an error when called from a Code worker.
How is Scraper Studio code organized?
A Bright Data Scraper Studio scraper uses two code types:
| Code type | Role | Language and libraries |
|---|---|---|
| Interaction code | Navigates the target site: URL requests, clicks, scrolls, waits, and background traffic capture | JavaScript + Bright Data browser commands |
| Parser code | Extracts and structures data from the HTML returned by interaction code | JavaScript + Cheerio ($) |
Interaction code ties the two together with parse() (which runs the parser) and collect() (which appends a record to the final dataset).
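A minimal sketch of how the two code types fit together, assuming a single-stage scraper whose input has a url field. The stand-in functions at the top are hypothetical and exist only so the sketch runs outside Scraper Studio; inside the IDE, navigate(), parse(), and collect() are provided by the platform, and the parser logic lives in separate parser code. The record fields (title, price) are illustrative.

```javascript
// --- Local stand-ins (hypothetical; delete inside Scraper Studio) ---
const dataset = [];
const input = { url: 'https://example.com/product/1' };
const navigate = (url) => { /* the platform would load the page here */ };
// Stand-in for the parser code: returns one structured record.
const parse = () => ({ title: 'Example product', price: 19.99 });
const collect = (record) => { dataset.push(record); };

// --- Interaction code ---
navigate(input.url);   // load the target page
const data = parse();  // run the parser code against the captured HTML
collect(data);         // append the record to the final dataset
```

The interaction code stays short on purpose: it decides where to go and when to parse, while the field extraction itself belongs in the parser.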
Interaction functions
Interaction functions run in the scraper’s main JavaScript context and drive the browser or HTTP client. Use them to navigate, wait for elements, interact with the page, capture network traffic, and hand off data to the parser.
Global objects
| Name | Type | Description |
|---|---|---|
| input | object | Input for the current stage, set by the trigger or by a previous next_stage()/run_stage()/rerun_stage() call. |
| job | object | Metadata about the current job (for example job.created, the job start timestamp). |
| location | object | Info about the current browser location. Field: href. |
| parser | object | Values captured by tag_response, tag_script, and related tagging functions, available after wait_for_parser_value(). |
Navigation
navigate — Load a URL in the browser
Navigates the browser to a URL. A 404 status throws a dead_page error by default; override with allow_status.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string or URL | Yes | — | Target URL |
| opt.wait_until | string | No | load | load, domcontentloaded, networkidle0, or networkidle2 |
| opt.timeout | number | No | 30000 | Navigation timeout in milliseconds |
| opt.referer | string | No | — | Referer header to send |
| opt.allow_status | number[] | No | [] | HTTP status codes to accept without throwing |
| opt.fingerprint | object | No | — | Override browser fingerprint (screen.width, screen.height) |
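A short sketch of the options above, assuming the input carries a url field and that the scraper wants to handle a 404 itself instead of letting navigate() throw dead_page. The stand-ins are hypothetical and only record calls so the sketch runs outside Scraper Studio; the option names come from the parameter table.

```javascript
// --- Local stand-ins (hypothetical; delete inside Scraper Studio) ---
const calls = [];
const input = { url: 'https://example.com/item/42' };
const navigate = (url, opt = {}) => { calls.push({ url, opt }); };
const status_code = () => 404; // pretend the last page load returned 404

// --- Interaction code ---
navigate(input.url, {
  wait_until: 'domcontentloaded', // do not wait for the full load event
  timeout: 45000,                 // navigation timeout in milliseconds
  allow_status: [404],            // accept 404 instead of throwing dead_page
});
// With 404 allowed, the code can branch on the status itself.
const missing = status_code() === 404;
```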
request — Make a direct HTTP request
Sends an HTTP request without using a browser. Use on Code worker, or on Browser worker when you want to bypass the browser.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| url \| options | string or object | Yes | URL string, or an object with url, method, headers, body |
next_stage — Queue input for the next stage
Runs the next stage of the scraper in a new browser session with the given input.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| input | object | Yes | Input object passed to the next stage |
run_stage — Run a specific stage
Runs a specific stage of the scraper, identified by its index, in a new browser session.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| stage | number | Yes | Stage index (starts at 1) |
| input | object | Yes | Input object passed to that stage |
rerun_stage — Re-run the current stage with new input
Runs this stage again with a new input. Use it to fan out work (for example, one re-run per page in a pagination).
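The pagination fan-out mentioned above might look like the sketch below. It assumes rerun_stage() takes the new input object as its only argument, and it uses a hypothetical page field plus a hard-coded total_pages that a real scraper would read from the page. The stand-in only records the queued inputs so the sketch runs outside Scraper Studio.

```javascript
// --- Local stand-ins (hypothetical; delete inside Scraper Studio) ---
const queued = [];
const input = { url: 'https://example.com/search?q=shoes' };
const rerun_stage = (next_input) => { queued.push(next_input); };

// --- Interaction code ---
const total_pages = 3; // hypothetical; normally read from the pagination widget

if (input.page === undefined) {
  // First run: queue one re-run per page, carrying the original input along.
  for (let page = 1; page <= total_pages; page++)
    rerun_stage({ ...input, page });
} else {
  // A re-run: scrape only input.page here (navigate, parse, collect).
}
```

The guard on input.page matters: without it, every re-run would queue another full set of pages.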
load_sitemap — Read URLs from an XML sitemap
Loads a sitemap XML file and returns the URL list. Supports sitemap indexes and gzip-compressed sitemaps.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| options.url | string | Yes | Sitemap URL |
resolve_url — Follow a URL through redirects
Returns the final URL that the given URL argument leads to.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string or URL | Yes | URL to resolve |
redirect_history — Get the redirect chain
Returns the history of URL redirects since the last navigate() call.
response_headers — Read the last response headers
Returns the response headers from the last page load.
status_code — Read the last response status
Returns the HTTP status code of the last page load.
Waiting on the page ⭐
All wait functions are Browser worker only.
⭐ wait — Wait for an element to appear
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| selector | string | Yes | — | CSS selector to wait for |
| opt.timeout | number | No | 30000 | Timeout in milliseconds |
| opt.hidden | boolean | No | false | Wait for the element to be hidden instead of visible |
| opt.inside | string | No | — | Selector of an iframe to look inside |
⭐ wait_any — Wait for any of several conditions
Waits for any matching condition to succeed. Returns when the first selector resolves.
⭐ wait_visible — Wait for an element to be visible
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| selector | string | Yes | — | CSS selector |
| opt.timeout | number | No | 30000 | Timeout in milliseconds |
⭐ wait_hidden — Wait for an element to disappear
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| selector | string | Yes | — | CSS selector |
| opt.timeout | number | No | 30000 | Timeout in milliseconds |
⭐ wait_for_text — Wait for text content
Waits for an element on the page to contain the given text.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| selector | string | Yes | CSS selector |
| text | string | Yes | Text to wait for |
wait_for_parser_value — Wait for a parser field to be populated
Use after tag_response() or tag_script() to wait until the captured data is available.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| field | string | Yes | Parser field path to wait on |
| validate_fn | function | No | Optional callback returning true when the value is valid |
| opt.timeout | number | No | Timeout in milliseconds |
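A sketch of the tag-then-wait pattern, under two assumptions: that validate_fn receives the captured value, and that the captured response is a JSON object with an items array (a hypothetical shape chosen for illustration). The stand-ins simulate a page that calls a matching API URL in the background; inside Scraper Studio the platform provides tag_response(), navigate(), wait_for_parser_value(), and the parser object.

```javascript
// --- Local stand-ins (hypothetical; delete inside Scraper Studio) ---
const parser = {};
const tagged = {};
const input = { url: 'https://example.com/catalog' };
const tag_response = (field, pattern) => { tagged[field] = pattern; };
const navigate = (url) => {
  // Pretend the page requested a matching API URL in the background.
  const api_url = 'https://example.com/api/products?page=1';
  for (const [field, pattern] of Object.entries(tagged))
    if (pattern.test(api_url)) parser[field] = { items: [{ id: 1 }, { id: 2 }] };
};
const wait_for_parser_value = (field, validate_fn) => {
  const value = parser[field];
  if (value === undefined || (validate_fn && !validate_fn(value)))
    throw new Error(`parser.${field} was not populated`);
};

// --- Interaction code ---
tag_response('products', /\/api\/products/); // register the tag before navigating
navigate(input.url);
// Wait until the field holds a non-empty items array.
wait_for_parser_value('products', (v) => Array.isArray(v.items) && v.items.length > 0);
const ids = parser.products.items.map((item) => item.id);
```

Registering the tag before navigate() is the important ordering here, since a request that fires before the tag exists cannot be matched.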
⭐ wait_network_idle — Wait until the browser network settles
Waits until the browser network has been idle for a given period.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| opt.timeout | number | No | 500 | Milliseconds of idleness required |
| opt.ignore | array | No | [] | Patterns (string or RegExp) for requests to exclude |
⭐ wait_page_idle — Wait until DOM mutations stop
Waits until no changes are made to the DOM tree for a given period.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| opt.idle_timeout | number | No | Milliseconds of stability required |
| opt.ignore | array | No | Selectors to exclude from mutation monitoring |
Element interaction ⭐
All interaction functions require Browser worker.
⭐ click — Click an element
Clicks an element, waiting for it to appear first.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| selector | string or array | Yes | CSS selector or Shadow DOM selector path |
| opt.coordinates | {x, y} | No | Click the closest match to given page coordinates |
⭐ right_click — Right-click an element
Same as click but uses the right mouse button.
⭐ hover — Hover over an element
Moves the cursor over an element, waiting for it to appear first.
⭐ mouse_to — Move the cursor to a coordinate
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| x | number | Yes | Target X position |
| y | number | Yes | Target Y position |
⭐ type — Enter text into an input
Waits for the input to appear, then types the given text.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| selector | string | Yes | CSS selector |
| text | string or array | Yes | Text to type, or an array of strings and special keys |
| opt.replace | boolean | No | Clear existing text before typing |
⭐ press_key — Press a special key
Types special keys like Enter or Backspace in the currently focused input.
⭐ select — Pick a value from a select element
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| selector | string | Yes | CSS selector of a <select> element |
| value | string | Yes | Option value or visible text |
⭐ scroll_to — Scroll an element into view
Scrolls the page so a target element is visible. Defaults to natural scrolling; pass immediate: true to jump.
⭐ scroll_to_all — Scroll through every matching element
⭐ load_more — Trigger lazy-loaded content
Scrolls to the bottom of a list to trigger infinite-scroll loading.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| selector | string | Yes | Container element holding the lazy-loaded items |
| opt.children | string | No | Selector for the individual items |
| opt.trigger_selector | string | No | Selector for an explicit “load more” button |
| opt.timeout | number | No | Timeout in milliseconds |
⭐ close_popup — Auto-close popups in the background
Registers a background watcher that closes a popup whenever it appears. See Best practices for the recommended pattern.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| popup_selector | string | Yes | Selector for the popup container |
| close_selector | string | Yes | Selector for the element that closes it |
| opt.click_inside | string | No | Parent iframe selector, if the close button is inside an iframe |
⭐ solve_captcha — Solve CAPTCHAs on the page
⭐ bounding_box — Get an element’s page coordinates
Returns the page-relative bounding box of the first matched element.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| selector | string | Yes | CSS selector |
el_exists — Check if an element is on the page
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| selector | string | Yes | — | CSS selector |
| timeout | number | No | 0 | Wait up to N milliseconds for the element |
el_is_visible — Check if an element is visible
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| selector | string | Yes | — | CSS selector |
| timeout | number | No | 0 | Wait up to N milliseconds for visibility |
⭐ track_event_listeners — Start tracking browser event listeners
Must be called before disable_event_listeners().
⭐ disable_event_listeners — Disable event listeners
Stops all event listeners from running on the page.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| event_types | string[] | No | Specific event types to disable |
⭐ freeze_page — Stop further page changes
Forces the page to stop changing, so HTML snapshots reflect exactly what the scraper saw. Experimental.
Network and response tagging ⭐
Tagging captures background network traffic and exposes it to the parser. All tag_* functions are Browser worker only.
⭐ tag_response — Save one matching response
Saves the response data from a matching browser request.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| field | string | Yes | Name of the parser field to populate |
| pattern | RegExp or function | Yes | URL pattern or match function |
| opt.jsonp | boolean | No | Parse JSONP response bodies (auto-detected when possible) |
| opt.allow_error | boolean | No | Capture responses with non-2xx status codes |
⭐ tag_all_responses — Save every matching response
Saves the response data from every matching request as an array.
⭐ tag_script — Extract JSON embedded in a <script> tag
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| field | string | Yes | Parser field name |
| selector | string | Yes | Script tag selector |
⭐ tag_window_field — Tag a value on the browser window
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| field | string | Yes | Parser field name |
| key | string | Yes | window property to read |
⭐ tag_image — Capture an image URL from a DOM element
⭐ tag_video — Capture a video URL from a DOM element
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| field | string | Yes | Parser field name |
| selector | string | Yes | Element selector |
| opt.download | boolean | No | Download the video file |
⭐ tag_screenshot — Save a screenshot of the page
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| field | string | Yes | Parser field name |
| opt.filename | string | No | Output filename |
| opt.full_page | boolean | No | Defaults to true |
⭐ tag_download — Capture files downloaded by the browser
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string or RegExp | Yes | Pattern to match download requests |
⭐ tag_serp — Parse the page as a search engine result page
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| field | string | Yes | Parser field name |
| type | string | Yes | Parser type: google, bing, etc. |
⭐ capture_graphql — Capture and replay GraphQL queries
Captures a GraphQL request so you can replay it with different variables.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| options.payload | object | Yes | Key-value pairs that match the target request payload |
| options.url | RegExp | No | URL pattern for the GraphQL endpoint (defaults to /graphql/) |
Data collection
parse — Run the parser code
Runs the parser code and returns the structured result.
collect — Append a record to the dataset
Adds one record to the scraper’s output.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| data_line | object | Yes | Fields to collect |
| validate_fn | function | No | Callback that throws on invalid data |
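A sketch of a validation callback for collect(), assuming validate_fn is called with the record and signals a problem by throwing. The collect() stand-in is hypothetical and only models that behavior so the sketch runs outside Scraper Studio; the title and price fields are illustrative.

```javascript
// --- Local stand-in (hypothetical; delete inside Scraper Studio) ---
const dataset = [];
const collect = (data_line, validate_fn) => {
  if (validate_fn) validate_fn(data_line); // throws on invalid data
  dataset.push(data_line);
};

// --- Interaction code ---
// Validation callback: throws when a required field is missing or malformed.
const validate = (line) => {
  if (!line.title) throw new Error('title is required');
  if (typeof line.price !== 'number' || line.price < 0)
    throw new Error('price must be a non-negative number');
};

collect({ title: 'Blue mug', price: 12.5 }, validate); // accepted

let rejected = null;
try {
  collect({ title: '', price: 12.5 }, validate); // rejected: empty title
} catch (e) {
  rejected = e.message;
}
```

Keeping the checks in one named function makes it easy to reuse the same rules with set_lines(), which also accepts a validation callback.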
set_lines — Set output lines, overriding previous calls
Each call to set_lines() overrides the previous one. Useful when the scraper collects partial data and you want the last known state delivered if a later step throws.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| lines | object[] | Yes | Array of records |
| validate_fn | function | No | Validation callback, run once per line |
load_html — Load an HTML string into Cheerio
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| html | string | Yes | HTML to parse |
Marking a crawl as a failure
bad_input — Mark the input as invalid
Prevents any retries and reports error_code=bad_input.
blocked — Mark the page as blocked
Reports that the site refused access. error_code=blocked.
dead_page — Mark a URL as a dead link
Flags the page so it can be filtered from future collections. error_code=dead_page.
⭐ detect_block — Detect blocking conditions on the page
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| resource.selector | string | Yes | Element to check |
| condition.exists | boolean | No | Fail if the element exists |
| condition.has_text | string or RegExp | No | Fail if the element contains matching text |
Session and routing
country — Route through a specific country
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| code | string | Yes | Two-character ISO country code |
⭐ proxy_location — Fine-grained proxy location
Prefer country() unless you need precise geographic control.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| configuration.country | string | No | Two-character ISO country code |
| configuration.lat | number | No | Latitude, range [-85, 85] |
| configuration.long | number | No | Longitude, range [-180, 180] |
| configuration.radius | number | No | Radius in kilometers |
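Because the ranges above are easy to get wrong when coordinates come from input data, a pre-flight check can help. The helper below is hypothetical and not part of Scraper Studio; it checks a configuration object against the documented ranges, and additionally assumes the radius should be a positive number, which the table does not state.

```javascript
// Hypothetical helper: checks a proxy_location() configuration against the
// ranges in the parameter table. Returns a list of problems; an empty list
// means the configuration looks valid.
function check_proxy_location(configuration) {
  const problems = [];
  const { country, lat, long, radius } = configuration;
  if (country !== undefined && !/^[A-Za-z]{2}$/.test(country))
    problems.push('country must be a two-character code');
  if (lat !== undefined && !(lat >= -85 && lat <= 85))
    problems.push('lat must be within [-85, 85]');
  if (long !== undefined && !(long >= -180 && long <= 180))
    problems.push('long must be within [-180, 180]');
  // Assumption: a radius, when given, should be positive.
  if (radius !== undefined && !(radius > 0))
    problems.push('radius must be a positive number of kilometers');
  return problems;
}

const ok = check_proxy_location({ country: 'us', lat: 40.7, long: -74.0, radius: 25 });
const bad = check_proxy_location({ lat: 91, long: 200 });
// Inside Scraper Studio, call proxy_location(configuration) only when the list is empty.
```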
preserve_proxy_session — Reuse the proxy session across child stages
set_session_cookie — Set a cookie for the current session
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| domain | string | Yes | Cookie domain |
| name | string | Yes | Cookie name |
| value | string | Yes | Cookie value |
set_session_headers — Set extra HTTP headers
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| headers | object | Yes | Header key-value pairs |
Browser configuration ⭐
Browser worker only.
⭐ browser_size — Get the current browser window size
Returns {width, height} in pixels.
⭐ emulate_device — Emulate a mobile device
Switches the user agent, screen resolution, and device pixel ratio to match a named device.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| device | string | Yes | Device name, e.g. iPhone X, Pixel 2 |
Full list of supported device names
- Blackberry PlayBook / landscape
- BlackBerry Z30 / landscape
- Galaxy Note 3 / landscape
- Galaxy Note II / landscape
- Galaxy S III / S5 / S8 / S9+ (each with landscape)
- Galaxy Tab S4 / landscape
- iPad / iPad Mini / iPad Pro / iPad Pro 11 / iPad (gen 6) / iPad (gen 7) (each with landscape)
- iPhone 4, 5, 6, 6 Plus, 7, 7 Plus, 8, 8 Plus, SE, X, XR, 11, 11 Pro, 11 Pro Max, 12 / 12 Mini / 12 Pro / 12 Pro Max, 13 / 13 Mini / 13 Pro / 13 Pro Max (each with landscape)
- JioPhone 2 / landscape
- Kindle Fire HDX / landscape
- LG Optimus L70 / landscape
- Microsoft Lumia 550, 950 (950 with landscape)
- Nexus 4, 5, 5X, 6, 6P, 7, 10 (each with landscape)
- Nokia Lumia 520 / landscape, Nokia N9 / landscape
- Pixel 2, 2 XL, 3, 4, 4a (5G), 5 (each with landscape)
- Moto G4 / landscape
⭐ font_exists — Check browser font support
Asserts that the browser can render the given font family.
⭐ html_capture_options — Configure HTML capture
Controls how the HTML snapshot is captured.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| options.coordinate_attributes | boolean | No | Embed element coordinates as attributes |
embed_html_comment — Inject a comment into the page HTML
Embeds metadata inside HTML snapshots.
Debugging and observability
console — Log from interaction code
⭐ verify_requests — Monitor failed browser requests
Fires a callback on every failed browser request.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| callback | function | Yes | Called with {url, error, type, response} for each failed request |
Value constructors
Bright Data Scraper Studio provides typed constructors for structured output fields.
Image, Video, Money
| Constructor | Arguments | Use |
|---|---|---|
| Image(src) | src: image URL or data URI | Collect image data |
| Video(src) | src: video URL | Collect video data |
| Money(value, currency) | value: number, currency: ISO code | Collect monetary values |
URL
Standard Node.js URL class.
Parser functions
Parser code runs after interaction code calls parse(). It receives the captured HTML and any tagged data, and returns a single record (or array of records) to the interaction code. Parser code uses Cheerio, a jQuery-compatible HTML parser.
Globals available in parser code
| Name | Type | Description |
|---|---|---|
| $ | Cheerio instance | Loaded with the page HTML |
| input | object | Current stage input |
| location | object | Current browser location; field: href |
| parser | object | Values tagged during interaction (from tag_response, tag_script, and related) |
Cheerio helpers
Bright Data Scraper Studio adds custom Cheerio methods on top of the standard API.
$(selector).text_sane() — Normalize whitespace
Returns text() with all whitespace runs collapsed to a single space and trimmed.
$(selector).filter_includes(text) — Filter elements by text content
Filters a selection to elements whose text includes the given substring. Chainable with the rest of the Cheerio API.
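To make the behavior of the two helpers concrete, here is a plain-JavaScript model of what they are described as doing. These functions are not the Scraper Studio implementations; they operate on strings and arrays of strings rather than Cheerio selections, purely to illustrate the whitespace normalization and substring filtering described above.

```javascript
// Plain-JavaScript model of text_sane(): collapse whitespace runs, then trim.
const text_sane = (text) => text.replace(/\s+/g, ' ').trim();

// Plain-JavaScript model of filter_includes(): keep entries whose text
// contains the substring. Real parser code filters Cheerio elements instead.
const filter_includes = (texts, needle) => texts.filter((t) => t.includes(needle));

const raw = ['  Price:\n   $19.99 ', '\tIn   stock ', ' Price: $5 '];
const clean = raw.map(text_sane);               // normalized strings
const prices = filter_includes(clean, 'Price'); // only the price entries
```

In parser code the same idea would be expressed on a selection, along the lines of $('li').filter_includes('Price').text_sane(), with the selector adapted to the target page.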
Parser value constructors
Image, Video, and Money are also available in parser code and work the same way.
Shadow DOM support
Interaction commands that accept a selector also accept an array of selectors, letting you reach into Shadow DOM trees. Use this with click, wait, type, and other interaction functions.
When you pass an array:
- One selector must target the shadow host element
- Every selector after it is resolved inside that shadow root
For example, in the selector path ['my-shadow-host', 'button.submit'], my-shadow-host is the element with the shadow root attached, and button.submit is resolved inside that shadow root.
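A minimal sketch of a selector path in use, with the my-shadow-host and button.submit names from the description above. The wait() and click() stand-ins are hypothetical and only record calls so the sketch runs outside Scraper Studio; inside the IDE they are platform functions on a Browser worker.

```javascript
// --- Local stand-ins (hypothetical; delete inside Scraper Studio) ---
const calls = [];
const wait = (selector) => { calls.push(['wait', selector]); };
const click = (selector) => { calls.push(['click', selector]); };

// --- Interaction code ---
// Selector path: the shadow host first, then a selector resolved inside its shadow root.
const submit_button = ['my-shadow-host', 'button.submit'];
wait(submit_button);  // wait for the button inside the shadow root
click(submit_button); // then click it
```

Storing the path in a variable keeps the host and inner selector together when the same element is used by several commands.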
Browser-only functions
The following functions require Browser worker and throw not_supported_in_code_worker when called from a Code worker. Use this list to decide which worker your scraper needs.
| Category | Functions |
|---|---|
| Waits | wait, wait_any, wait_for_text, wait_visible, wait_hidden, wait_network_idle, wait_page_idle |
| Interaction | click, right_click, hover, mouse_to, type, press_key, select, scroll_to, scroll_to_all, load_more, close_popup, solve_captcha |
| Tagging | tag_response, tag_all_responses, tag_script, tag_window_field, tag_image, tag_video, tag_screenshot, tag_download, tag_serp, capture_graphql |
| Browser config | browser_size, emulate_device, font_exists, html_capture_options, freeze_page, track_event_listeners, disable_event_listeners |
Related
- Best practices: recommended patterns for writing fast, reliable scrapers
- Worker types: when to use Browser worker vs Code worker
- Basics of web scraping: core concepts (interaction, parsing, stages, and scale)
- Develop a scraper: step-by-step walkthrough of building a scraper in the IDE