How do I detect dead pages reliably?
When using navigate(), add a dead_page() condition so the scraper does not retry pages that do not exist. Bright Data Scraper Studio automatically marks HTTP 404 responses as dead, but many sites return 200 with a “not found” template, so you must check for that template yourself.
Do not wrap wait() in a try/catch and call dead_page() from the catch block. A thrown wait() only tells you the selector did not appear within the timeout, not that the page is actually dead.
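A minimal interaction-code sketch of that check; the .not-found-template selector and the reason message are hypothetical examples, not names from the platform:

```js
// Sketch only: substitute the marker the real site uses for missing pages.
navigate(input.url);
wait('body');
// The site answers 200 for missing pages, so detect the template directly
// instead of catching a wait() timeout.
if (el_exists('.not-found-template')) {
    dead_page('page does not exist');
}
```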
How do I minimize requests to the browser?
Interaction commands like click, type, el_exists, el_is_visible, wait, and wait_visible each send a request to the browser. Combine selectors into a single call instead of chaining several calls.
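As a sketch, one combined CSS selector can replace several round-trips. Note that a comma-separated selector matches when any of its parts exists; the selectors here are hypothetical:

```js
// Three round-trips:
//   el_exists('#price'); el_exists('#old-price'); el_exists('.sale-price');
// One round-trip — true if any of the three elements exists:
if (el_exists('#price, #old-price, .sale-price')) {
    click('#add-to-cart');
}
```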
How do I paginate without blocking parallelization?
When a site has paginated results and you want data from every page, call rerun_stage() once from the root page for every page you need. Do not call rerun_stage() from inside each page as you walk the pagination: that serializes the work and Bright Data Scraper Studio cannot parallelize the requests.
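A root-stage sketch of that fan-out; the ?page=N URL scheme and the fixed page count are assumptions for illustration — in practice, read the count from the first page:

```js
navigate(input.url);
wait('.results');
const TOTAL_PAGES = 10; // hypothetical: derive this from the first page
for (let page = 2; page <= TOTAL_PAGES; page++) {
    // Each call queues an independent run of this stage, which the
    // platform can execute in parallel.
    rerun_stage(`${input.url}?page=${page}`);
}
```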
How do I close popups without waiting for them?
Use close_popup('popup_selector', 'close_button_selector') to register a background watcher that closes popups whenever they appear. Do not poll for a popup with wait_visible() before each interaction: popups can appear at any time, and explicit waits add latency on every step.
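A short sketch; both selectors are hypothetical examples:

```js
// Register the watcher once, before the interactions it protects.
close_popup('.newsletter-modal', '.newsletter-modal .close-btn');
navigate(input.url);
click('#load-more'); // no wait_visible() poll needed before this
```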
How do I wait for a tagged response before parsing?
When you use tag_response() to capture a background API call, follow it with wait_for_parser_value() to make sure the request has finished before you read parser. Without the wait, the parser may run before the response has arrived and parser.<field> will be undefined.
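An interaction-code sketch of the pairing; the field name and URL pattern are hypothetical:

```js
// Register the tag before navigating so the response is not missed.
tag_response('product', /\/api\/product\//);
navigate(input.url);
// Block until the tagged response has arrived.
wait_for_parser_value('product');
collect(parse()); // parser code can now read parser.product safely
```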
Should I throw custom error messages?
No. Let built-in errors from Bright Data Scraper Studio bubble up. They include the selector, the timeout, and the stage, which is more useful than a hand-written “Page not loaded properly”. Only throw a custom error when you are checking a domain-specific condition that the platform cannot detect on its own, such as a missing product title.
How do I handle slow websites without over-extending timeouts?
Keep the default 30-second timeout for most waits. If a specific page is consistently slow, raise it to 45 or 60 seconds. Do not push beyond 60 seconds: a slow peer is usually the cause, and Bright Data Scraper Studio automatically retries with a fresh peer session when a page reports a timeout error.
Should I build my own retry loop?
No. Bright Data Scraper Studio handles retries at the job level with a new peer session. A custom retry loop inside your scraper reuses the same session, which is often why the first attempt failed. Report the error and let the platform retry.
Should I wrap parser expressions in try/catch?
No. Use optional chaining (?.) and nullish coalescing (??) instead. A silent try/catch around a property access hides real bugs, and a try/catch around a wait() wastes browser time.
How do I extract values from a set of elements in parser code?
Use toArray().map() instead of each(). It is shorter, returns a real array, and reads as a single expression.
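A parser-code sketch, assuming the values live under a hypothetical .review .text selector:

```js
// each() forces mutation into an outer array; toArray().map() is one
// expression that returns a plain JavaScript array of strings.
let reviews = $('.review .text')
    .toArray()
    .map(el => $(el).text().trim());
```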
How do I normalize text in parser code?
Call $(selector).text_sane(). Bright Data Scraper Studio adds this custom method to the Cheerio prototype: it collapses every run of whitespace to a single space and trims the result. For numeric extraction, strip non-digits with a regex.
Related
Scraper Studio functions
Full reference for Bright Data Scraper Studio interaction and parser commands
Worker types
Choose between Browser worker and Code worker for your scraper