Prerequisites
- Basic JavaScript familiarity (variables, functions, async control flow)
- An active Bright Data account
What are the two phases of a scraper?
A Bright Data Scraper Studio scraper runs in two phases per page:
- Interaction moves through the site to reach the data. That means sending GET or POST requests, following links, handling pagination, submitting forms, and, on a Browser worker, clicking, typing, scrolling, and waiting for elements to appear.
- Parsing reads the page HTML (or a captured JSON payload) and returns a structured record.
Call parse() from interaction code once the target page has loaded. That runs the parser code and returns its result. Then call collect() to append the record to the final dataset.
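A minimal sketch of that flow. navigate(), parse(), and collect() are Scraper Studio built-ins provided by the runtime; the stub implementations and the URL and parsed fields below are illustrative stand-ins so the control flow can be read (and run) outside the IDE:

```javascript
// Stand-ins for the Scraper Studio built-ins. In a real scraper these
// globals are provided by the runtime -- do not redefine them there.
const dataset = [];
const navigate = (url) => { /* the IDE loads `url` in the session */ };
const parse = () => ({ title: 'Example product', price: 9.99 }); // parser output stand-in
const collect = (record) => dataset.push(record);                // append to the dataset

// Interaction code: reach the target page, then parse and collect it.
navigate('https://example.com/product/1'); // hypothetical target URL
collect(parse()); // run the parser code, append its record to the dataset
```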
How do I structure a multi-stage scraper?
Many scrapes need more than one hop, for example “visit a search page, then follow each result URL, then extract each product”. Bright Data Scraper Studio handles this with stages. Each stage is a separate browser session, and next_stage({...}) queues a new input for the next stage.
The example below scrapes an ecommerce search across all result pages, following each listing to its detail page.
The flow, stage by stage:
- Stage 1 navigates to the search page and parses out the total number of pages, then calls next_stage({search_page}) once per result page. Each call becomes a new stage-2 input, fanning the scrape out across the result pages.
- Stage 2 navigates to each result page and parses out all listing URLs, then calls next_stage({listing_url}) once per listing. Each call becomes a new stage-3 input.
- Stage 3 navigates to each product page and calls collect(parse()) to add the record to the dataset.
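The fan-out above can be sketched as follows. next_stage(), parse(), navigate(), and collect() are Scraper Studio built-ins; the arrays, stub stage functions, page count, and listing URLs below only simulate the queueing so the shape of a three-stage scraper is visible outside the IDE:

```javascript
// Simulated stage queues. In Scraper Studio, next_stage() does this
// queueing for you and each input runs in its own session.
const stage2Inputs = [], stage3Inputs = [], dataset = [];

// Stage 1: read the page count, queue one stage-2 input per result page.
function stage1({ search_url }) {
  const total_pages = 3; // in the IDE: parse() the pager from the search page
  for (let page = 1; page <= total_pages; page++) {
    stage2Inputs.push({ search_page: `${search_url}?page=${page}` }); // next_stage({search_page})
  }
}

// Stage 2: queue one stage-3 input per listing URL on the result page.
function stage2({ search_page }) {
  const listing_urls = [`${search_page}/item-a`, `${search_page}/item-b`]; // in the IDE: parse() the links
  for (const listing_url of listing_urls) {
    stage3Inputs.push({ listing_url }); // next_stage({listing_url})
  }
}

// Stage 3: collect(parse()) on each product page.
function stage3({ listing_url }) {
  dataset.push({ url: listing_url }); // collect(parse())
}

stage1({ search_url: 'https://shop.example.com/search' }); // hypothetical URL
stage2Inputs.forEach(stage2);
stage3Inputs.forEach(stage3);
```

With 3 result pages and 2 listings per page, the simulation ends with 6 records; on the platform each queued input would run as an independent, parallel session.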
Fanning out with next_stage() is much faster than walking pagination serially inside one stage.
Which worker type should I use?
Bright Data Scraper Studio offers two worker types:
- Browser worker: a real headless browser. Needed when the page renders data with JavaScript, or when you need to click, scroll, type, or capture network traffic.
- Code worker: raw HTTP requests. Faster and cheaper, but cannot run JavaScript or interact with the page.
Browser-only commands (wait, click, scroll_*, tag_*, type, and more) will throw errors if you run them on a Code worker. See Worker types for the full list.
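To illustrate the distinction, the stub below mimics the behavior described above: a browser-only command like click() failing when the scraper runs on a Code worker. The stub and the WORKER_TYPE flag are illustrative only; inside Scraper Studio the real commands come from the worker runtime:

```javascript
// Illustrative stand-in: browser-only commands throw on a Code worker.
const WORKER_TYPE = 'code'; // this scraper is configured as a Code worker

function click(selector) {
  if (WORKER_TYPE !== 'browser') {
    throw new Error(`click(${selector}) is only available on a Browser worker`);
  }
  // on a Browser worker, the runtime clicks the element in the live page
}

let failed = false;
try {
  click('#load-more'); // would work on a Browser worker
} catch (e) {
  failed = true; // a Code worker cannot interact with the page
}
```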
How does Scraper Studio handle blocking and CAPTCHAs?
Scraping at scale runs into the same defenses every time: IP blocks, rate limits, CAPTCHAs, fingerprinting, and bot detection. Bright Data Scraper Studio runs every request through Bright Data’s proxy infrastructure and Web Unlocker API, so the scraper:
- Rotates through residential, ISP, datacenter, or mobile IPs based on scraper settings
- Retries blocked requests with a fresh peer session automatically
- Solves common CAPTCHAs when you call solve_captcha()
- Mimics real browser fingerprints on a Browser worker
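A sketch of where solve_captcha() fits in interaction code. solve_captcha(), parse(), and collect() are Scraper Studio built-ins; the stubs and the return shape modeled here are illustrative stand-ins, not the documented return value:

```javascript
// Stand-ins so the control flow can be read (and run) outside the IDE.
const dataset = [];
const solve_captcha = () => ({ found: true, solved: true }); // stand-in: the IDE solves it via Web Unlocker
const parse = () => ({ title: 'Example product' });          // parser output stand-in
const collect = (record) => dataset.push(record);

// After navigating, clear any CAPTCHA before parsing the page.
const captcha = solve_captcha();
if (captcha.found && !captcha.solved) {
  throw new Error('CAPTCHA could not be solved; retry with a fresh session');
}
collect(parse()); // safe to parse once the page is unblocked
```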
Related
- Worker types: Choose between Browser worker and Code worker
- Scraper Studio functions: Full reference for interaction and parser commands
- Develop a scraper: Step-by-step walkthrough of building a scraper in the IDE
- Best practices: Recommended patterns for fast, reliable scrapers