
How to find

  1. In the left-hand menu, click Scrapers
  2. Select Scraper Studio

How it works

Every scraper in Scraper Studio performs two core operations regardless of how it’s created:
  • Interaction: Navigating to a target URL, handling pagination, clicking elements, or sending HTTP requests.
  • Parsing: Reading the page content (HTML or JSON) and extracting structured fields into a defined output schema.
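
The two operations can be sketched in plain JavaScript as follows. This is an illustrative sketch only, not Scraper Studio's actual API: `fetchPage`, `parseProduct`, the example URL, and the product fields are all hypothetical names. The interaction step is stubbed with a canned JSON response so the example runs without network access.

```javascript
// Hypothetical sketch: these helpers are NOT part of Scraper Studio's API.

// Interaction: obtain the page content for a target URL.
// Stubbed with a canned JSON body so the example runs offline.
function fetchPage(url) {
  const cannedResponses = {
    "https://example.com/products/1": JSON.stringify({
      name: "  Widget  ",
      pricing: { amount: "19.99", currency: "USD" },
      tags: ["tools", "hardware"],
    }),
  };
  const body = cannedResponses[url];
  if (body === undefined) {
    throw new Error(`No canned response for ${url}`);
  }
  return body;
}

// Parsing: read the raw content and extract only the fields
// defined in the output schema (url, title, price, currency).
function parseProduct(url, rawBody) {
  const data = JSON.parse(rawBody);
  return {
    url,
    title: String(data.name).trim(),
    price: Number(data.pricing.amount),
    currency: data.pricing.currency,
  };
}

const targetUrl = "https://example.com/products/1";
const record = parseProduct(targetUrl, fetchPage(targetUrl));
console.log(record);
```

Keeping the two steps in separate functions mirrors the split described above: the interaction step can change (for example, adding pagination) without touching the code that maps content to the output schema.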

Development Modes

Scraper Studio offers two ways to build that logic:
Mode | How it works | Best for
AI Agent | Describe the data you want in plain language. The AI generates a schema, then writes the scraper code for you. | No-code users, fast prototyping
IDE | Write and test JavaScript directly in a browser-based code editor with debugging tools built in. | Developers who need full control
Both modes produce the same type of scraper. A scraper built by the AI Agent can be opened and edited in the IDE at any time. Likewise, any scraper, however it was built, can be updated using the Self-Healing Tool, which refactors code automatically from a plain-language prompt (e.g., “Add a price field to the output”).

When to use Scraper Studio vs. other Bright Data products

Scenario | Recommended product
Need data from a popular site with zero setup | Datasets Marketplace
Need a scraper built and fully maintained by Bright Data | Managed Services
Need to build a custom scraper yourself using AI or code, and run it on Bright Data’s infrastructure | Scraper Studio
Use Scraper Studio when the data you need isn’t in the Marketplace, you want ownership of the scraper logic, and you don’t want to manage proxies or infrastructure yourself.

Key trade-offs: AI Agent vs. IDE

Aspect | AI Agent | IDE
Setup time | Minutes - describe and run | Longer - write, test, debug
Code control | Limited - AI writes the code | Full - you own every line
Customization | Via Self-Healing Tool prompts | Direct JavaScript editing
Best for | Fast scraper creation, non-technical users | Complex logic, multi-stage scrapers, performance tuning

Cloud infrastructure trade-offs

Scraper Studio runs entirely on Bright Data’s infrastructure. This means:
  • No server setup or proxy management - all included.
  • Billing is based on page loads (CPM) - 1 CPM = 1,000 page loads. Browser Workers typically consume more page loads than Code Workers for equivalent tasks.
  • Parallel job limit - up to 1,000 batch jobs can run simultaneously. Additional jobs queue automatically.
  • Snapshot retention - batch collection results are stored for 16 days; real-time results for 7 days. Export your data before expiry.
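
The figures above lend themselves to a few small planning helpers. The sketch below is a rough illustration in plain JavaScript, not a Bright Data API. All function names are hypothetical, `pricePerCpm` is a placeholder you would take from your own plan, and two details are assumptions: that billing scales linearly with page loads, and that retention is counted in 24-hour days from the snapshot's creation time.

```javascript
// Illustrative helpers built from the limits listed above.
// Function names are hypothetical; none of this is Bright Data's API.

const PAGE_LOADS_PER_CPM = 1000;       // 1 CPM = 1,000 page loads
const MAX_PARALLEL_BATCH_JOBS = 1000;  // jobs beyond this limit are queued
const RETENTION_DAYS = { batch: 16, realtime: 7 };

// Convert a page-load count into CPM units.
function pageLoadsToCpm(pageLoads) {
  return pageLoads / PAGE_LOADS_PER_CPM;
}

// Estimate cost. pricePerCpm is a placeholder value, and linear
// (unrounded) billing is an assumption.
function estimateCost(pageLoads, pricePerCpm) {
  return pageLoadsToCpm(pageLoads) * pricePerCpm;
}

// Split a number of submitted batch jobs into running and queued.
function splitJobs(submitted) {
  const running = Math.min(submitted, MAX_PARALLEL_BATCH_JOBS);
  return { running, queued: submitted - running };
}

// Compute when a snapshot expires. Assumes retention is counted in
// 24-hour days from creation time.
function snapshotExpiry(createdAt, mode) {
  const days = RETENTION_DAYS[mode];
  if (days === undefined) {
    throw new Error(`Unknown collection mode: ${mode}`);
  }
  const msPerDay = 24 * 60 * 60 * 1000;
  return new Date(createdAt.getTime() + days * msPerDay);
}

console.log(pageLoadsToCpm(250000));
console.log(splitJobs(1200));
console.log(snapshotExpiry(new Date("2024-01-01T00:00:00Z"), "batch").toISOString());
```

For example, 250,000 page loads correspond to 250 CPM, and submitting 1,200 batch jobs at once would leave 200 of them queued until running jobs finish.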

Next steps