This page answers the questions the Bright Data support team hears most often about Scraper Studio. If you need a walkthrough instead of a quick answer, start with Understanding Scraper Studio.

General

A Bright Data web scraper is an automated script that collects public web data at scale through Bright Data’s proxy and unblocking infrastructure. It returns the collected data in a structured format (JSON, NDJSON, CSV, XLSX, Parquet) and can deliver it to an API endpoint, webhook, cloud storage, or SFTP. Bright Data maintains hundreds of pre-built scrapers for popular sites in the Scrapers Library.
Bright Data Scraper Studio is a cloud-hosted environment for building custom scrapers. It offers two modes: an AI Agent that generates a scraper from a natural-language description, and an IDE where you write JavaScript directly. Both modes run on the same Bright Data proxy and unblocking infrastructure. See Understanding Scraper Studio.
The Scrapers Library contains pre-built scrapers Bright Data maintains for popular sites such as Amazon, LinkedIn, and Instagram. Bright Data Scraper Studio is the environment you use to build custom scrapers when the site you need is not in the library.
Yes. A single scraper can navigate to any URL you pass in as input. If you need different extraction logic per site, use multiple stages (next_stage()) or build one scraper per site.
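Outside the IDE, the per-site branching idea can be sketched in plain JavaScript. This is only an illustration of routing inputs by hostname; the site names and parser labels are made up, and in a real multi-stage scraper you would hand off with next_stage() instead:

```javascript
// Illustrative only: pick an extraction routine based on the input URL's host.
// The domains and parser names below are hypothetical placeholders.
function pickParser(inputUrl) {
  const host = new URL(inputUrl).hostname.replace(/^www\./, "");
  const parsers = {
    "example-shop.com": "product_parser",
    "example-news.com": "article_parser",
  };
  return parsers[host] || "generic_parser";
}

console.log(pickParser("https://www.example-shop.com/item/42")); // "product_parser"
```

The alternative named in the answer, one scraper per site, avoids this branching entirely at the cost of maintaining more scrapers.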

Inputs, outputs, and schemas

An input is the parameter set Bright Data Scraper Studio passes into the scraper for a single run. Typical inputs are a URL, a search keyword, a product ID or ASIN, a profile handle, or a date range. Multiple inputs can be passed in one job via CSV upload or the API.
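When passing multiple inputs through the API, they go in as a JSON array, one object per input. A hedged sketch, where the URLs are placeholders and the field names must match whatever input schema your scraper defines:

```javascript
// Sketch: a batch of inputs for a single job. The URLs are placeholders;
// the "url" field is an example of a schema-defined input, not a fixed name.
const urls = [
  "https://targetwebsite.com/product/1/",
  "https://targetwebsite.com/product/2/",
  "https://targetwebsite.com/product/3/",
];
const batch = urls.map((url) => ({ url }));
const payload = JSON.stringify(batch); // body of the trigger request
console.log(payload);
```

A CSV upload in the control panel expresses the same thing as one column per input field and one row per input.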
The output is the structured data the scraper returns for an input. Bright Data Scraper Studio delivers output in JSON, NDJSON, CSV, XLSX, or Parquet based on the scraper’s delivery preferences.
One input can generate multiple records. For example, if you submit 5 product listing URLs and each listing page contains 20 products, the scraper returns 100 records from 5 inputs. The statistics page counts records, not inputs.
A search scraper takes a keyword as input instead of a URL. Bright Data Scraper Studio runs a search on the target site and extracts data from the result pages. Use a search scraper when you do not have specific URLs to scrape.
A discovery scraper collects data from listing pages such as search results, category pages, or directories. It extracts fields that appear directly on the listing (titles, prices, ratings) and can also collect product URLs or IDs for a follow-up product-page scrape.
When the input or output schema changes, the scraper must be updated to match. If you trigger the scraper before Bright Data has updated it, you will see an input_schema_incompatible or output_schema_incompatible error. To trigger the scraper anyway and ignore the schema mismatch, click Trigger anyway in the UI or add a parameter to your API request:
  • Output schema incompatible: override_incompatible_schema=1
  • Input schema incompatible: override_incompatible_input_schema=1
curl "https://api.brightdata.com/dca/trigger?scraper=ID_COLLECTOR&queue_next=1&override_incompatible_schema=1" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer API_KEY" \
  -d '[{"url":"https://targetwebsite.com/product_id/"}]'
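The same request can be assembled programmatically. This sketch only builds the trigger URL and request body from the curl example above; the scraper ID and target URL are placeholders, and the actual POST (with an Authorization: Bearer header) is left out:

```javascript
// Sketch: assemble the trigger URL and body from the curl example.
// ID_COLLECTOR and the target URL are placeholders, not real values.
const params = new URLSearchParams({
  scraper: "ID_COLLECTOR",
  queue_next: "1",
  override_incompatible_schema: "1",
});
const triggerUrl = `https://api.brightdata.com/dca/trigger?${params}`;
const body = JSON.stringify([{ url: "https://targetwebsite.com/product_id/" }]);

console.log(triggerUrl);
// You would then POST `body` to `triggerUrl` with
// Content-Type: application/json and Authorization: Bearer <API_KEY>.
```

Swap override_incompatible_schema for override_incompatible_input_schema when it is the input schema that changed.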

Building and editing scrapers

You have three options in Bright Data Scraper Studio:
  • Build with the AI Agent: describe the data you want in plain language. See Scraper Studio AI Agent.
  • Build in the IDE: write JavaScript directly. See Develop a scraper with the IDE.
  • Request a managed scraper: Bright Data’s data team builds and maintains the scraper for you.
Yes. Every scraper, including ones the AI Agent generated, can be opened and edited in the Bright Data Scraper Studio IDE. You can change extraction logic, modify selectors, add or remove output fields, and tune performance. If you prefer not to write code, use the Self-Healing tool to request changes in plain language.
Pass the AI Agent a target URL and an optional description of the data you want. The agent asks clarifying questions, generates an output schema for your review, and writes the full scraper code once you approve the schema. You can then run the scraper immediately or schedule recurring runs. See Scraper Studio AI Agent for a full walkthrough.

Running and triggering scrapers

Bright Data Scraper Studio supports three trigger methods:
  • By API: regular request, queue request, or replace request
  • Manually: from the control panel
  • On a schedule: hourly, daily, weekly, or custom
See Initiate collection and delivery.
A queued request tells Bright Data Scraper Studio to wait until the previous request for the same scraper finishes before starting the next one. Use it when you want serial execution instead of running multiple jobs in parallel.
Bright Data Scraper Studio runs up to 1,000 batch jobs in parallel per scraper. Additional jobs queue automatically and start as capacity frees up. See Scraper Studio specifications for full limits.
In the Bright Data Scraper Studio dashboard, click the Bug icon under Failed crawls to open the scraper in the IDE. Failed inputs appear in the Last errors tab with the exact error message and error code. Bright Data stores the last 1,000 errors per virtual job so you can re-run failed inputs and diagnose the issue.

Billing and limits

CPM stands for “cost per mille”: one CPM unit is 1,000 page loads. Bright Data Scraper Studio bills page loads in CPM units. Current rates are on the pricing page.
A billable event is any operation that causes Bright Data Scraper Studio to load a page, perform a network request, or download a media file:
  • navigate()
  • request()
  • load_more()
  • Media file download (billed per GB, separate from CPM)
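The CPM arithmetic can be written out as a small sketch. The rate here is a made-up placeholder, not Bright Data's actual price; check the pricing page for real rates:

```javascript
// Hypothetical CPM arithmetic: cost = (page loads / 1000) * rate per 1,000 loads.
// The 1.5 rate is a placeholder, not a real Bright Data price.
function cpmCost(pageLoads, ratePerMille) {
  return (pageLoads / 1000) * ratePerMille;
}

console.log(cpmCost(25000, 1.5)); // 37.5
```

Media file downloads would be added on top, since they are billed per GB rather than per page load.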
The free trial includes 100 records. A record is one row of output, not one page load, so the trial covers more than 100 page loads for scrapers that return multiple records per input.

Snapshots and data retention

Snapshot retention depends on the collection type:
  • Batch collections: 16 days
  • Real-time collections: 7 days
After that, snapshots are permanently deleted. Bright Data does not recover expired data. Download or export your data before the retention window closes, or configure the scraper to deliver results automatically via webhook, API download, or cloud storage.
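A snapshot's expiry date follows directly from its creation time and the retention window above. A minimal sketch, assuming the window is a whole number of days from creation:

```javascript
// Retention windows from the docs: 16 days (batch), 7 days (real-time).
const RETENTION_DAYS = { batch: 16, realtime: 7 };

function expiresAt(createdAtIso, collectionType) {
  const ms = RETENTION_DAYS[collectionType] * 24 * 60 * 60 * 1000;
  return new Date(Date.parse(createdAtIso) + ms).toISOString();
}

console.log(expiresAt("2024-01-01T00:00:00.000Z", "batch")); // "2024-01-17T00:00:00.000Z"
```

Anything you still need after that date has to be downloaded, or delivered automatically, before it is reached.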

Reporting issues

Open the scraper in the Bright Data Scraper Studio control panel and select Report an issue from the three-dots menu. Bright Data routes the ticket to a different team based on the issue type:
  • Data (missing fields, missing records, parsing errors): routed to the scraper engineer. Available only for managed scrapers.
  • Collection and delivery (incomplete delivery, slow scraper): routed to the support team.
  • Other (UI issue, product question): routed to the account manager.
Include the affected job ID, a description of the problem, and a screenshot or file if it helps show the issue.
Include:
  • The issue category (wrong data, missing records, delivery problem, IDE issue, other)
  • A description of the exact problem
  • The affected job ID
  • A screenshot or file that shows the issue, if possible
Bright Data opens a ticket automatically and the R&D team handles it.
You receive an email when a Bright Data engineer starts building the scraper and another email when it is ready. You can also track the status on the Scrapers dashboard.
