Web Scraper IDE FAQs

Find answers to common questions about Bright Data’s Web Scraper IDE, including setup, troubleshooting, and best practices for developing custom data scrapers.

What is a Bright Data Web Scraper?

Bright Data Web Scrapers are automated tools that let businesses collect all types of public online data at scale, while heavily reducing in-house spending on proxy maintenance and scraper development.

The Web Scraper delivers enormous amounts of raw data in a structured format and integrates with existing systems, making the data immediately usable for competitive, data-driven decisions.

Bright Data has developed hundreds of Web Scrapers customized to popular platforms.

What is Web Scraper IDE?

Web Scraper IDE is an integrated development environment that puts public web data at your fingertips, at any scale. With it, you can:

  • Build your scraper in minutes
  • Debug and diagnose with ease
  • Bring it to production quickly
  • Write browser scripts in simple JavaScript

What is an “input” when using a Web Scraper?

When collecting data, your “input” is the set of parameters you enter to run your collection. These can include keywords, URLs, search terms, product IDs, ASINs, profile names, check-in and check-out dates, etc.
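
For illustration, a hypothetical input set for an e-commerce scraper might look like the following. The field names are examples only, not a fixed schema; your scraper defines its own input schema.

[
  {"url": "https://targetwebsite.com/product_id/"},
  {"keyword": "wireless headphones", "country": "US"}
]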

What is an “output” when using a Web Scraper?

The output is the data you’ve collected from a platform based on your input parameters. You’ll receive your data as JSON, NDJSON, CSV, or XLSX.
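
For example, an NDJSON delivery contains one JSON record per line. The fields below are hypothetical and depend on the output schema of your scraper:

{"url":"https://targetwebsite.com/product_id/","title":"Example product A","price":19.99}
{"url":"https://targetwebsite.com/other_product_id/","title":"Example product B","price":24.5}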

How many free records are included with my free trial?

Each free trial includes 100 records (note: 100 records does not mean 100 page loads).

Why did I receive more statistic records than inputs?

You’ll always receive a higher number of records than the number of inputs you’ve requested, since a single input (such as a keyword or a category URL) can produce multiple records.

What are the most frequent data points collected from social media?

The most frequently collected data points include:

  • Number of followers
  • Average number of likes per post
  • Level of engagement
  • Account theme
  • Social and demographic portrait of the audience
  • Social listening: keyword/brand mentions, sentiment, and viral trends

Can I collect data from multiple platforms?

Yes, we can collect data from large numbers of websites at the same time.

Can I add additional information to my Web Scraper?

Yes. You can ask your account manager for help, or open a ticket for the specific Web Scraper by selecting “Report an issue” and requesting that fields be added to or removed from your Web Scraper.

What is a search scraper?

In cases where you don’t know a specific URL, you can search for a term and get data based on that term.

What is a discovery scraper?

With a discovery scraper, you enter one or more URLs and collect all the data from those pages. You’ll receive data without having to specify a particular product or keyword.

Can I change the code in the IDE by myself?

Yes. The code is written in JavaScript, and for self-managed scrapers you can change it to fit your requirements.

What are the options to initiate requests?

There are three options for initiating requests (see the sketch after this list):

  • Initiate by API: regular request, queue request, or replace request
  • Initiate manually
  • Schedule mode
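
As a minimal sketch of the API option, here is a JavaScript (Node 18+, ES module) call to the trigger endpoint shown later on this page. ID_COLLECTOR and API_KEY are placeholders you’d replace with your own values:

// Trigger a scraper run via the API (regular request).
const res = await fetch('https://api.brightdata.com/dca/trigger?scraper=ID_COLLECTOR', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer API_KEY',
  },
  // The body is a JSON array of inputs, matching your input schema.
  body: JSON.stringify([{ url: 'https://targetwebsite.com/product_id/' }]),
});
console.log(await res.json()); // response shape depends on your scraper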

How to start using the Web Scraper?

There are two ways to use the data collection tool:

  • Develop a self-managed scraper on your own
  • Request a custom dataset

What is a queue request?

When you send more than one API request, a “queue request” means that your next request starts automatically after the first one completes, and so on for all subsequent requests (see the sketch below).
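
Here is a hedged sketch of queue requests in JavaScript, using the queue_next=1 parameter that also appears in the curl example further down this page. With it, each triggered job starts only after the previous one finishes; ID_COLLECTOR and API_KEY are placeholders:

// Each batch is triggered with queue_next=1, so the platform runs the
// jobs one after another instead of in parallel.
const batches = [
  [{ url: 'https://targetwebsite.com/product_id/' }],
  [{ url: 'https://targetwebsite.com/other_product_id/' }],
];
for (const batch of batches) {
  await fetch('https://api.brightdata.com/dca/trigger?scraper=ID_COLLECTOR&queue_next=1', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': 'Bearer API_KEY',
    },
    body: JSON.stringify(batch),
  });
}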

What is a CPM?

CPM (cost per mille, i.e. per thousand) is the billing unit for page loads: 1 CPM = 1,000 page loads. For example, a job that performs 25,000 page loads consumes 25 CPM.

When building a scraper, what is considered as a billable event?

Billable events:

  • navigate()
  • request()
  • load_more()
  • Media file download (to be added later)
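
To make this concrete, here is a hedged sketch of IDE interaction code. navigate() and load_more() come from the billable-events list above; collect() and parse() are assumed helper names for saving parsed data and may differ in your scraper:

navigate(input.url);   // billable event: one page load
load_more();           // billable event: load additional results on the page
collect(parse());      // parsing and collecting are not in the billable list above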

How can I confirm that someone is working on the new Web Scraper I requested?

You’ll receive an email confirming that a developer is working on your new Web Scraper, and you’ll be notified when your scraper is ready.

The status of the request can also be found on your dashboard.

How to report an issue in the Web Scraper IDE?

You can use the “Report an issue” form to communicate any issues you have with the platform, the scraper, or the dataset results.

Tickets are assigned to different departments depending on the selected issue type, so make sure to choose the most relevant one.

  1. Select the job ID of the dataset with the issue.
  2. Select the type of issue:

Data (only available for managed scrapers; tickets are sent directly to your scraper engineer):

  • Missing fields
  • Missing records
  • Missing values
  • Parsing issues: the dataset results are incorrect

Collection and Delivery (these tickets are addressed to our support agents):

  • Incomplete delivery: something went wrong during the delivery
  • Scraper is slow: the scraper is collecting results slowly or is stuck

Other (these tickets are addressed to your account manager):

  • UI issue: the UI does not operate correctly
  • Product question: general questions about using the Web Scraper product
  • Something else is going wrong

  3. (Parsing issues) Use the red “bug” icon to indicate where the incorrect results are.
  4. (Parsing issues) Enter the results you expect to receive.
  5. Write a description of what went wrong and the URL where the data is collected.
  6. If needed, attach an image to support your report.

I updated the input/output schema of my managed scraper. Can I use it while Bright Data updates my scraper?

When the input/output schema is updated, the scraper needs to be updated to match the new schema. If the scraper is still being updated, you’ll see an “Incompatible input/output schema” error:

{"error":{"code":"input(output)_schema_incompatible" ... }}

If you want to trigger the scraper while ignoring the schema change, you can click “Trigger anyway” in the UI. Via the API, add one of the following parameters when triggering the scraper:

  • Output schema incompatible: override_incompatible_schema=1
  • Input schema incompatible: override_incompatible_input_schema=1

curl "https://api.brightdata.com/dca/trigger?scraper=ID_COLLECTOR&queue_next=1&override_incompatible_schema=1" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer API_KEY" \
  -d "[{\"url\":\"https://targetwebsite.com/product_id/\"}]"
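A hedged JavaScript sketch of the same flow: trigger the scraper, and if the response carries the incompatible-schema error code, retry with the override parameter. The exact error-code string is assumed from the error example above; ID_COLLECTOR and API_KEY are placeholders:

async function trigger(extraParams, inputs) {
  const res = await fetch(
    'https://api.brightdata.com/dca/trigger?scraper=ID_COLLECTOR' + extraParams,
    {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer API_KEY',
      },
      body: JSON.stringify(inputs),
    }
  );
  return res.json();
}

const inputs = [{ url: 'https://targetwebsite.com/product_id/' }];
let out = await trigger('', inputs);
// Matches input_schema_incompatible / output_schema_incompatible (see above).
if (out.error && out.error.code.includes('schema_incompatible')) {
  out = await trigger('&override_incompatible_schema=1', inputs);
}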

How can I debug real-time scrapers?

We store the last 1,000 errors inside the virtual job record so you can see examples of the inputs that failed (there’s a control panel button to view these errors in the IDE).

You’ll already know which inputs failed because you received an “error” response for them. You can re-run those inputs manually in the IDE to see what happened, much like providing a cURL request example when the unblocker isn’t behaving correctly.

What should I do if I face an issue with a Web Scraper?

Select “Report an issue” from the Bright Data Control Panel. Once you report your issue, a ticket is automatically assigned to one of our 14 developers, who monitor all tickets on a daily basis. Make sure to provide details of the problem; if you are not sure, please contact your account manager. Once you report an issue, you don’t need to do anything else, and you’ll receive an email confirming that the issue was reported.

When “reporting an issue”, what information should I include in my report?

Please provide the following information when reporting an issue:

  • Select the type of problem you’re facing (for example: wrong results, missing data points, results never loaded, delivery issue, UI issue, scraper is slow, IDE issue, or other)
  • Please describe in detail the problem that you are facing
  • You may upload a file that describes the problem

After reporting an issue, we’ll automatically open a ticket that will be promptly handled by our R&D Department.

What is a Data Collector?

In the past, we referred to all of our scraping tools as “Collectors.” A Collector is essentially a web scraper that consists of both interaction code and parser code. It can operate as an HTTP request or in a real browser, with all requests routed through our unlocker network to prevent blocking.

Over time, we developed a Dataset Unit that builds on top of one or more Collectors. For example, with a single Collector (direct request), you can scrape a specific URL—such as a product page from an e-commerce site—and receive the parsed data. In more complex scenarios, multiple Collectors can work together, such as when discovering and scraping categories, followed by collecting data on every product within those categories.
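As an illustrative sketch of that multi-Collector flow, assuming IDE helpers next_stage(), parse(), and collect() (the names may differ in your environment):

// Stage 1 (discovery): collect product links from a category page and
// hand each one to the next stage.
navigate(input.url);
const { product_links } = parse(); // assumes the parser exposes product_links
for (const url of product_links) next_stage({ url });

// Stage 2 (PDP): scrape each discovered product page and collect the result.
navigate(input.url);
collect(parse());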

How to Create a Data Collector?

You have a few options to create and configure a Data Collector:

  1. Using the Web Scraper IDE: You can design and structure your parser as individual Collectors or as a single Collector with multiple steps. To get started:
  • Click on the “Web Data Collection” icon on the right.
  • Navigate to the “My Scrapers” tab.
  • Select the “Develop a Web Scraper (IDE)” button.

From here, you can build from scratch or explore available templates for guidance. Start here: Create a Data Collector

  2. Requesting a Custom Dataset: If you prefer us to handle it, you can request a custom dataset, and we’ll create the Data Collectors needed to deliver it. To do this, click on the “Request Datasets” button under the “My Datasets” tab and choose the option that best suits your needs. Start here: Request a Custom Dataset

Any system limitations?

We have a limit of 100 parallel-running jobs. When more than 100 jobs are triggered, the additional jobs are placed in a queue and wait until the earlier ones finish.
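
Jobs beyond the limit queue automatically on the platform side, but if you prefer to stay under it from your side, here is a minimal client-side throttling sketch in generic JavaScript (not a Bright Data API):

// Run async task functions with at most `limit` in flight at once.
async function runLimited(tasks, limit = 100) {
  const results = [];
  let next = 0;
  async function worker() {
    while (next < tasks.length) {
      const i = next++;              // claim the next task index
      results[i] = await tasks[i](); // e.g. a function that calls the trigger endpoint
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, tasks.length) }, worker));
  return results;
}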

How do I use the AI code generator?

Overview
The goal of this feature is to generate a custom code template tailored to your target website. Simply enter the target URL, and we’ll automatically generate a ready-to-use code template that you can edit or run as needed.

How does it work?
Simply enter your target URL and click “Generate Code.” Once the code is ready, it will appear in the IDE tab; there’s no need to wait around while the AI processes your request. You’ll receive an email notification as soon as the code is ready.

Note: This feature works on PDP (Product Detail Page) URLs: when you already know your target URL, it generates parser code accordingly. It is not ideally suited for “discovery” use cases, where the goal is to find those target URLs.
