Find answers to common questions about Bright Data’s Web Scraper IDE, including setup, troubleshooting, and best practices for developing custom data scrapers.
What is a Bright Data Web Scraper?
Bright Data Web Scrapers are automated tools that enable businesses to collect all types of public online data at mass scale while sharply reducing in-house spending on proxy maintenance and development.
The Web Scraper delivers large volumes of raw data in a structured format and integrates with existing systems, so the data is immediately usable for competitive, data-driven decisions.
Bright Data has developed hundreds of Web Scrapers customized to popular platforms.
What is Web Scraper IDE?
Web Scraper IDE is an integrated development environment that puts public web data, at any scale, at your fingertips. With the IDE, you can develop, test, and run custom data scrapers directly in your browser.
What is an “input” when using a Web Scraper?
When collecting data, your “input” is the set of parameters you enter to run your collection. It can include keywords, a URL, search terms, a product ID, an ASIN, a profile name, check-in and check-out dates, etc.
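For example, a batch of input rows for a hypothetical keyword scraper might look like the sketch below (the field names are illustrative; the actual fields are defined by your scraper’s input schema):

```js
// Hypothetical input rows; field names depend on the scraper's input schema.
const inputs = [
  { keyword: 'wireless headphones' },
  { keyword: 'bluetooth speaker' },
];
// A travel-site scraper might instead take rows such as:
// { url: 'https://example.com/hotel/123', check_in: '2025-06-01', check_out: '2025-06-05' }
```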
What is an “output” when using a Web Scraper?
The output is the data you’ve collected from a platform based on your input parameters. You’ll receive your data in JSON, NDJSON, CSV, or XLSX format.
How many free records are included with my free trial?
Each free trial includes 100 records (note: 100 records does not mean 100 page loads).
Why did I receive more records in my statistics than inputs?
You’ll always receive a higher number of records than the inputs you’ve requested, because a single input can yield multiple records (for example, one search term can return many matching products).
What are the most frequent data points collected from social media?
Can I collect data from multiple platforms?
Yes, we can collect data from large numbers of websites at the same time.
Can I add additional information to my Web Scraper?
Yes. You can ask your account manager for help, or open a ticket for the specific Web Scraper by selecting ‘Report an issue’ and requesting that fields be added to or removed from your Web Scraper.
What is a search scraper?
A search scraper is used when you don’t know a specific URL: you search for a term and collect data based on that term.
What is a discovery scraper?
With a discovery scraper, you enter one or more URLs and collect all the data from those pages. You’ll receive data without having to specify a particular product or keyword.
Can I change the code in the IDE by myself?
Yes. The code is written in JavaScript, and for self-managed scrapers you can change it according to your requirements.
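As a minimal sketch, assuming the cheerio-style $ selectors the IDE exposes in parser code (the selectors and field names below are placeholders), adding a field to the output could look like this:

```js
// IDE parser-code sketch (runs inside the IDE's parser context).
// Assumes a cheerio-style $; selectors and field names are placeholders.
return {
  title: $('h1').text().trim(),
  price: $('.price').first().text().trim(),
  description: $('#description').text().trim(), // newly added field
};
```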
What are the options to initiate requests?
We have three options to initiate requests:
How to start using the Web Scraper?
There are two ways to use the data collection tool:
What is a queue request?
When you send more than one API request, a “queue request” means that your next request starts automatically after the previous one completes, and so on for all remaining requests.
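As an illustration, queueing could be requested when triggering jobs through the API (the endpoint, collector ID, token, and queue_next=1 parameter below are placeholders and assumptions; check your account’s API documentation for the exact trigger URL):

```js
// Sketch: trigger two jobs so the second starts after the first completes.
// Endpoint, collector ID, token, and queue_next=1 are assumptions.
const trigger = (input) =>
  fetch('https://api.brightdata.com/dca/trigger?collector=YOUR_COLLECTOR_ID&queue_next=1', {
    method: 'POST',
    headers: { Authorization: 'Bearer YOUR_API_TOKEN', 'Content-Type': 'application/json' },
    body: JSON.stringify([input]),
  });

await trigger({ url: 'https://example.com/page/1' });
await trigger({ url: 'https://example.com/page/2' }); // queued behind job 1
```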
What is a CPM?
CPM is a unit of 1,000 page loads. For example, a job that performs 5,000 page loads consumes 5 CPM.
When building a scraper, what is considered as a billable event?
Billable events:
- navigate()
- request()
- load_more()
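For instance, in the hypothetical interaction-code sketch below, each navigate() and load_more() call counts as one billable event, while parsing and collecting the loaded data does not (the collect() and parse() usage is illustrative):

```js
// Hypothetical interaction code: two billable events occur in this run.
navigate(input.url);  // billable: loads the target page
load_more();          // billable: loads additional items on the page
collect(parse());     // not billable: parses and collects the loaded data
```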
How can I confirm that someone is working on the new Web Scraper I requested?
You’ll receive an email that the developer is working on your new Web Scraper, and you will be notified when your scraper is ready.
The status of the request can also be found on your dashboard.
How to report an issue on the Web Scraper IDE?
You can use this form to communicate any issues you have with the platform, the scraper, or the dataset results.
Tickets are assigned to different departments depending on the selected issue type, so please make sure to choose the most relevant type.
Select a job ID: the job whose dataset has the issue.
Select the type of issue:
- Data: only available for managed scrapers; these tickets are sent directly to your scraper engineer.
- Collection and Delivery: these tickets are addressed to our support agents.
- Other: these tickets are addressed to your account manager.
- (Parsing issues) Use the red “bug” icon to mark where the incorrect results appear
- (Parsing issues) Enter the results you expected to receive
- Write a description of what went wrong and include the URL where the data is collected
- If needed, attach an image to support your report
I updated the input/output schema of my managed scraper. Can I use it while Bright Data updates my scraper?
When the input/output schema is updated, the scraper must be updated to match the new schema. If the scraper update is still in progress, you’ll see an ‘Incompatible input/output schema’ error.
If you want to trigger the scraper anyway, ignoring the schema change, click ‘Trigger anyway’ in the UI. Via the API, add the override_incompatible_schema=1 (or override_incompatible_input_schema=1) parameter when triggering the scraper.
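A minimal sketch of such an API trigger (the endpoint, collector ID, and token below are placeholders; consult your control panel for the exact trigger URL):

```js
// Trigger the scraper while overriding the schema-compatibility check.
// Endpoint, collector ID, and token are placeholders (assumptions).
const res = await fetch(
  'https://api.brightdata.com/dca/trigger' +
    '?collector=YOUR_COLLECTOR_ID&override_incompatible_schema=1',
  {
    method: 'POST',
    headers: { Authorization: 'Bearer YOUR_API_TOKEN', 'Content-Type': 'application/json' },
    body: JSON.stringify([{ url: 'https://example.com/product/123' }]),
  }
);
console.log(await res.json()); // on success, expect a job/collection ID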
How can I debug real time scrapers?
We store the last 1,000 errors inside the virtual job record, so you can see examples of inputs that failed (there’s a control-panel button to view the errors in the IDE).
You’ll already know which inputs failed, because they returned an ‘error’ response. You can re-run them manually in the IDE to see what happened, much like providing a cURL request example when the unblocker isn’t behaving as expected.
What should I do if I face an issue with a Web Scraper?
Select “Report an issue” in the Bright Data Control Panel. Once you report your issue, a ticket is automatically assigned to one of our 14 developers, who monitor all tickets on a daily basis. Make sure to provide details of the problem; if you are not sure, please contact your account manager. After you report an issue you don’t need to do anything else; you’ll receive an email confirming that the issue was reported.
When “reporting an issue”, what information should I include in my report?
Please provide the following information when reporting an issue:
After reporting an issue, we’ll automatically open a ticket that will be promptly handled by our R&D Department.
What is a Data Collector?
In the past, we referred to all of our scraping tools as “Collectors.” A Collector is essentially a web scraper that consists of both interaction code and parser code. It can operate as an HTTP request or in a real browser, with all requests routed through our unlocker network to prevent blocking.
Over time, we developed a Dataset Unit that builds on top of one or more Collectors. For example, with a single Collector (direct request), you can scrape a specific URL—such as a product page from an e-commerce site—and receive the parsed data. In more complex scenarios, multiple Collectors can work together, such as when discovering and scraping categories, followed by collecting data on every product within those categories.
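As a rough sketch of that multi-Collector pattern (assuming stage helpers along the lines of the IDE’s next_stage(); the parsed field and URLs are illustrative, not the exact API):

```js
// Stage 1 (discovery): collect product links from a category page and
// queue each link for the next stage. next_stage() is assumed here.
navigate(input.url);
const { product_links } = parse(); // field defined in this stage's parser code
for (const url of product_links) next_stage({ url });

// Stage 2 (product pages) then runs once per queued URL:
//   navigate(input.url); collect(parse());
```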
How to create a Data Collector?
You have a few options to create and configure a Data Collector:
From here, you can build from scratch or explore available templates for guidance. Start here: Create a Data Collector
Any system limitations?
We have a limitation of 100 parallel-running jobs. When more than 100 jobs are triggered, the additional jobs are placed in a queue and wait until the earlier ones finish.
How do I use the AI code generator?
Overview
The goal of this feature is to generate a custom code template tailored to your target website. Simply enter the target URL, and we’ll automatically generate a ready-to-use code template that you can edit or run as needed.
How does it work?
Simply enter your target URL and click “Generate Code.” Once the code is ready, it will appear in the IDE tab, so there’s no need to wait around while the AI processes your request; you’ll receive an email notification as soon as the code is ready.
Note: This feature works on PDP (Product Detail Page) URLs; when you already know your target URL, it generates parser code accordingly. It is not ideally suited for “Discovery” use cases, where the goal is to find those target URLs.