Web Scraper IDE FAQs
Find answers to common questions about Bright Data’s Web Scraper IDE, including setup, troubleshooting, and best practices for developing custom data scrapers.
Bright Data Web Scrapers are automated tools that enable businesses to collect all types of public online data at mass scale, while heavily reducing in-house spending on proxy maintenance and development.
The Web Scraper delivers enormous amounts of raw data in a structured format and integrates with existing systems for immediate use in competitive, data-driven decisions.
Bright Data has developed hundreds of Web Scrapers customized to popular platforms.
Web Scraper IDE is an integrated development environment that puts public web data at your fingertips, at any scale. With the IDE, you can:
- Build your scraper in minutes
- Debug and diagnose with ease
- Bring to production quickly
- Browser scripting in simple JavaScript (see the sketch below)
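As a rough illustration of what that scripting looks like, here is a hedged sketch of IDE-style code. The navigate(), parse(), and collect() helpers are assumed to be provided by the IDE runtime (navigate() also appears in the billable events list further down), and the selectors and field names are placeholders for a hypothetical product page, not a working scraper for any real site:

```javascript
// Hedged sketch of IDE-style code; all selectors and field names are placeholders.

// Interaction code (one IDE section): drive the browser session.
navigate(input.url);   // load the page for this input; a billable page load
collect(parse());      // run the parser section and emit one output record

// Parser code (a separate IDE section): its body returns the parsed fields.
return {
  url: input.url,
  title: $('h1').text().trim(),         // placeholder selector
  price: $('.price').first().text(),    // placeholder selector
};
```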
When collecting data, your “input” is the set of parameters you’ll enter to run your collection. These can include keywords, URLs, search terms, product IDs, ASINs, profile names, check-in and check-out dates, etc.
The output is the data that you’ve collected from a platform based on your input parameters. You’ll receive your data as JSON, NDJSON, CSV, or XLSX.
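As a hypothetical illustration (the field names and values below are made up; the real shapes depend entirely on the scraper’s input and output schemas):

```javascript
// Hypothetical input for a single run - parameters depend on the scraper's input schema.
const input = [
  { url: 'https://targetwebsite.com/product_id/', keyword: 'wireless headphones' },
];

// One output record as it might appear in a JSON/NDJSON delivery (illustrative fields only).
const exampleRecord = {
  url: 'https://targetwebsite.com/product_id/',
  title: 'Example product',
  price: '19.99',
  currency: 'USD',
  timestamp: '2024-01-01T00:00:00Z',
};
```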
Each free trial includes 100 records (note: 100 records does not mean 100 page loads).
You’ll always receive a higher number of records than the number of inputs you’ve requested.
- Number of followers
- Average number of likes per post
- Level of engagement
- Account theme
- Social and demographic portrait of the audience
- Social listening: keywords and brand mentions, sentiment, viral trends
Yes, we can collect data from large numbers of websites at the same time.
Yes, you can. Ask your account manager for help, or open a ticket related to the specific Web Scraper by selecting ‘Report an issue’ and request that fields be added to or removed from your Web Scraper.
In cases where you don’t know a specific URL, you can search for a term and get data based on that term.
With a discovery scraper, you enter one or more URLs and collect all of the data from those pages. You’ll receive data without having to specify a particular product or keyword.
Yes. The code is written in JavaScript, and for self-managed scrapers you can change it according to your requirements.
There are three options for initiating requests:
- Initiate by API: regular request, queue request, and replace request.
- Initiate manually.
- Schedule mode.
There are two ways to use the data collection tool:
When you are sending more than one API request, a “queue request” means that you’d like your next request to start automatically after your first request is completed, and so on with all other requests.
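Assuming the same trigger endpoint shown in the cURL example further down this page, a queued run might be kicked off from JavaScript roughly as follows; the scraper ID, token variable, and input fields are placeholders:

```javascript
// Hedged sketch: trigger a run that queues behind the previous one (queue_next=1),
// using the same https://api.brightdata.com/dca/trigger endpoint as the cURL example below.
const API_TOKEN = process.env.BRIGHTDATA_API_TOKEN;  // placeholder
const SCRAPER_ID = 'ID_COLLECTOR';                   // placeholder scraper ID

async function triggerQueued(inputs) {
  const res = await fetch(
    `https://api.brightdata.com/dca/trigger?scraper=${SCRAPER_ID}&queue_next=1`,
    {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${API_TOKEN}`,
      },
      body: JSON.stringify(inputs),
    }
  );
  return res.json(); // response shape is illustrative; check the API reference
}

// Usage: this job starts automatically once the previous one finishes.
triggerQueued([{ url: 'https://targetwebsite.com/product_id/' }])
  .then(console.log)
  .catch(console.error);
```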
CPM = 1000 page loads
Billable events:
- navigate()
- request()
- load_more()
- (later) media file download
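To make the billing model concrete, here is a rough back-of-the-envelope estimate. The CPM rate below is a made-up placeholder, not an actual price; the only assumption taken from this page is that each billable event counts as one page load and CPM means 1,000 page loads:

```javascript
// Rough cost estimate under the "CPM = 1,000 page loads" definition above.
// The rate is a hypothetical placeholder, not an actual Bright Data price.
const CPM_RATE_USD = 3.0; // hypothetical cost per 1,000 page loads

// Suppose a job performs these billable events:
const billableEvents = {
  navigate: 120,  // page navigations
  request: 30,    // direct HTTP requests
  load_more: 50,  // "load more" interactions
};

const pageLoads = Object.values(billableEvents).reduce((a, b) => a + b, 0); // 200
const estimatedCost = (pageLoads / 1000) * CPM_RATE_USD;                    // 0.60
console.log(`${pageLoads} page loads ~ $${estimatedCost.toFixed(2)}`);
```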
You’ll receive an email that the developer is working on your new Web Scraper, and you will be notified when your scraper is ready.
The status of the request can also be found on your dashboard:
You can use this form to communicate any issues you have with the platform, the scraper, or the dataset results.
Tickets will be assigned to a different department depending on the selected issue type. Please make sure to choose the most relevant type.
Select a job ID: the dataset in which the issue occurred
Select the type of issue
This option is only available for managed scrapers. Tickets will be sent directly to your scraper engineer.
- Missing fields
- Missing records
- Missing values
- Parsing issues: The dataset results are incorrect
These tickets will be handled by our support agents.
- Incomplete delivery: Something went wrong during the delivery
- Scraper is slow: The scraper is collecting results slowly or is stuck
These tickets will be handled by your account manager.
- UI issue: The UI does not operate correctly
- Product question: General questions regarding the Web Scraper product
- Something else is going wrong
- (Parsing issues) Use the red “bug” icon to indicate where the incorrect results are
- (Parsing issues) Enter the results you expect to receive
- Write a description of what went wrong and the URL where the data is collected
- If needed, attach an image to support your report
When the input/output schema is updated, the scraper needs to be updated to match the new schema. If the scraper is still being worked on and has not been updated yet, you’ll see an ‘Incompatible input/output schema’ error.
If you want to trigger the scraper anyway, ignoring the schema change, click ‘Trigger anyway’ in the UI. Via the API, add the relevant parameter when triggering the scraper:
- Output schema incompatible: override_incompatible_schema=1
- Input schema incompatible: override_incompatible_input_schema=1
curl "https://api.brightdata.com/dca/trigger?scraper=ID_COLLECTOR&queue_next=1&override_incompatible_schema=1" -H "Content-Type: application/json" -H "Authorization: Bearer API_TOKEN" -d "[{\"url\":\"https://targetwebsite.com/product_id/\"}]"
We store the last 1,000 errors inside the virtual job record so you can see example inputs that were wrong (there’s a control panel button to view the errors in the IDE).
You’ll already know which inputs were wrong because you received an ‘error’ response for them. You can re-run these manually in the IDE to see what happened, much like providing a cURL request example when the unblocker isn’t behaving correctly.
Select “Report an issue” from the Bright Data Control Panel. Once you report your issue, a ticket will automatically be assigned to one of our 14 developers, who monitor all tickets on a daily basis. Make sure to provide details of what the problem is, and if you are not sure, please contact your account manager. Once you report an issue, you don’t need to do anything else; you’ll receive an email confirming that the issue was reported.
Please provide the following information when reporting an issue:
- Select the type of problem you’re facing (for example: getting the wrong results / missing data points / the results never loaded / delivery issue / UI issue / scraper is slow / IDE issue / other)
- Please describe in detail the problem that you are facing
- You may upload a file that describes the problem
After reporting an issue, we’ll automatically open a ticket that will be promptly handled by our R&D Department.
In the past, we referred to all of our scraping tools as “Collectors.” A Collector is essentially a web scraper that consists of both interaction code and parser code. It can operate as an HTTP request or in a real browser, with all requests routed through our unlocker network to prevent blocking.
Over time, we developed a Dataset Unit that builds on top of one or more Collectors. For example, with a single Collector (direct request), you can scrape a specific URL—such as a product page from an e-commerce site—and receive the parsed data. In more complex scenarios, multiple Collectors can work together, such as when discovering and scraping categories, followed by collecting data on every product within those categories.
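As a loose sketch of that multi-Collector flow, the discovery stage can hand each discovered URL to the collection stage. The helper names below (next_stage(), parse(), collect(), navigate()) and the product_links field are assumptions about the IDE runtime and a hypothetical parser, not a documented reference:

```javascript
// Hedged sketch of a two-stage Collector: discovery feeds collection.
// All helper names and fields are illustrative assumptions, not a documented API.

// Stage 1 - discovery: open a category page and queue every product URL found on it.
navigate(input.category_url);               // billable page load
for (const url of parse().product_links) {  // parser assumed to return a product_links array
  next_stage({ url });                      // assumed helper that feeds inputs to stage 2
}

// Stage 2 - collection: scrape each discovered product page and emit one record per input.
navigate(input.url);                        // billable page load
collect(parse());                           // parsed product fields become the output record
```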
You have a few options to create and configure a Data Collector:
- Using the Web Scraper IDE: You can design and structure your parser as individual Collectors or as a single Collector with multiple steps. To get started:
  - Click on the “Web Data Collection” icon on the right.
  - Navigate to the “My Scrapers” tab.
  - Select the “Develop a Web Scraper (IDE)” button.
From here, you can build from scratch or explore available templates for guidance. Start here: Create a Data Collector
- Requesting a Custom Dataset: If you prefer us to handle it, you can request a custom dataset, and we’ll create the Data Collectors needed to deliver it. To do this, click on the “Request Datasets” button under the “My Datasets” tab and choose the option that best suits your needs. Start here: Request a Custom Dataset
We have a limitation of 100 parallel-running jobs. When more than 100 jobs are triggered, the additional jobs are placed in a queue and wait until the earlier ones finish.