Web Scraper API FAQs
Find answers to FAQs about Bright Data’s Web Scraper API, covering setup, authentication, data formats, pricing, and large-scale data extraction.
The Web Scraper API allows users to extract fresh data on demand from websites using pre-built scrapers. It can be used to automate data collection and integrate with other systems.
Scraper APIs are particularly beneficial for data analysts, data scientists, engineers, and developers, and for anyone seeking an efficient way to collect and analyze web data for AI, ML, big data applications, and more, without any scraping development effort.
Getting started with Scraper APIs is straightforward: once you open your Bright Data account, generate an API token from your account settings. Once you have your token, refer to our API documentation for detailed instructions on making your first API call.
Each scraper can require different inputs. There are two main types of scrapers:
- PDP: These scrapers take URLs as inputs. A PDP scraper extracts detailed product information, such as specifications, pricing, and features, from product pages.
- Discovery / Discovery+PDP: Discovery scrapers let you explore and find new entities or products through search, categories, keywords, and more. Each discovery API finds the desired data using a different method: by keyword, by category URL, or even by location. See the input sketch below.
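To illustrate the input difference, here is a small sketch with illustrative payloads (the field names are assumptions and vary per scraper; check each scraper's parameter documentation):

```sh
# PDP scrapers take URL inputs (one object per record to collect).
PDP_INPUT='[{"url": "https://www.example.com/product/123"}]'

# Discovery scrapers take a search method instead, e.g. a keyword,
# category URL, or location (the field name here is illustrative).
DISCOVERY_INPUT='[{"keyword": "mechanical keyboard"}]'
```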
Authentication is done using an API token. Include the token in the `Authorization` header of your requests as follows: `Authorization: Bearer YOUR_API_TOKEN`.
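A minimal sketch (`ENDPOINT` is a placeholder for whichever API route you are calling):

```sh
# Every Web Scraper API request carries the token in the same header.
curl -H "Authorization: Bearer YOUR_API_TOKEN" \
     "https://api.brightdata.com/ENDPOINT"
```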
Once you have picked the API you want to run, you can customize your request using our detailed API parameters documentation, which specifies the supported parameter types and the expected inputs and responses.
You get 20 free API calls at the account level for experimenting with the product. The trial applies to PDP-type scrapers with up to 10 inputs per call (Discovery-type scrapers are not included in the trial):
- Calls 1-5 will return full results.
- Calls 6-15 will return partially censored results (e.g., AB*****YZ).
You can quickly test the product by customizing the code on the control panel (Demo video):
- Pick your desired API from the variety of APIs.
- Enter your inputs.
- Enter your API token.
- Select your preferred delivery method:
  - Using a webhook: update the webhook URL, copy the "trigger data collection" code, and run it on your client.
  - Using the delivery API: fill out the required credentials and information for the specific setting you chose (S3, GCP, Pub/Sub, and more), then copy the code and run it after collection ends.
- Copy the code and run it on your client.

All of the above can also be done with free tools such as Webhook.site and Postman; a trigger sketch for the webhook path is shown below.
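For instance, a minimal sketch of triggering a collection with webhook delivery. It assumes the v3 trigger endpoint with `dataset_id`, `format`, and `endpoint` query parameters; treat the exact names and the placeholder values (`gd_EXAMPLE_ID`, the input URL, the webhook URL) as assumptions to verify against the API documentation:

```sh
# Trigger a collection and push the results to a webhook (placeholders throughout).
# URL-encode the webhook address if it contains special characters.
curl -X POST \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '[{"url": "https://www.example.com/product/123"}]' \
  "https://api.brightdata.com/datasets/v3/trigger?dataset_id=gd_EXAMPLE_ID&format=json&endpoint=https://webhook.site/your-unique-id"
```

The response should include a snapshot ID, which the management APIs below use to track the run.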
We also offer additional management APIs for retrieving the collection status and fetching a list of all your snapshots; see the Management APIs tab.
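For example, a sketch using the `progress` and `snapshots` endpoints (the paths mirror the cancel endpoint shown later in this FAQ, but are assumptions to confirm under the Management APIs tab):

```sh
# Check the status of a specific run by its snapshot ID.
curl -H "Authorization: Bearer YOUR_API_TOKEN" \
     "https://api.brightdata.com/datasets/v3/progress/SNAPSHOT_ID"

# List all snapshots for one scraper (gd_EXAMPLE_ID is a placeholder).
curl -H "Authorization: Bearer YOUR_API_TOKEN" \
     "https://api.brightdata.com/datasets/v3/snapshots?dataset_id=gd_EXAMPLE_ID"
```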
The Web Scraper API supports data extraction in various formats, including JSON, NDJSON, JSONL, and CSV. Specify your desired format in the request parameters.
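For example, reusing the trigger sketch above with CSV output (the `format` parameter name is an assumption to verify in the API reference):

```sh
# Request CSV instead of JSON for the delivered records.
curl -X POST \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '[{"url": "https://www.example.com/product/123"}]' \
  "https://api.brightdata.com/datasets/v3/trigger?dataset_id=gd_EXAMPLE_ID&format=csv"
```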
We charge based on the number of records delivered, so you only pay for what you get. Note that unsuccessful attempts resulting from incorrect user inputs are still billed: since the failure to retrieve data was due to the input rather than our system's performance, resources were still consumed in processing the request. The rate per record depends on your subscription plan (starting from $0.70 per 1,000 records). Check our pricing plans or your account details for specific rates.
For account admins: If your API token expires, you need to create a new one in your account settings.
For account users: If your API token expires, please contact your account admin to issue a new token.
Featuring capabilities for high concurrency and batch processing, Scraper APIs excel in large-scale data extraction scenarios. This ensures developers can scale their scraping operations efficiently, accommodating massive volumes of requests with high throughput.
To upgrade your subscription plan, visit the billing section of your account dashboard and select the desired plan. For further assistance, contact our support team.
The Web Scraper APIs support a vast range of use cases, including competitive benchmarking, market trend analysis, dynamic pricing algorithms, sentiment extraction, and feeding data into machine learning pipelines. Essential for e-commerce, fintech, and social media analytics, these APIs empower developers to implement data-driven strategies effectively.
We offer real-time support for scrapers using URLs as inputs, with up to 20 URL inputs, and batch support for more than 20 inputs, regardless of the scraper type.
The Web Scraper API delivers real-time data for up to 20 inputs per call, with response times varying by domain, ensuring fresh data without relying on cached information.
Scrapers that discover new records (e.g., “Discover by keyword,” “Discover by hashtag”) generally take longer and use batch support, as the actual response times can be influenced by several factors, including the target URL’s load time and the execution duration of user-defined Page Interactions. An indication of the average response time for each scraper can be found on the specific Scraper page.
You can cancel a run using the following endpoint:
```sh
curl -X POST \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  "https://api.brightdata.com/datasets/v3/snapshot/SNAPSHOT_ID/cancel"
```

Make sure the snapshot ID is the one you want to cancel.
Note: if you cancel a run, no data will be delivered to you, and a snapshot cannot be canceled after it has finished collecting.
The key difference between a notify URL and a webhook URL in API configurations lies in their purpose and usage:
- Notify URL: typically used for asynchronous communication. The system sends a notification to the specified URL when a task is completed or when an event occurs. The notification is often lightweight and doesn't include detailed data, but may provide a reference or status for further action (e.g., "Job completed, check logs for details").
- Webhook URL: also used for asynchronous communication, but more data-centric. The system pushes detailed, real-time data payloads to the specified URL when a specific event occurs. Webhooks provide direct, actionable information without requiring the client to poll the system.

Example use case: a notify URL might be used to inform you that a scraping job is finished, while a webhook URL could send the actual scraped data or detailed metadata about the completion directly to you.
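Assuming the trigger call accepts separate `endpoint` (webhook) and `notify` query parameters, both hypothetical names to check against the API reference, the two can be combined in one request:

```sh
# endpoint= receives the scraped data payload (webhook URL);
# notify= receives only a lightweight completion notification (notify URL).
curl -X POST \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '[{"url": "https://www.example.com/product/123"}]' \
  "https://api.brightdata.com/datasets/v3/trigger?dataset_id=gd_EXAMPLE_ID&endpoint=https://your-server.example/webhook&notify=https://your-server.example/notify"
```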
The snapshot is available for 30 days; you can retrieve it during this period via the delivery API options, using the snapshot ID.
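A sketch of retrieving a snapshot by its ID within the retention window (the endpoint path follows the cancel endpoint above, but is an assumption to confirm in the delivery API documentation):

```sh
# Download a stored snapshot as JSON during its 30-day retention period.
curl -H "Authorization: Bearer YOUR_API_TOKEN" \
     "https://api.brightdata.com/datasets/v3/snapshot/SNAPSHOT_ID?format=json"
```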
There are certain limitations on these platforms:

| Record type | Limit |
| --- | --- |
| Posts (by profile URL) | up to 900 posts per input |
| Comments | up to 50 comments per input |
| Reels | up to 1,600 per input |
| Posts (by keyword) | up to 150,000 |
| Posts (by profile URL) | up to 43,000 |
| Comments | up to 9 per input |
| Reels | up to 9,000 |

Media links expire after 24 hours.

| Record type | Limit |
| --- | --- |
| Profiles | up to 1,000 records per input |
| Posts (by keyword) | up to 1,000 |
| Posts (by profile URL) | up to 5,000 |
| Posts (by keyword) | up to 4,000 per input |
| Comments | all first-level comments, with no limit |
| Profiles (by search URL) | up to 2,000 per input |
| Comments | up to 1,000 per input |
| Posts (by keyword) | up to 200 per input |
| Posts (by profile URL) | up to 5,000 per input |
| Posts | up to 1,000 per input |
| Posts (by keyword) | up to 4,000 per input |
| Posts (by URL) | up to 9,000 per input |
| Posts | up to 1,000 per input |
| Profiles | up to 500 per input |
| Posts (by keyword) | up to 600 per input |
| Posts (by URL) | up to 20,000 per input |
| Posts (by search filters) | up to 700 per input |

Media is only accessible with a token generated in the same session. Posts are limited to the amount shown publicly on the profile (e.g., 10).
When a snapshot is marked as empty, it means there are no valid or usable records in the snapshot. However, this does not imply the snapshot is completely devoid of content. In most cases, it contains information such as errors or dead pages:
- Errors: issues encountered during the data collection process, such as invalid inputs, system errors, or access restrictions.
- Dead pages: pages that could not be accessed, for reasons like 404 errors (page not found), removed content (e.g., unavailable products), or restricted access.

To view these details, use the parameter `include_errors=true` in your request, which will display the errors and information about the dead pages in the snapshot. This helps you diagnose and understand the issues within it.
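Combined with the snapshot retrieval sketch above, that looks like this (the `include_errors` parameter comes from this FAQ; the endpoint path remains an assumption):

```sh
# Fetch a snapshot including its error and dead-page records.
curl -H "Authorization: Bearer YOUR_API_TOKEN" \
     "https://api.brightdata.com/datasets/v3/snapshot/SNAPSHOT_ID?format=json&include_errors=true"
```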
You can stop a running collection using the cancel-snapshot endpoint: https://docs.brightdata.com/scraping-automation/web-scraper-api/management-apis/cancel-snapshot
ae.com
airbnb.com
amazon.com
apps.apple.com
ashleyfurniture.com
asos.com
balenciaga.com
bbc.com
berluti.com
bestbuy.com
booking.com
bottegaveneta.com
bsky.app
carsales.com.au
carters.com
celine.com
chanel.com
chileautos.cl
crateandbarrel.com
creativecommons.org
crunchbase.com
delvaux.com
digikey.com
dior.com
ebay.com
edition.cnn.com
en.wikipedia.org
enricheddata.com
espn.com
etsy.com
example.com
facebook.com
fanatics.com
fendi.com
finance.yahoo.com
g2.com
github.com
glassdoor.com
global.llbean.com
goodreads.com
google.com
hermes.com
homedepot.ca
homedepot.com
ikea.com
imdb.com
indeed.com
infocasas.com.uy
inmuebles24.com
instagram.com
la-z-boy.com
lazada.com.my
lazada.sg
lazada.vn
lego.com
linkedin.com
loewe.com
lowes.com
manta.com
martindale.com
massimodutti.com
mattressfirm.com
mediamarkt.de
metrocuadrado.com
montblanc.com
mouser.com
moynat.com
mybobs.com
myntra.com
news.google.com
nordstrom.com
olx.com
otodom.pl
owler.com
ozon.ru
pinterest.com
pitchbook.com
play.google.com
prada.com
properati.com.co
raymourflanigan.com
realestate.com.au
reddit.com
reuters.com
revenuebase.ai
sephora.fr
shop.mango.com
shopee.co.id
sleepnumber.com
slintel.com
target.com
tiktok.com
toctoc.com
tokopedia.com
toysrus.com
trustpilot.com
trustradius.com
unashamedcataddicts.quora.com
us.shein.com
ventureradar.com
vimeo.com
walmart.com
wayfair.com
webmotors.com.br
wildberries.ru
worldpopulationreview.com
worldpostalcode.com
www2.hm.com
x.com
xing.com
yapo.cl
yelp.com
youtube.com
ysl.com
zalando.de
zara.com
zarahome.com
zillow.com
zonaprop.com.ar
zoominfo.com
zoopla.co.uk
If your target domain is not on this list, we can develop a custom scraper specifically for you.
We don’t provide dedicated scrapers specifically for hotels, but we do offer a Booking.com scraper and the option to create a custom scraper tailored to your specific requirements.
Here’s a quick guide to help you get started and choose the right solution for your needs:
- Option 1: Enriched, Pre-Collected Data – Explore Our Datasets Marketplace
If you’re looking for ready-to-use, high-quality data, our Datasets Marketplace is the perfect place to start. We’ve already done the heavy lifting by collecting and enriching vast amounts of data from a variety of sources. These datasets are designed to save you time and effort, so you can focus on analyzing the data and making smarter decisions.
Simply browse our marketplace, find the dataset that fits your needs, and start using it right away.
- Option 2: Web Scrapers for Fresh and Real-Time Data
If your project requires fresh data or highly specific information that isn’t available in our Datasets Marketplace, we offer powerful tools to help you collect fresh and real-time data directly from the web. Here’s how you can get started:
Pre-Built Web Scrapers
We offer a wide range of pre-built web scrapers for popular websites, allowing you to collect data quickly and efficiently. These scrapers are ready to use and require minimal setup, making them a great choice for users who want to hit the ground running.
Custom Scrapers
Can’t find your target website in our list of pre-built scrapers? No problem! We can create a custom scraper tailored specifically to your needs. Our team of experts will work with you to design a solution that collects the exact data you’re looking for.
Build Your Own Scraper
For users with JavaScript knowledge or access to developer resources, we also offer the option to build your own scraper using our Integrated Development Environment (IDE). This gives you full control and flexibility to create a scraper that meets your unique requirements.
Have questions or need assistance? Our team of experts is always here to help. Let’s get started!