Web Scraper API FAQs

Find answers to FAQs about Bright Data’s Web Scraper API, covering setup, authentication, data formats, pricing, and large-scale data extraction.

What is the Web Scraper API?

The Web Scraper API allows users to extract fresh data on demand from websites using pre-built scrapers. It can be used to automate data collection and integrate with other systems.

Who can benefit from using the Web Scraper API?

Data analysts, data scientists, engineers, developers, and anyone else seeking an efficient way to collect and analyze web data for AI, ML, big data applications, and more, without any scraping development effort, will find the Scraper APIs particularly beneficial.

How do I get started with the Web Scraper API?

Getting started with the Scraper APIs is straightforward: once you open your Bright Data account, generate an API key from your account settings. Once you have your key, refer to our API documentation for detailed instructions on making your first API call.

What is the difference between the scrapers?

Each scraper can require different inputs. There are two main types of scrapers:

  1. PDP
    These scrapers take URLs as inputs. A PDP scraper extracts detailed product information, such as specifications, pricing, and features, from product detail pages.
  2. Discovery / Discovery+PDP
    Discovery scrapers let you explore and find new entities or products through search, categories, keywords, and more.

Why are there different discovery APIs for the same domain?

Each discovery API lets you find the desired data using a different method: by keyword, by category URL, or even by location.

How do I authenticate with the Web Scraper API?

Authentication is done using an API key. Include the API key in the Authorization header of your requests as follows: Authorization: Bearer YOUR_API_KEY.
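
For illustration, a minimal call with the header in place might look like this (a sketch: the dataset ID and input URL are placeholders, and the JSON-array input body follows the pattern shown in the API Request Builder):

# The Bearer token in the Authorization header authenticates the request
curl -X POST \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '[{"url": "https://example.com/product/123"}]' \
  "https://api.brightdata.com/datasets/v3/trigger?dataset_id=YOUR_DATASET_ID"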

How do I customize my request and trigger it?

After picking the API you want to run, you can customize your request using our detailed API parameters documentation, which specifies the different parameter types and the expected inputs and responses.

How does the trial work?

You get 20 free API calls at the account level for experimenting with the product, usable with PDP-type scrapers with up to 10 inputs per call (Discovery-type scrapers are not included in the trial).

  • Calls 1-5 will return full results
  • Calls 6-15 will return partially censored results (e.g., AB*****YZ)

How do I test the API?

You can quickly test the product by customizing the code on the control panel (see the demo video):

  1. Pick your desired API from the variety of APIs.
  2. Enter your inputs.
  3. Enter your API key.
  4. Select your preferred delivery method:
    • Webhook: update the webhook URL, then copy the “trigger data collection” code and run it on your client.
    • Delivery API: fill out the needed credentials and information for the specific destination you chose (S3, GCP, Pub/Sub, and more), then copy the code and run it after collection ends.
  5. Copy the code and run it on your client.

All of the above can also be done with free tools such as Webhook.site and Postman.

We also offer additional management APIs, under the Management APIs tab, to get information about collection status and fetch a list of all your snapshots.
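
As a sketch, checking a run’s status might look like this (the progress endpoint path is an assumption; confirm the exact path under the Management APIs tab):

# Check collection status by snapshot ID (endpoint path assumed; see Management APIs)
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://api.brightdata.com/datasets/v3/progress/SNAPSHOT_ID"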

What data formats does the Web Scraper API support?

The Web Scraper API supports data extraction in various formats, including JSON, NDJSON, JSONL, and CSV. Specify your desired format in the request parameters.
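
For example, to download a finished snapshot as CSV (a sketch assuming the format query parameter on the snapshot delivery endpoint; JSON, NDJSON, and JSONL work the same way):

# Download a snapshot as CSV (swap format=json, ndjson, or jsonl as needed)
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://api.brightdata.com/datasets/v3/snapshot/SNAPSHOT_ID?format=csv"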

What are the rates for the Web Scraper API?

We charge based on the number of records delivered, so you only pay for what you get. Note that unsuccessful attempts resulting from incorrect user inputs are still billed: since the failure to retrieve data was due to the input rather than our system’s performance, resources were still consumed in processing the request. The rate per record depends on your subscription plan (starting from $0.70 per 1,000 records). Check our pricing plans or your account details for specific rates.
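
For example, at the starting rate of $0.70 per 1,000 records, a run that delivers 50,000 records would cost $35.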

What should I do if my API key expires?

For account admins: If your API key expires, you need to create a new one in your account settings.

For account users: If your API key expires, please contact your account admin to issue a new API key.

How do Scraper APIs manage large-scale data extraction tasks?

Featuring high concurrency and batch processing, Scraper APIs excel in large-scale data extraction scenarios, allowing developers to scale their scraping operations efficiently and accommodate massive volumes of requests with high throughput.

How can I upgrade my subscription plan?

To upgrade your subscription plan, visit the billing section of your account dashboard and select the desired plan. For further assistance, contact our support team.

What specific use cases are Scraper APIs optimized for?

The Web Scraper APIs support a vast range of use cases, including competitive benchmarking, market trend analysis, dynamic pricing algorithms, sentiment extraction, and feeding data into machine learning pipelines. Essential for e-commerce, fintech, and social media analytics, these APIs empower developers to implement data-driven strategies effectively.

How fast is the Web Scraper API?

We offer real-time support for scrapers that use URLs as inputs, with up to 20 URL inputs per call, and batch support for more than 20 inputs, regardless of the scraper type.

The Web Scraper API delivers real-time data for up to 20 inputs per call, with response times varying by domain, ensuring fresh data without relying on cached information.

Scrapers that discover new records (e.g., “Discover by keyword,” “Discover by hashtag”) generally take longer and use batch support, and their actual response times can be influenced by several factors, including the target URL’s load time and the execution duration of user-defined Page Interactions. An indication of the average response time for each scraper can be found on the specific scraper page.

How do I cancel an API call?

You can cancel a run using the following endpoint:

curl -X POST -H "Authorization: Bearer YOUR_API_KEY" -H "Content-Type: application/json" "https://api.brightdata.com/datasets/v3/snapshot/SNAPSHOT_ID/cancel"

Make sure the snapshot ID is the one you want to cancel.

Note: If you cancel the run, no data will be delivered to you, and a snapshot can’t be canceled after it has finished collecting.

What is the difference between the notify URL and webhook URL configurations?

The key difference between a notify URL and a webhook URL in API configurations lies in their purpose and usage:

Notify URL:

Typically used for asynchronous communication. The system sends a notification to the specified URL when a task is completed or when an event occurs. The notification is often lightweight and doesn’t include detailed data but may provide a reference or status for further action (e.g., “Job completed, check logs for details”).

Webhook URL:

Also used for asynchronous communication but is more data-centric. The system pushes detailed, real-time data payloads to the specified URL when a specific event occurs. Webhooks provide direct, actionable information without requiring the client to poll the system.

Example Use Case:

A notify URL might be used to inform you that a scraping job is finished. A webhook URL could send the actual scraped data or detailed metadata about the completion directly to you.
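
As a sketch, both URLs can be set when triggering a collection. The notify and endpoint parameter names below are assumptions based on the request-builder pattern; verify them for your scraper:

# Trigger a run that pings a notify URL on completion and pushes results to a webhook URL
# ("notify" and "endpoint" parameter names are assumptions; confirm in the API Request Builder)
curl -X POST \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '[{"url": "https://example.com/product/123"}]' \
  "https://api.brightdata.com/datasets/v3/trigger?dataset_id=YOUR_DATASET_ID&notify=https://your-server.com/notify&endpoint=https://your-server.com/webhook"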

For how long is a snapshot available after I trigger a collection?

The snapshot is available for 30 days. During this period, you can retrieve it via the delivery API options, using the snapshot ID.

Are there any limitations for specific scrapers or domains?

There are certain limitations on these platforms:

Media links expire after 24 hours:

  • Facebook: Posts (by profile URL), Comments, Reels
  • Instagram: Posts (by keyword), Posts (by profile URL), Comments, Reels
  • Pinterest: Profiles, Posts (by keyword), Posts (by profile URL)
  • Reddit: Posts (by keyword), Comments
  • TikTok: Profiles (by search URL), Comments, Posts (by keyword), Posts (by profile URL)
  • Quora: Posts
  • Vimeo: Posts (by keyword), Posts (by URL)
  • X (Twitter): Posts
  • YouTube: Profiles, Posts (by keyword), Posts (by URL), Posts (by search filters)

Media only accessible with a token generated in the same session:

  • TikTok

Posts limited to the number shown publicly on the profile (e.g., 10):

  • LinkedIn

What does it mean when a snapshot is marked as empty?

When a snapshot is marked as empty, it means there are no valid or usable records in the snapshot. However, this does not imply the snapshot is completely devoid of content. In most cases, it contains information such as errors or dead pages:

  • Errors: Issues encountered during the data collection process, such as invalid inputs, system errors, or access restrictions.
  • Dead Pages: Pages that could not be accessed for reasons like 404 errors (page not found), removed content (e.g., unavailable products), or restricted access.

To view these details, you can use the parameter include_errors=true in your request, which will display the errors and information about the dead pages in the snapshot. This helps you diagnose and understand the issues within the snapshot.
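
For example (a sketch assuming the snapshot delivery endpoint used elsewhere in these FAQs; substitute your own snapshot ID):

# Download a snapshot including error records and dead-page details
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://api.brightdata.com/datasets/v3/snapshot/SNAPSHOT_ID?format=json&include_errors=true"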

How do I start the data collection when I have entered the CSV data?

How do I stop a web scraper task?

You can stop a running collection using the Cancel Snapshot endpoint: https://docs.brightdata.com/scraping-automation/web-scraper-api/management-apis/cancel-snapshot

Which domains do you provide scrapers for?

ae.com
airbnb.com
amazon.com
apps.apple.com
ashleyfurniture.com
asos.com
balenciaga.com
bbc.com
berluti.com
bestbuy.com
booking.com
bottegaveneta.com
bsky.app
carsales.com.au
carters.com
celine.com
chanel.com
chileautos.cl
crateandbarrel.com
creativecommons.org
crunchbase.com
delvaux.com
digikey.com
dior.com
ebay.com
edition.cnn.com
en.wikipedia.org
enricheddata.com
espn.com
etsy.com
example.com
facebook.com
fanatics.com
fendi.com
finance.yahoo.com
g2.com
github.com
glassdoor.com
global.llbean.com
goodreads.com
google.com
hermes.com
homedepot.ca
homedepot.com
ikea.com
imdb.com
indeed.com
infocasas.com.uy
inmuebles24.com
instagram.com
la-z-boy.com
lazada.com.my
lazada.sg
lazada.vn
lego.com
linkedin.com
loewe.com
lowes.com
manta.com
martindale.com
massimodutti.com
mattressfirm.com
mediamarkt.de
metrocuadrado.com
montblanc.com
mouser.com
moynat.com
mybobs.com
myntra.com
news.google.com
nordstrom.com
olx.com
otodom.pl
owler.com
ozon.ru
pinterest.com
pitchbook.com
play.google.com
prada.com
properati.com.co
raymourflanigan.com
realestate.com.au
reddit.com
reuters.com
revenuebase.ai
sephora.fr
shop.mango.com
shopee.co.id
sleepnumber.com
slintel.com
target.com
tiktok.com
toctoc.com
tokopedia.com
toysrus.com
trustpilot.com
trustradius.com
unashamedcataddicts.quora.com
us.shein.com
ventureradar.com
vimeo.com
walmart.com
wayfair.com
webmotors.com.br
wildberries.ru
worldpopulationreview.com
worldpostalcode.com
www2.hm.com
x.com
xing.com
yapo.cl
yelp.com
youtube.com
ysl.com
zalando.de
zara.com
zarahome.com
zillow.com
zonaprop.com.ar
zoominfo.com
zoopla.co.uk

If your target domain is not on this list, we can develop a custom scraper specifically for you.

How can I use Bright Data to access hotel data through an API?

We don’t provide dedicated scrapers specifically for hotels, but we do offer a Booking.com scraper and the option to create a custom scraper tailored to your specific requirements.

How do I get the data I need?

Here’s a quick guide to help you get started and choose the right solution for your needs:

Option 1: Enriched, Pre-Collected Data – Explore Our Datasets Marketplace

If you’re looking for ready-to-use, high-quality data, our Datasets Marketplace is the perfect place to start. We’ve already done the heavy lifting by collecting and enriching vast amounts of data from a variety of sources. These datasets are designed to save you time and effort, so you can focus on analyzing the data and making smarter decisions.

Simply browse our marketplace, find the dataset that fits your needs, and start using it right away.

Option 2: Web Scrapers for Fresh and Real-Time Data

If your project requires fresh data or highly specific information that isn’t available in our Datasets Marketplace, we offer powerful tools to help you collect fresh and real-time data directly from the web. Here’s how you can get started:

Pre-Built Web Scrapers

We offer a wide range of pre-built web scrapers for popular websites, allowing you to collect data quickly and efficiently. These scrapers are ready to use and require minimal setup, making them a great choice for users who want to hit the ground running.

Custom Scrapers

Can’t find your target website in our list of pre-built scrapers? No problem! We can create a custom scraper tailored specifically to your needs. Our team of experts will work with you to design a solution that collects the exact data you’re looking for.

Build Your Own Scraper

For users with JavaScript knowledge or access to developer resources, we also offer the option to build your own scraper using our Integrated Development Environment (IDE). This gives you full control and flexibility to create a scraper that meets your unique requirements.

Have questions or need assistance? Our team of experts is always here to help. Let’s get started!

How do I scrape data from Google Maps?

  1. Find the “Google Maps reviews” scraper on the dashboard and choose whether to run it as an API request or initiate it using the “No code” option from the control panel.
  2. Enter the input parameters (the place page URL and the number of days to retrieve reviews from).
  3. Configure the needed request parameters if using the API (see the sketch below).
  4. Initiate the run and collect the data.
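
If you choose the API route, the trigger call might look like this (a sketch: the dataset ID is a placeholder and the input field names, such as days_limit, are illustrative; copy the exact ones from the scraper’s API Request Builder):

# Trigger the Google Maps reviews scraper (dataset ID placeholder; input field names illustrative)
curl -X POST \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '[{"url": "https://www.google.com/maps/place/...", "days_limit": 30}]' \
  "https://api.brightdata.com/datasets/v3/trigger?dataset_id=YOUR_DATASET_ID"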

How do I cancel a running snapshot?

To cancel a running snapshot, use one of the following methods:

  1. API Request:
    • Send a POST request to the endpoint:

      POST /datasets/v3/snapshot/{snapshot_id}/cancel (available in the API playground)

    • Replace {snapshot_id} with the ID of the snapshot you want to cancel.

  2. Control Panel:
    • Go to the Logs tab of the scraper.
    • Locate the running snapshot.
    • Hover over the specific run and click the “X” to cancel it.

Both methods will stop the snapshot process if it is currently running.

Does the ChatGPT scraper work with “SearchGPT” active?

Yes, the Bright Data GPT scraper always works with the “Search” function active.

Can I view the code behind the scraper?

Scrapers available in the Web Scrapers Library are pre-built solutions, and their underlying code is not accessible for modification or viewing.
For those interested in seeing how scrapers work, the Web Scraper IDE provides several example templates when you create a new scraper. These examples serve as practical references to help you understand scraping techniques and build your own custom solutions.

Can I get the results directly to my machine or software while using the Web Scraper API?

Yes. Using the Web Scraper API, you can have the scraped data returned directly to the request point
via the following endpoint: POST api.brightdata.com/datasets/v3/scrape
This endpoint allows you to fetch data efficiently and ensures seamless integration with your applications or workflows.

How does it work?
The API enables you to send a scraping request and receive the results directly at the request point. This eliminates the need for a separate data retrieval step or delivery to external storage, streamlining your data collection process (see the sketch after the limitations below).

Limitations

  • For long collection operations, the best practice is to use our /trigger endpoint. (If a collection request takes too long while using the /scrape endpoint, you will instead receive a snapshot ID, which you can use to download the data once it is ready.)
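
A minimal sketch of a synchronous call (the JSON-array input body mirrors the /trigger endpoint’s format; the dataset ID and URL are placeholders):

# Synchronous scrape: results come back in the response body instead of a snapshot
curl -X POST \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '[{"url": "https://example.com/product/123"}]' \
  "https://api.brightdata.com/datasets/v3/scrape?dataset_id=YOUR_DATASET_ID"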

What is a dataset ID and where can I find it?

A Dataset ID is a unique identifier used in Web Scraper API requests. It is included in the request URL to specify which particular Web Scraper you want to access, ensuring that your API call retrieves data from the correct scraper in our system. Here is how it is used: https://api.brightdata.com/datasets/v3/trigger?dataset_id=DATASET_ID_HERE

A dataset ID looks like gd_XXXXXXXXXXXXXXXXX, for example: gd_l1viktl72bvl7bjuj0

You can find the exact dataset ID on the Web Scraper API page for the scraper you are interested in, under the API Request Builder tab; it will already be inserted in the API request for you to copy.

Note: An ID that looks like s_XXXXXXXXXXXXXXXXXX, for example s_m7hm4et0141r2rhojq, is not a dataset ID. It is a snapshot ID; a snapshot is the collection of data produced by a single Web Scraper API request.
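
To see the two IDs in context (a sketch using the trigger and snapshot-delivery endpoints shown in these FAQs, with the example IDs from above):

# The dataset ID (gd_...) selects which scraper to run
curl -X POST -H "Authorization: Bearer YOUR_API_KEY" -H "Content-Type: application/json" \
  -d '[{"url": "https://example.com/item"}]' \
  "https://api.brightdata.com/datasets/v3/trigger?dataset_id=gd_l1viktl72bvl7bjuj0"

# The snapshot ID (s_...) returned by the trigger call identifies that run's output
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://api.brightdata.com/datasets/v3/snapshot/s_m7hm4et0141r2rhojq?format=json"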

What is 'Discovery only' mode?

In Discovery-only mode, the results obtained during the discovery phase are returned as the final output and do not proceed to the PDP (Product Detail Page) stage.


For example, if an Amazon product discovery scraper is initiated in Discovery-only mode, it will return only the product URLs found during the discovery phase. When this mode is turned off, the scraper will continue to visit and extract data from each individual product page identified during discovery.
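
As a sketch, a discovery run is typically triggered with extra query parameters; the type and discover_by names below follow the pattern shown in the API Request Builder, but treat them as assumptions and copy the exact ones for your scraper:

# Trigger a Discovery run by keyword (parameter names assumed; verify in the Request Builder)
curl -X POST \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '[{"keyword": "wireless headphones"}]' \
  "https://api.brightdata.com/datasets/v3/trigger?dataset_id=YOUR_DATASET_ID&type=discover_new&discover_by=keyword"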
