Endpoint: POST /datasets/v3/trigger

Creates a request for data collection.

Request

dataset_id
string
required

Dataset ID for which data collection is triggered. You can see our available datasets here.

Example: dataset_id=gd_l1vikfnt1wgvvqz95w

type
string

To trigger a collection that includes a discovery phase, pass type=discover_new. Always send type=discover_new when discover_by is provided.

Example: type=discover_new

discover_by
string
Relevant only for discovery-type APIs, i.e. when type=discover_new.

Specifies which discovery method to use. The available methods depend on the specific API.

Example: discover_by=keyword

Available options:
keyword, best_sellers_url, category_url, location, and more (depending on the specific API)

limit_per_input
integer

Limits the number of results per input when a collection includes a discovery phase.

Example: limit_per_input=10 (when discovering by keyword, returns at most 10 results per keyword)
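
The discovery parameters above travel together, so it can help to build them in one place. The following is a minimal sketch, assuming a hypothetical helper named discovery_params (it is not part of the API) and the parameter name limit_per_input as it appears in the discovery sample request:

```python
def discovery_params(discover_by, limit_per_input=None):
    """Return query parameters for a collection with a discovery phase.

    discovery_params is a hypothetical helper, not part of the API.
    """
    if not discover_by:
        raise ValueError("discover_by must name a discovery method, e.g. 'keyword'")
    # type=discover_new is always sent together with discover_by.
    params = {"type": "discover_new", "discover_by": discover_by}
    if limit_per_input is not None:
        params["limit_per_input"] = int(limit_per_input)
    return params


print(discovery_params("keyword", limit_per_input=10))
```

The returned dictionary can be merged into the rest of the query string for the trigger call.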

include_errors
string

Includes an errors report in the output for easier troubleshooting.

Example: include_errors=true

notify
url

URL to which a notification is sent once the collection is finished. The notification contains snapshot_id and status.

Example: notify=https://notify-me.com/
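
On the receiving side, the two fields can be read from the notification body. This is a sketch only: it assumes the notification arrives as a JSON object, and the status value in the sample payload is made up for illustration. parse_notification is a hypothetical helper name.

```python
import json


def parse_notification(body: bytes):
    """Extract (snapshot_id, status) from a notification body.

    Assumes the body is a JSON object; the payload shape beyond these two
    fields is not specified in this reference.
    """
    payload = json.loads(body.decode("utf-8"))
    return payload["snapshot_id"], payload["status"]


# Hypothetical payload for illustration; "ready" is an assumed status value.
sample = b'{"snapshot_id": "s_lynh132v19n82v81kx", "status": "ready"}'
print(parse_notification(sample))
```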

auth_header
string

Authorization header to use when sending the notification to the notify URL or delivering data via the webhook endpoint.

Example: auth_header=QWxhZGRpbjpPcGVuU2VzYW1l

endpoint
url

Webhook URL where data will be delivered.

Example: endpoint=https://webhook-url.com

format
enum

Specifies the format of the data to be delivered to the webhook endpoint.

Supported formats: JSON, NDJSON, JSONL, CSV
Example: format=json
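
A webhook receiver usually needs to parse the payload according to the format it requested. The sketch below assumes JSON deliveries are a single array (or object), NDJSON/JSONL deliveries hold one JSON object per line, and CSV deliveries have a header row. parse_delivery is a hypothetical helper name.

```python
import csv
import io
import json


def parse_delivery(text, fmt):
    """Parse delivered data into a list of records based on the format name."""
    fmt = fmt.lower()
    if fmt == "json":
        data = json.loads(text)
        return data if isinstance(data, list) else [data]
    if fmt in ("ndjson", "jsonl"):
        # One JSON object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    if fmt == "csv":
        # The first row is treated as the header.
        return list(csv.DictReader(io.StringIO(text)))
    raise ValueError(f"unsupported format: {fmt}")


print(parse_delivery('{"a": 1}\n{"a": 2}\n', "ndjson"))
```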

uncompressed_webhook
boolean
default: "false"

By default, data is sent to the webhook compressed. Pass true to send it uncompressed.

Example: uncompressed_webhook=true
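
If the webhook receives compressed data, the body must be decompressed before parsing. The compression scheme is not stated in this reference; the sketch below assumes gzip and detects it by its magic number, so treat it as illustrative and check the Content-Encoding header on real deliveries. read_webhook_body is a hypothetical helper name.

```python
import gzip


def read_webhook_body(raw: bytes) -> bytes:
    """Return the payload bytes, decompressing them if they look like gzip.

    Assumption: compressed deliveries use gzip.
    """
    if raw[:2] == b"\x1f\x8b":  # gzip magic number
        return gzip.decompress(raw)
    return raw


compressed = gzip.compress(b'[{"a": 1}]')
print(read_webhook_body(compressed))
```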

Additional delivery methods: You can use the snapshot_id returned from this API call to trigger a delivery to a specific storage (Amazon S3, Microsoft Azure, etc.) via the delivery API, or use the download API to download it directly.
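
The query parameters described above can be assembled with standard URL encoding. This is a minimal sketch, assuming a hypothetical helper named build_trigger_url (not part of the API); the base URL is taken from the sample requests in this reference.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.brightdata.com/datasets/v3/trigger"


def build_trigger_url(dataset_id, **params):
    """Build the trigger URL; build_trigger_url is a hypothetical helper."""
    query = {"dataset_id": dataset_id}
    for key, value in params.items():
        if value is None:
            continue
        # Booleans are sent as lowercase true/false, as in the examples above.
        query[key] = str(value).lower() if isinstance(value, bool) else value
    # urlencode percent-encodes URL-valued parameters such as notify and endpoint.
    return f"{BASE_URL}?{urlencode(query)}"


url = build_trigger_url(
    "gd_l1vikfnt1wgvvqz95w",
    format="ndjson",
    endpoint="https://webhook-url.com",
    uncompressed_webhook=True,
)
print(url)
```

Percent-encoding the notify and endpoint values avoids ambiguity when those URLs contain their own query strings.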

Body

The inputs to be used by the scraper. Can be provided either as JSON or as a CSV file:

Content-Type
string

Content-Type: application/json

A JSON array of inputs

Example: [{"url":"https://www.airbnb.com/rooms/50122531"}]


Content-Type: multipart/form-data

A CSV file, in a field called data

Example (curl): data=@path/to/your/file.csv
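
For clients without built-in multipart support, the CSV upload can be encoded by hand. The sketch below builds a multipart/form-data body with the file in a field called data, using only the standard library. The helper name encode_csv_upload, the filename, and the single url column in the sample CSV are assumptions for illustration.

```python
import uuid


def encode_csv_upload(csv_bytes, filename="inputs.csv"):
    """Encode CSV bytes as multipart/form-data in a field called data.

    Returns (content_type, body). The filename is an arbitrary example value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="data"; filename="{filename}"\r\n'
        "Content-Type: text/csv\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", head + csv_bytes + tail


# Assumed CSV layout: a header row naming the input field, then one input per row.
csv_data = b"url\nhttps://www.airbnb.com/rooms/50122531\n"
content_type, body = encode_csv_upload(csv_data)
print(content_type)
```

The returned content_type goes in the Content-Type request header and body is sent as the request payload.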

To learn more about scraper inputs, click here

Web Scraper Types

Each scraper can require different inputs. There are 2 main types of scrapers:

1. PDP

These scrapers require URLs as inputs. A PDP scraper extracts detailed product information, such as specifications, pricing, and features, from web pages.

2. Discovery

Discovery scrapers allow you to explore and find new entities or products through search, categories, keywords, and more.

Request examples

PDP with URL input

Input format for PDP is always a URL, pointing to the page to be scraped.

Sample Request
curl -H "Authorization: Bearer API_TOKEN" -H "Content-Type: application/json" -d '[{"url":"https://www.airbnb.com/rooms/50122531"},{"url":"https://www.airbnb.com/rooms/50127677"}]' "https://api.brightdata.com/datasets/v3/trigger?dataset_id=gd_ld7ll037kqy322v05&format=json&uncompressed_webhook=true"
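
The same request can be expressed in Python. This is a sketch using only the standard library; it mirrors the curl sample, keeps API_TOKEN as a placeholder, and leaves the actual network call commented out.

```python
import json
import urllib.request

API_TOKEN = "API_TOKEN"  # placeholder, as in the curl sample

inputs = [
    {"url": "https://www.airbnb.com/rooms/50122531"},
    {"url": "https://www.airbnb.com/rooms/50127677"},
]

request = urllib.request.Request(
    "https://api.brightdata.com/datasets/v3/trigger"
    "?dataset_id=gd_ld7ll037kqy322v05&format=json&uncompressed_webhook=true",
    data=json.dumps(inputs).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
print(request.get_method(), request.full_url)
```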

Discovery input based on the discovery method

Sample Request
curl -H "Authorization: Bearer API_TOKEN" -H "Content-Type: application/json" -d '[{"keyword":"light bulb"},{"keyword":"dog toys"},{"keyword":"home decor"}]' "https://api.brightdata.com/datasets/v3/trigger?dataset_id=gd_l7q7dkf244hwjntr0&endpoint=https://webhook-url.com&auth_header=QWxhZGRpbjpPcGVuU2VzYW1l&notify=https://notify-me.com/&format=ndjson&uncompressed_webhook=true&type=discover_new&discover_by=keyword&limit_per_input=10"

Input format for discovery varies by scraper and matches the chosen discover_by method, for example a keyword, a category URL, or a location. Find out what inputs each scraper requires here.

Returns

Object containing snapshot_id, which identifies your request and can be used in subsequent API calls, such as the delivery and download APIs.

Sample Response
{"snapshot_id": "s_lynh132v19n82v81kx"}
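
Reading the snapshot_id from the response is a single JSON lookup. A short sketch, with extract_snapshot_id as a hypothetical helper name:

```python
import json


def extract_snapshot_id(response_text):
    """Return the snapshot_id from a trigger response body."""
    data = json.loads(response_text)
    if "snapshot_id" not in data:
        raise ValueError("response does not contain snapshot_id")
    return data["snapshot_id"]


snapshot_id = extract_snapshot_id('{"snapshot_id": "s_lynh132v19n82v81kx"}')
print(snapshot_id)
```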