POST /datasets/v3/trigger

Body

The inputs to be used by the scraper, provided either as JSON or as a CSV file:

Content-Type
string

Content-Type: application/json

A JSON array of inputs

Example: [{"url":"https://www.airbnb.com/rooms/50122531"}]


Content-Type: multipart/form-data

A CSV file, in a field called data

Example (curl): data=@path/to/your/file.csv
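
The two body encodings above can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper names `inputs_as_json` and `inputs_as_csv` are hypothetical, and the CSV layout (a header row whose column names match the input keys) is an assumption that this page does not confirm.

```python
import csv
import io
import json


def inputs_as_json(inputs):
    """Serialize inputs as a JSON array (body for Content-Type: application/json)."""
    return json.dumps(inputs).encode("utf-8")


def inputs_as_csv(inputs):
    """Serialize the same inputs as CSV text, to be uploaded in the field called data.

    Assumption: one header row, with column names equal to the input keys.
    """
    fieldnames = sorted({key for row in inputs for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(inputs)
    return buffer.getvalue()


inputs = [{"url": "https://www.airbnb.com/rooms/50122531"}]
json_body = inputs_as_json(inputs)
csv_text = inputs_as_csv(inputs)
```

The CSV text would then be saved to a file and sent as the multipart field `data`, as in the curl example above.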

Web Scraper Types

Each scraper can require different inputs. There are 2 main types of scrapers:

1. PDP

These scrapers require URLs as inputs. A PDP scraper extracts detailed product information, such as specifications, pricing, and features, from web pages.

2. Discovery

Discovery scrapers let you explore and find new entities or products through search, categories, keywords, and more.

Request examples

PDP with URL input

Input format for PDP is always a URL, pointing to the page to be scraped.

Sample Request
curl -H "Authorization: Bearer API_TOKEN" -H "Content-Type: application/json" -d '[{"url":"https://www.airbnb.com/rooms/50122531"},{"url":"https://www.airbnb.com/rooms/50127677"}]' "https://api.brightdata.com/datasets/v3/trigger?dataset_id=gd_ld7ll037kqy322v05&format=json&uncompressed_webhook=true"
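
For readers who prefer Python to curl, the same PDP request might be assembled as below. This is a sketch using only the standard library; `build_trigger_request` is a hypothetical helper name, `API_TOKEN` is a placeholder, and the request is built but deliberately not sent.

```python
import json
import urllib.parse
import urllib.request

TRIGGER_URL = "https://api.brightdata.com/datasets/v3/trigger"


def build_trigger_request(api_token, dataset_id, inputs, **query):
    """Build (but do not send) a POST request for the trigger endpoint."""
    params = {"dataset_id": dataset_id, **query}
    url = TRIGGER_URL + "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(
        url,
        data=json.dumps(inputs).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_trigger_request(
    "API_TOKEN",  # placeholder, replace with a real token
    "gd_ld7ll037kqy322v05",
    [
        {"url": "https://www.airbnb.com/rooms/50122531"},
        {"url": "https://www.airbnb.com/rooms/50127677"},
    ],
    format="json",
    uncompressed_webhook="true",
)

# To actually send the request:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```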

Discovery input based on the discovery method

Sample Request
curl -H "Authorization: Bearer x2x3fdaaddrer" -H "Content-Type: application/json" -d '[{"keyword":"light bulb"},{"keyword":"dog toys"},{"keyword":"home decor"}]' "https://api.brightdata.com/datasets/v3/trigger?dataset_id=gd_l7q7dkf244hwjntr0&endpoint=https://webhook-url.com&auth_header=QWxhZGRpbjpPcGVuU2VzYW1l&notify=https://notify-me.com/&format=ndjson&uncompressed_webhook=true&type=discover_new&discover_by=keyword&limit_per_input=10"

Input format for discovery varies with the specific scraper and the chosen discovery method (for example, a keyword, as in the request above). Each scraper's documentation lists the inputs it requires.
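
The discovery-specific parts of the request above (the keyword inputs and the `type`, `discover_by`, and `limit_per_input` parameters) can be sketched as follows. The helper name `discovery_query` is hypothetical, and the other query parameters from the curl sample are omitted for brevity.

```python
import json
from urllib.parse import urlencode


def discovery_query(dataset_id, discover_by, limit_per_input=None):
    """Build the query string for a collection that includes a discovery phase."""
    params = {
        "dataset_id": dataset_id,
        "type": "discover_new",
        "discover_by": discover_by,
    }
    if limit_per_input is not None:
        params["limit_per_input"] = limit_per_input
    return urlencode(params)


keywords = ["light bulb", "dog toys", "home decor"]
# One input object per keyword, matching discover_by=keyword.
body = json.dumps([{"keyword": keyword} for keyword in keywords])
query = discovery_query("gd_l7q7dkf244hwjntr0", "keyword", limit_per_input=10)
```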

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Query Parameters

dataset_id
string
required

Dataset ID for which data collection is triggered.

type
enum<string>

Set it to "discover_new" to trigger a collection that includes a discovery phase.

Available options:
discover_new

discover_by
string

Specifies which discovery method to use. Available options: "keyword", "best_sellers_url", "category_url", "location" and more (according to the specific API). Relevant only for collections that include a discovery phase.

include_errors
boolean

Include an errors report with the results.

limit_per_input
number

Limit the number of results per input. Relevant only for collections that include a discovery phase.

Required range: x > 1

limit_multiple_results
number

Limit the total number of results.

Required range: x > 1

notify
string

URL where a notification will be sent once the collection is finished. The notification contains snapshot_id and status.

endpoint
string

Webhook URL where data will be delivered.

format
enum<string>

Specifies the format of the data to be delivered to the webhook endpoint.

Available options:
json, ndjson, jsonl, csv

auth_header
string

Authorization header to use when sending the notification to the notify URL or delivering data to the webhook endpoint.

uncompressed_webhook
boolean

By default, the data will be sent to the webhook compressed. Pass true to send it uncompressed.
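
A client might check the query parameters listed above before sending a request. The sketch below encodes only the constraints stated on this page (required `dataset_id`, the `type` and `format` options, and the `x > 1` ranges); `validate_query` is a hypothetical helper, and the server may apply further rules not covered here.

```python
ALLOWED_FORMATS = {"json", "ndjson", "jsonl", "csv"}


def validate_query(params):
    """Return a list of problems found in a dict of query parameters."""
    problems = []
    if not params.get("dataset_id"):
        problems.append("dataset_id is required")
    if "type" in params and params["type"] != "discover_new":
        problems.append("type must be discover_new")
    if "format" in params and params["format"] not in ALLOWED_FORMATS:
        problems.append("format must be one of json, ndjson, jsonl, csv")
    for name in ("limit_per_input", "limit_multiple_results"):
        # Range as documented on this page: x > 1.
        if name in params and not params[name] > 1:
            problems.append(f"{name} must be greater than 1")
    return problems


ok = validate_query({"dataset_id": "gd_l7q7dkf244hwjntr0", "format": "ndjson", "limit_per_input": 10})
bad = validate_query({"format": "xml", "limit_per_input": 1})
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once.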

Body

{key}
any

Response

200 - application/json
snapshot_id
string

ID of the triggered request, which can be used in subsequent API calls.
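
Reading the snapshot_id out of a 200 response could look like the sketch below. The helper name `extract_snapshot_id` and the sample value `s_example123` are hypothetical; only the `snapshot_id` field name comes from this page.

```python
import json


def extract_snapshot_id(response_body):
    """Parse the JSON response body and return its snapshot_id string."""
    payload = json.loads(response_body)
    snapshot_id = payload.get("snapshot_id")
    if not isinstance(snapshot_id, str) or not snapshot_id:
        raise ValueError("response does not contain a snapshot_id string")
    return snapshot_id


sample_body = '{"snapshot_id": "s_example123"}'  # hypothetical response body
snapshot_id = extract_snapshot_id(sample_body)
```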