Trigger collection
Bright Data Web Scraper API enables you to trigger a collection for automated web data extraction.
Related guide: Web Scraper API Introduction
Authorizations
Use your Bright Data API Key as a Bearer token in the Authorization header.
Get your API key from: https://brightdata.com/cp/setting/users
Example:
Authorization: Bearer b5648e1096c6442f60a6c4bbbe73f8d2234d3d8324554bd6a7ec8f3f251f07df
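A minimal Python sketch of building this header. The helper name `bearer_headers` and the placeholder key are illustrative and not part of the Bright Data API.

```python
def bearer_headers(api_key: str) -> dict[str, str]:
    """Return HTTP headers carrying the API key as a Bearer token."""
    key = api_key.strip()
    if not key:
        raise ValueError("API key must not be empty")
    return {"Authorization": f"Bearer {key}"}


# Placeholder value; substitute the key from your account settings.
headers = bearer_headers("YOUR_API_KEY")
print(headers["Authorization"])
```

Reading the key from an environment variable rather than hard-coding it keeps it out of source control.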
Query Parameters
Dataset ID for which data collection is triggered. Read more about Dataset ID.
"gd_l1vikfnt1wgvvqz95w"
The "custom_output_fields" parameter filters the response data to include only the specified fields. List the output columns you want, separated by a pipe (|).
For example, to include only the URL and the date it was last updated, set the parameter to "url|about.updated_on".
"url|about.updated_on"
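As a rough illustration, the pipe-separated value can be built from a list of field names and URL-encoded for the query string. The helper name `output_fields_value` is hypothetical; only the "custom_output_fields" name and the pipe separator come from the text above.

```python
from urllib.parse import urlencode


def output_fields_value(fields: list[str]) -> str:
    """Join output column names with the pipe separator."""
    return "|".join(fields)


value = output_fields_value(["url", "about.updated_on"])
# urlencode percent-encodes the pipe character as %7C.
query = urlencode({"custom_output_fields": value})
print(value)   # url|about.updated_on
print(query)   # custom_output_fields=url%7Cabout.updated_on
```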
Set it to "discover_new" to trigger a collection that includes a discovery phase.
The discovery phase finds new entities or products using methods such as search, categories, or keywords. Use it when the specific targets are not known in advance: the collection discovers new items from the inputs you provide rather than working from predefined data points.
discover_new
Specifies the method used for discovering new data during a collection. Some of the available options:
keyword: Uses keywords to discover new entities or products. Example: "smartphones" triggers a collection that discovers new smartphone products or entities.
best_sellers_url: Uses a URL that lists best-selling items to discover new products. Example: "https://example.com/best-sellers" is used to discover the products listed as best sellers on the site.
category_url: Uses a URL that lists categories to discover new entities within those categories. Example: "https://example.com/electronics" is used to discover new products within the electronics category.
location: Uses a location-based approach to discover entities relevant to that location. Example: "New York" triggers a collection that discovers data related to the specified location.
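A small sketch of assembling the discovery parameters for the query string. The parameter names `type` and `discover_by` are assumptions, since the names are not shown in the text above; confirm them against the full API reference. The helper `discovery_params` is likewise hypothetical.

```python
from urllib.parse import urlencode

# Methods named in the text above; it describes these as only some of the options,
# so this tuple is used for a hint, not for rejecting other values.
LISTED_METHODS = ("keyword", "best_sellers_url", "category_url", "location")


def discovery_params(method: str) -> dict[str, str]:
    """Build query parameters for a collection with a discovery phase.

    Assumption: the parameters are called "type" and "discover_by".
    """
    return {"type": "discover_new", "discover_by": method}


params = discovery_params("keyword")
print(urlencode(params))               # type=discover_new&discover_by=keyword
print("keyword" in LISTED_METHODS)     # True
```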
Include an error report with the results. When "include_errors" is set to true, you receive a detailed report of any errors that occur during data collection.
true
Limit the number of results per input
x >= 1
Limit the total number of results
x >= 1
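The `x >= 1` constraint on both limits can be checked on the client before sending the request. This is only a sketch: the parameter names `limit_per_input` and `limit_multiple_results` are assumptions (the names are not shown in the text above), and `limit_params` is a hypothetical helper.

```python
from typing import Optional


def limit_params(per_input: Optional[int] = None,
                 total: Optional[int] = None) -> dict[str, int]:
    """Return the limit parameters that were set, enforcing x >= 1."""
    params: dict[str, int] = {}
    candidates = (
        ("limit_per_input", per_input),      # assumed name: results per input
        ("limit_multiple_results", total),   # assumed name: total results
    )
    for name, value in candidates:
        if value is None:
            continue  # limit not requested
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be an integer >= 1, got {value!r}")
        params[name] = value
    return params


print(limit_params(per_input=5, total=100))
```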
Specifies whether notifications are sent when the data collection job completes. When set to true, notifications about the status or completion of the collection are sent to the specified webhook.
true
Specify the Webhook URL that should be called for the data collection process.
"https://example.com/webhook"
Specifies the format of the data to be delivered: json, ndjson, jsonl, or csv.
"json"
Authorization header for webhook delivery
By default, the data is sent compressed. Pass true to send it uncompressed.
true
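The webhook delivery options above could be gathered as follows. Only the four format values come from the text; the parameter names `endpoint`, `format`, and `uncompressed_webhook` are assumptions, and `delivery_params` is a hypothetical helper.

```python
# Format values listed in the text above.
ALLOWED_FORMATS = ("json", "ndjson", "jsonl", "csv")


def delivery_params(webhook_url: str,
                    data_format: str = "json",
                    uncompressed: bool = False) -> dict[str, str]:
    """Build webhook delivery parameters (parameter names are assumed)."""
    if data_format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}")
    params = {"endpoint": webhook_url, "format": data_format}
    if uncompressed:
        # Data is compressed by default; this opts out.
        params["uncompressed_webhook"] = "true"
    return params


print(delivery_params("https://example.com/webhook", "ndjson", uncompressed=True))
```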
Body
You can provide the input data in either JSON or CSV format. The input specifies the URLs or other parameters required by the scraper.
An array of objects containing URLs or other parameters required by the scraper. The exact fields needed depend on the specific dataset being used.
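A sketch of serializing the same inputs as either a JSON array of objects or CSV. The `url` field and the example URLs are placeholders, since the required fields depend on the dataset; the helper names are hypothetical.

```python
import csv
import io
import json

# Placeholder inputs; the exact fields depend on the dataset being used.
inputs = [
    {"url": "https://example.com/product/1"},
    {"url": "https://example.com/product/2"},
]


def to_json_body(rows: list[dict]) -> bytes:
    """Serialize the inputs as a JSON array of objects."""
    return json.dumps(rows).encode("utf-8")


def to_csv_body(rows: list[dict]) -> bytes:
    """Serialize the inputs as CSV, with a header row taken from the first input."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


print(to_json_body(inputs).decode("utf-8"))
print(to_csv_body(inputs).decode("utf-8"))
```

Whichever format is sent, the request would normally declare it with a matching Content-Type header (for example application/json or text/csv); whether the API requires this is not stated here.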
Response
Collection job successfully started
The response is of type object.
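Because the response is a JSON object, a client can decode it and check its type before reading any fields. The sketch below uses a made-up payload; the field `example_field` is a placeholder and not a documented response field.

```python
import json


def parse_trigger_response(raw: bytes) -> dict:
    """Decode the response body and confirm it is a JSON object."""
    payload = json.loads(raw.decode("utf-8"))
    if not isinstance(payload, dict):
        raise TypeError(f"expected a JSON object, got {type(payload).__name__}")
    return payload


# Hypothetical response body for illustration only.
sample = b'{"example_field": "example_value"}'
result = parse_trigger_response(sample)
print(result)
```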