Filter Dataset (BETA)
Run asynchronous filter jobs on 250+ Bright Data Marketplace datasets. Each job returns a snapshot_id that you can download once the job completes. CSV/JSON uploads of up to 200 MiB are supported.
When should I use Filter?
Use Filter for bulk or file-driven jobs where asynchronous processing is acceptable:
- Bulk exports of more than 1,000 records.
- Filtering against large value lists from CSV or JSON files, such as excluding 100k+ company IDs.
- Datasets not yet supported by Search.
- Scheduled or background pipelines where async is fine.
How does Filter work?
- A call to the Filter endpoint starts an async job and creates a snapshot of the filtered data in your account.
- The maximum job time is 5 minutes. Jobs that run longer are cancelled.
- Charges apply per record in the snapshot, at the standard Marketplace rate of $2.5 CPM.
- Filter works on all 250+ Marketplace datasets.
- Filter groups support a maximum nesting depth of 3 levels.
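As a rough illustration of the per-record charge listed above, the sketch below estimates what a snapshot would cost at $2.5 per 1,000 records. The function name `estimate_filter_cost` is ours, not part of the API, and the result is only an estimate of the billed amount.

```python
from decimal import Decimal

# Rate stated on this page: $2.5 CPM, i.e. per 1,000 records in the snapshot.
RATE_PER_THOUSAND_USD = Decimal("2.5")


def estimate_filter_cost(record_count: int) -> Decimal:
    """Estimate the charge in USD for a snapshot holding `record_count` records."""
    if record_count < 0:
        raise ValueError("record_count must be non-negative")
    return RATE_PER_THOUSAND_USD * Decimal(record_count) / Decimal(1000)


if __name__ == "__main__":
    for count in (0, 1_000, 250_000):
        print(count, "records ->", estimate_filter_cost(count), "USD")
```

Setting records_limit caps the snapshot size, so the same function gives an upper bound on cost when called with the limit.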
How do I authenticate?
Filter uses Bearer token authentication. Pass your API key in the Authorization header.
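A minimal sketch of building that header in Python follows. The environment variable name `BRIGHTDATA_API_KEY` and the helper `build_auth_headers` are hypothetical choices for this example.

```python
import os


def build_auth_headers(api_key: str) -> dict:
    """Return the Authorization header for Bearer token authentication."""
    if not api_key:
        raise ValueError("API key is empty")
    return {"Authorization": f"Bearer {api_key}"}


# BRIGHTDATA_API_KEY is a hypothetical variable name; a placeholder is used if unset.
headers = build_auth_headers(os.environ.get("BRIGHTDATA_API_KEY") or "YOUR_API_KEY")
```

Reading the key from the environment keeps it out of source control.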
Limits
| Limit | Value | Description |
|---|---|---|
| Max rows per file | 10,000 | Each uploaded CSV/JSON file can contain up to 10,000 data rows. The header row is not counted. |
| Max files per request | No limit | Attach as many files as needed in one multipart request, as long as the total stays within the 200 MiB cap. |
| Max request size | 200 MiB | Total size of all uploaded files and form data combined. Requests over 200 MiB are rejected. |
| Job timeout | 5 minutes | If filtering does not complete within 5 minutes the job is cancelled. |
| Filter nesting depth | 3 levels | Maximum depth for nested filter groups using and/or. |
| Max parallel jobs | 100 per dataset | Up to 100 Filter jobs can run at once per dataset. |
| Rate limit | 120 requests/hour | Maximum number of Filter API calls per hour. |
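The file limits in the table can be checked locally before uploading. The sketch below assumes UTF-8 CSV files and treats the request size as the sum of file sizes plus a caller-supplied estimate for form data; multipart framing overhead is ignored, so it is an approximation. The helper names are ours.

```python
import csv
import io

MAX_ROWS_PER_FILE = 10_000           # data rows; the header row is not counted
MAX_REQUEST_BYTES = 200 * 1024 ** 2  # 200 MiB


def count_csv_data_rows(csv_bytes: bytes) -> int:
    """Count non-empty rows in a CSV payload, excluding the header row."""
    reader = csv.reader(io.StringIO(csv_bytes.decode("utf-8"), newline=""))
    rows = sum(1 for row in reader if row)  # blank lines are skipped
    return max(rows - 1, 0)


def check_upload_limits(files: dict, form_data_bytes: int = 0) -> list:
    """Return a list of limit violations; an empty list means none were found."""
    problems = []
    for name, payload in files.items():
        rows = count_csv_data_rows(payload)
        if rows > MAX_ROWS_PER_FILE:
            problems.append(f"{name}: {rows} data rows exceeds {MAX_ROWS_PER_FILE}")
    total = form_data_bytes + sum(len(p) for p in files.values())
    if total > MAX_REQUEST_BYTES:
        problems.append(f"request is about {total} bytes, over the 200 MiB cap")
    return problems
```

For example, `check_upload_limits({"ids.csv": open("ids.csv", "rb").read()})` returns an empty list when the file is within both limits.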
How do I call Filter?
Filter has two modes: JSON for plain filters and multipart for file uploads.
JSON mode (no file uploads)
Send all parameters (dataset_id, records_limit and filter) in the JSON body. Set Content-Type to application/json.
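A sketch of a JSON-mode request using only the Python standard library is shown here. The endpoint URL and the POST method are assumptions, since this section does not state them; confirm both against the API reference. The dataset ID and filter are the example values from this page.

```python
import json
import urllib.request

# Assumption: the endpoint URL is not given in this section. Replace as needed.
FILTER_URL = "https://api.brightdata.com/datasets/filter"


def build_json_filter_request(api_key, dataset_id, filter_obj, records_limit=None):
    """Build (but do not send) a JSON-mode Filter request."""
    body = {"dataset_id": dataset_id, "filter": filter_obj}
    if records_limit is not None:
        body["records_limit"] = records_limit
    return urllib.request.Request(
        FILTER_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",  # assumption: the job is started with POST
    )


req = build_json_filter_request(
    "YOUR_API_KEY",
    "gd_l1viktl72bvl7bjuj0",
    {"name": "name", "operator": "=", "value": "John"},
    records_limit=1000,
)
# To actually start the job: urllib.request.urlopen(req)
```

Building the request separately from sending it makes the payload easy to inspect or log before any charge can occur.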
The response contains the snapshot_id.
Multipart mode (file uploads)
Send dataset_id and records_limit as query parameters, and send filter and the uploaded files in the form-data body. Set Content-Type to multipart/form-data.
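The multipart layout can be sketched with the standard library as follows. Several details are assumptions: the endpoint URL, the form field name used for files (`files[]` here is a placeholder), and the per-file content type. How the filter object refers to an uploaded file is covered by the filter syntax reference and is not shown.

```python
import json
import uuid
from urllib.parse import urlencode

# Assumption: the endpoint URL is not given in this section. Replace as needed.
FILTER_URL = "https://api.brightdata.com/datasets/filter"


def build_multipart_filter_request(dataset_id, records_limit, filter_obj, files,
                                   file_field="files[]"):
    """Return (url, headers, body) for a multipart-mode Filter call.

    `files` maps filename -> bytes. `file_field` is a hypothetical field name.
    """
    # dataset_id and records_limit travel in the query string.
    query = urlencode({"dataset_id": dataset_id, "records_limit": records_limit})
    url = f"{FILTER_URL}?{query}"

    boundary = uuid.uuid4().hex
    parts = [
        # The filter travels as a form field holding JSON text.
        f'--{boundary}\r\nContent-Disposition: form-data; name="filter"\r\n\r\n'.encode()
        + json.dumps(filter_obj).encode("utf-8")
        + b"\r\n"
    ]
    for filename, payload in files.items():
        head = (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
            f"Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
        parts.append(head + payload + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())

    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return url, headers, b"".join(parts)
```

The caller still needs to add the Authorization header before sending. An HTTP client library that encodes multipart bodies for you would normally replace the manual framing shown here.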
What does Filter return?
Filter returns a snapshot_id. Use it to download the filtered records via the snapshot API once the job completes.
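A small sketch of pulling the snapshot_id out of the response body is shown below. It assumes the response is a JSON object with a top-level `snapshot_id` string; the sample ID is made up, and the download call itself is left to the snapshot API documentation.

```python
import json


def extract_snapshot_id(response_body: str) -> str:
    """Return the snapshot_id from a Filter response body, or raise ValueError."""
    payload = json.loads(response_body)
    snapshot_id = payload.get("snapshot_id") if isinstance(payload, dict) else None
    if not isinstance(snapshot_id, str) or not snapshot_id:
        raise ValueError(f"no snapshot_id in response: {payload!r}")
    return snapshot_id


sample = '{"snapshot_id": "snap_example123"}'  # made-up ID for illustration
snapshot_id = extract_snapshot_id(sample)
```

Failing loudly when the field is missing avoids polling the snapshot API with an empty ID.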
How much does Filter cost?
Filter costs $2.5 CPM (per 1,000 records returned), the same rate as the Marketplace. There is no charge when the filter returns 0 records.
What errors can Filter return?
| Status | Meaning | What to do |
|---|---|---|
| 400 | Bad filter or params | Check field names with Get dataset metadata. |
| 401 | Bad or missing API key | Check your Bearer token. |
| 402 | Not enough funds | Top up your balance or reduce records_limit. |
| 404 | Unknown dataset_id | Confirm the dataset ID. |
| 422 | Filter matched 0 records | Loosen your filter or check field values. |
| 429 | Too many parallel jobs (max 100 per dataset) or rate limit hit (120 requests/hour) | Back off and retry. |
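One way a client might act on these status codes is sketched below. Only 429 is treated as retryable, following the "Back off and retry" guidance. The 30-second base delay is our own choice, derived from 120 requests/hour averaging one request per 30 seconds; the cap and helper names are also illustrative.

```python
# Advice text mirrors the error table above.
ADVICE = {
    400: "Check field names with Get dataset metadata.",
    401: "Check your Bearer token.",
    402: "Top up your balance or reduce records_limit.",
    404: "Confirm the dataset ID.",
    422: "Loosen your filter or check field values.",
    429: "Back off and retry.",
}


def should_retry(status: int) -> bool:
    """Only rate-limit / parallel-job errors are worth retrying unchanged."""
    return status == 429


def backoff_delay(attempt: int, base: float = 30.0, cap: float = 600.0) -> float:
    """Exponential backoff in seconds for retry number `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


def advise(status: int) -> str:
    """Return the suggested action for a status code."""
    return ADVICE.get(status, "Unexpected status; inspect the response body.")
```

A caller would typically sleep for `backoff_delay(attempt)` seconds after a 429 and give up after a fixed number of attempts.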
Filter syntax
The filter object, its operators, filter groups and nesting rules are shared with the Search endpoint and documented in one place. See the filter syntax reference for the full operator list, filter groups (up to three levels of nesting) and CSV/JSON file references.
Related
- Dataset API overview
- Search dataset (sync)
- Filter syntax reference
- Filter dataset with CSV/JSON files
- Get dataset metadata
Authorizations
Use your Bright Data API Key as a Bearer token in the Authorization header.
How to authenticate:
- Obtain your API Key from the Bright Data account settings at https://brightdata.com/cp/setting/users
- Include the API Key in the Authorization header of your requests
- Format:
Authorization: Bearer YOUR_API_KEY
Example:
Authorization: Bearer b5648e1096c6442f60a6c4bbbe73f8d2234d3d8324554bd6a7ec8f3f251f07df
Learn how to get your Bright Data API key: https://docs.brightdata.com/api-reference/authentication
Query Parameters
dataset_id: ID of the dataset to filter (required in multipart/form-data mode)
Example: "gd_l1viktl72bvl7bjuj0"
records_limit: Limit on the number of records to include in the snapshot
Example: 1000
Body
dataset_id: ID of the dataset to filter
Example: "gd_l1viktl72bvl7bjuj0"
filter: one of the following
- Single field filter
- Filters group
- Single field filter without value
{
"name": "name",
"operator": "=",
"value": "John"
}
records_limit: Limit on the number of records to include in the snapshot
Example: 1000
Response
The snapshot creation job started successfully.
snapshot_id: ID of the snapshot