POST /datasets/filter
curl --request POST \
  --url 'https://api.brightdata.com/datasets/filter?dataset_id=gd_l1vikfnt1wgvvqz95w' \
  --header "Authorization: Bearer YOUR_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{"filter": {"name": "url", "operator": "=", "value": "https://www.instagram.com/natgeo/"}, "records_limit": 10}'
{
  "snapshot_id": "<string>"
}
The Filter endpoint of the Bright Data Marketplace Dataset API runs a large or file-based filter job against any of the 250+ Marketplace datasets. It returns a snapshot_id that you use to download the filtered records once the job completes.
To try the request from this page, paste your API key into the authorization field. If you do not have a key yet, create an account and generate one.

When should I use Filter?

Use Filter for bulk or file-driven jobs where asynchronous processing is acceptable:
  • Bulk exports of more than 1,000 records.
  • Filtering against large value lists from CSV or JSON files, such as excluding 100k+ company IDs.
  • Datasets not yet supported by Search.
  • Scheduled or background pipelines where async is fine.
For sub-second real-time lookups on supported datasets, use Search instead.

How does Filter work?

  • A call to the Filter endpoint starts an async job and creates a snapshot of the filtered data in your account.
  • The maximum job time is 5 minutes. Jobs that run longer are cancelled.
  • Charges apply per record in the snapshot, at the standard Marketplace rate of $2.5 CPM (per 1,000 records).
  • Filter works on all 250+ Marketplace datasets.
  • Filter groups support a maximum nesting depth of 3 levels.

How do I authenticate?

Filter uses Bearer token authentication. Pass your API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY
Get your key from account settings.

Limits

| Limit | Value | Description |
|---|---|---|
| Max rows per file | 10,000 | Each uploaded CSV/JSON file can contain up to 10,000 data rows. The header row is not counted. |
| Max files per request | No limit | Attach as many files as needed in one multipart request, as long as the total stays within the 200 MiB cap. |
| Max request size | 200 MiB | Total size of all uploaded files and form data combined. Requests over 200 MiB are rejected. |
| Job timeout | 5 minutes | If filtering does not complete within 5 minutes, the job is cancelled. |
| Filter nesting depth | 3 levels | Maximum depth for nested filter groups using and/or. |
| Max parallel jobs | 100 per dataset | Up to 100 Filter jobs can run at once per dataset. |
| Rate limit | 120 requests/hour | Maximum number of Filter API calls per hour. |

How do I call Filter?

Filter has two modes: JSON for plain filters and multipart for file uploads.

JSON mode (no file uploads)

Send all parameters (dataset_id, records_limit and filter) in the JSON body. Set Content-Type to application/json:
curl -X POST "https://api.brightdata.com/datasets/filter" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "dataset_id": "gd_l1viktl72bvl7bjuj0",
    "records_limit": 100,
    "filter": {
      "name": "name",
      "operator": "=",
      "value": "John"
    }
  }'
Filter returns a snapshot_id:
{ "snapshot_id": "s_abc123..." }
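The same JSON-mode call can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the helper names build_filter_request and start_filter_job are hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request

FILTER_URL = "https://api.brightdata.com/datasets/filter"


def build_filter_request(api_key, dataset_id, filter_obj, records_limit=None):
    """Build (but do not send) a JSON-mode Filter request."""
    payload = {"dataset_id": dataset_id, "filter": filter_obj}
    if records_limit is not None:
        payload["records_limit"] = records_limit
    return urllib.request.Request(
        FILTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def start_filter_job(request):
    """Send the request and return the snapshot_id from the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["snapshot_id"]


# Mirrors the curl example above; replace YOUR_API_KEY before sending.
req = build_filter_request(
    "YOUR_API_KEY",
    "gd_l1viktl72bvl7bjuj0",
    {"name": "name", "operator": "=", "value": "John"},
    records_limit=100,
)
```

Calling start_filter_job(req) with a valid key would perform the network call. Keeping the build step separate makes the payload easy to inspect or log before any charge is incurred.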

Multipart mode (file uploads)

Send dataset_id and records_limit as query parameters, and send filter and the uploaded files in the form-data body. Set Content-Type to multipart/form-data:
curl -X POST "https://api.brightdata.com/datasets/filter?dataset_id=gd_l1vijqt9jfj7olije&records_limit=100" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F 'filter={"operator":"and","filters":[{"name":"industries:value","operator":"includes","value":"industries.csv"}]}' \
  -F 'files[]=@/path/to/industries.csv'
To exclude 100k+ values, split them into files of up to 10,000 rows each and attach them all in a single request:
curl -X POST "https://api.brightdata.com/datasets/filter?dataset_id=gd_l1vijqt9jfj7olije&records_limit=5000" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F 'filter={"operator":"and","filters":[{"name":"company_id","operator":"not_in","value":"exclude1.csv"},{"name":"company_id","operator":"not_in","value":"exclude2.csv"},{"name":"company_id","operator":"not_in","value":"exclude3.csv"}]}' \
  -F 'files[]=@exclude1.csv' \
  -F 'files[]=@exclude2.csv' \
  -F 'files[]=@exclude3.csv'
For CSV and JSON file format rules, file references and upload troubleshooting, see Filter dataset with CSV/JSON files.
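Splitting a long exclusion list into 10,000-row files and building the matching filter can be automated. The sketch below is one way to do it in Python. The helper names are hypothetical, and the single-column CSV layout with a header row is an assumption; check the file format rules referenced above before relying on it.

```python
import csv
import tempfile
from pathlib import Path

MAX_ROWS_PER_FILE = 10_000  # per-file data-row limit from the Limits table


def write_value_files(values, column, out_dir, prefix="exclude"):
    """Split values into CSV files of at most MAX_ROWS_PER_FILE data rows."""
    out_dir = Path(out_dir)
    paths = []
    for start in range(0, len(values), MAX_ROWS_PER_FILE):
        path = out_dir / f"{prefix}{len(paths) + 1}.csv"
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            # Assumed layout: one column with a header row (not counted).
            writer.writerow([column])
            writer.writerows(
                [value] for value in values[start:start + MAX_ROWS_PER_FILE]
            )
        paths.append(path)
    return paths


def build_not_in_filter(field, paths):
    """Build one and-group with a not_in condition per uploaded file."""
    return {
        "operator": "and",
        "filters": [
            {"name": field, "operator": "not_in", "value": path.name}
            for path in paths
        ],
    }


with tempfile.TemporaryDirectory() as tmp:
    company_ids = [str(n) for n in range(25_000)]  # made-up IDs for illustration
    files = write_value_files(company_ids, "company_id", tmp)
    filter_obj = build_not_in_filter("company_id", files)
    row_counts = [
        len(p.read_text(encoding="utf-8").splitlines()) - 1 for p in files
    ]
```

Serializing filter_obj with json.dumps gives the string for the filter form field, and each path in files becomes one files[] part, as in the curl example above.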

What does Filter return?

Filter returns a snapshot_id. Use it to download the filtered records via the snapshot API once the job completes.

How much does Filter cost?

Filter costs $2.5 CPM (per 1,000 records returned), the same rate as the Marketplace. There is no charge when the filter returns 0 records.
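As a quick worked example of the rate above, the following sketch estimates the charge for a given snapshot size. It assumes strictly pro-rata, per-record billing with no rounding or minimum charge, which the source does not specify. The function name estimate_cost is hypothetical.

```python
from decimal import Decimal

RATE_PER_1000_RECORDS = Decimal("2.5")  # USD, the CPM rate quoted above


def estimate_cost(records):
    """Estimate the charge in USD for a snapshot with this many records."""
    if records < 0:
        raise ValueError("records must be non-negative")
    # Assumption: billing is linear per record, with no rounding or minimum.
    return RATE_PER_1000_RECORDS * records / 1000


cost_for_5000 = estimate_cost(5000)  # Decimal("12.5")
```

Because records_limit caps the snapshot size, estimate_cost(records_limit) gives an upper bound on the charge for one job under the same assumption.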

What errors can Filter return?

| Status | Meaning | What to do |
|---|---|---|
| 400 | Bad filter or params | Check field names with Get dataset metadata. |
| 401 | Bad or missing API key | Check your Bearer token. |
| 402 | Not enough funds | Top up your balance or reduce records_limit. |
| 404 | Unknown dataset_id | Confirm the dataset ID. |
| 422 | Filter matched 0 records | Loosen your filter or check field values. |
| 429 | Too many parallel jobs (max 100 per dataset) or rate limit hit (120 requests/hour) | Back off and retry. |
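One way to act on these statuses in client code is to retry only on 429 and stop with the suggested fix otherwise. The sketch below is illustrative rather than prescribed behavior. The function name plan_after_error is hypothetical. The 30-second base delay is an assumption derived from 120 requests/hour averaging one call every 30 seconds, and the doubling schedule and 600-second cap are arbitrary choices.

```python
ADVICE = {
    400: "Check field names with Get dataset metadata.",
    401: "Check your Bearer token.",
    402: "Top up your balance or reduce records_limit.",
    404: "Confirm the dataset ID.",
    422: "Loosen your filter or check field values.",
    429: "Back off and retry.",
}


def plan_after_error(status, attempt, max_attempts=5, base=30.0, cap=600.0):
    """Return ("retry", delay_seconds) or ("stop", advice) for an error status.

    attempt is the number of retries already made, starting at 0.
    """
    if status == 429 and attempt < max_attempts:
        # Exponential backoff: 30 s, 60 s, 120 s, ... capped at `cap` seconds.
        return ("retry", min(cap, base * 2 ** attempt))
    return ("stop", ADVICE.get(status, "Unexpected status; inspect the response body."))


first_retry = plan_after_error(429, attempt=0)  # ("retry", 30.0)
```

Only 429 is treated as transient here. The other statuses point at the request itself, so resending it unchanged would fail the same way.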

Filter syntax

The filter object, its operators, filter groups and nesting rules are shared with the Search endpoint and documented in one place. See the filter syntax reference for the full operator list, filter groups, up to three levels of nesting and CSV/JSON file references.
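A client can check the nesting limit locally before submitting a job. The sketch below counts group levels in a filter object. How the API counts depth is not stated here, so the counting rule is an assumption: a bare single-field filter is depth 0 and each enclosing group with a filters list adds one level. The function names are hypothetical.

```python
MAX_NESTING_DEPTH = 3  # limit quoted in this reference


def group_depth(node):
    """Count nested filter groups; a single-field filter counts as 0."""
    children = node.get("filters")
    if children is None:
        return 0
    return 1 + max((group_depth(child) for child in children), default=0)


def check_depth(filter_obj):
    """Return the depth, or raise ValueError if it exceeds the limit."""
    depth = group_depth(filter_obj)
    if depth > MAX_NESTING_DEPTH:
        raise ValueError(
            f"filter nests {depth} levels deep; the limit is {MAX_NESTING_DEPTH}"
        )
    return depth


leaf = {"name": "name", "operator": "=", "value": "John"}
nested = {
    "operator": "and",
    "filters": [{"operator": "or", "filters": [leaf, leaf]}, leaf],
}
depth = check_depth(nested)  # 2 under the counting assumption above
```

If the API counts levels differently, for example by treating the leaf as a level, only group_depth needs adjusting.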

Authorizations

Authorization · string · header · required

Use your Bright Data API Key as a Bearer token in the Authorization header.

How to authenticate:

  1. Obtain your API Key from the Bright Data account settings at https://brightdata.com/cp/setting/users
  2. Include the API Key in the Authorization header of your requests
  3. Format: Authorization: Bearer YOUR_API_KEY

Example:

Authorization: Bearer b5648e1096c6442f60a6c4bbbe73f8d2234d3d8324554bd6a7ec8f3f251f07df

Learn how to get your Bright Data API key: https://docs.brightdata.com/api-reference/authentication

Query Parameters

dataset_id · string

ID of the dataset to filter (required in multipart/form-data mode)

Example:

"gd_l1viktl72bvl7bjuj0"

records_limit · integer

Maximum number of records to include in the snapshot

Example:

1000

Body

dataset_id · string · required

ID of the dataset to filter

Example:

"gd_l1viktl72bvl7bjuj0"

filter · Single field filter · object · required

Example:

{
  "name": "name",
  "operator": "=",
  "value": "John"
}

records_limit · integer

Maximum number of records to include in the snapshot

Example:

1000

Response

The snapshot creation job started successfully

snapshot_id · string

ID of the snapshot