POST /datasets/filter
cURL
curl --request POST \
  --url https://api.brightdata.com/datasets/filter \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "dataset_id": "gd_l1viktl72bvl7bjuj0",
  "records_limit": 1000,
  "filter": {
    "name": "name",
    "operator": "=",
    "value": "John"
  }
}'
{
  "snapshot_id": "<string>"
}
Paste your API key into the authorization field. To get an API key, create an account and learn how to generate an API key.

General Description

  • A call to this endpoint starts an asynchronous job that filters the dataset and creates a snapshot of the filtered data in your account.
  • The job must finish within 5 minutes. A job that exceeds this limit is cancelled.
  • Creating the dataset snapshot is subject to charges based on the snapshot size and record price.
  • Filter groups can be nested to a maximum depth of 3.
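
Because the job is asynchronous and capped at 5 minutes, a client typically waits with a deadline rather than indefinitely. The following is a minimal sketch of such a wait loop, not part of the API itself. The `is_finished` callable is hypothetical: this page does not describe how snapshot status is checked, so the caller has to supply that check.

```python
import time
from typing import Callable

MAX_JOB_SECONDS = 5 * 60  # documented upper bound for a filter job


def wait_for_job(
    is_finished: Callable[[], bool],
    timeout: float = MAX_JOB_SECONDS,
    interval: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_finished() until it returns True or the timeout elapses.

    is_finished is a hypothetical, caller-supplied status check.
    Returns True if the job finished in time, False otherwise.
    """
    deadline = clock() + timeout
    while True:
        if is_finished():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The `clock` and `sleep` parameters are injectable only so the loop can be exercised without real waiting. The 5-second default interval is an arbitrary choice.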

Modes of Use

1. JSON Mode (No File Uploads)

Use this when you are not uploading any files.
  • All parameters (dataset_id, records_limit, and filter) are sent in the JSON request body.
  • Content-Type must be application/json.
  • No query parameters are used.
Example
curl --request POST \
  --url https://api.brightdata.com/datasets/filter \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "dataset_id": "gd_l1viktl72bvl7bjuj0",
    "records_limit": 100,
    "filter": {
      "name": "name",
      "operator": "=",
      "value": "John"
    }
  }'
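
For readers who prefer Python to cURL, the same JSON-mode request might be assembled as follows. This is a sketch using only the standard library. The function names `build_json_request` and `send` are illustrative, not part of any Bright Data SDK.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

FILTER_URL = "https://api.brightdata.com/datasets/filter"


def build_json_request(
    token: str,
    dataset_id: str,
    filter_spec: Dict[str, Any],
    records_limit: Optional[int] = None,
) -> urllib.request.Request:
    """Build a JSON-mode request: every parameter travels in the body."""
    payload: Dict[str, Any] = {"dataset_id": dataset_id, "filter": filter_spec}
    if records_limit is not None:
        payload["records_limit"] = records_limit
    return urllib.request.Request(
        FILTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> bytes:
    """Perform the HTTP call and return the raw response body."""
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Building the request separately from sending it keeps the payload easy to inspect before any chargeable job is started.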

2. Multipart/Form-Data Mode (File Uploads)

Use this when uploading CSV or JSON files containing filter values.
  • dataset_id and records_limit must be sent as query parameters in the URL.
  • The filter and any uploaded files are included in the form-data body.
  • Content-Type must be multipart/form-data.
Example
curl --request POST \
  --url "https://api.brightdata.com/datasets/filter?dataset_id=gd_l1vijqt9jfj7olije&records_limit=100" \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form 'filter={"operator":"and","filters":[{"name":"industries:value","operator":"includes","value":"industries.csv"}]}' \
  --form 'files[]=@/path/to/industries.csv'
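
The split between query parameters and form fields can be sketched in Python as well. The example below hand-assembles the multipart body with the standard library so that the layout is visible. `build_multipart_request` is an illustrative name, and the `application/octet-stream` part type is an assumption, since this page does not say which part content type the API expects.

```python
import json
import urllib.parse
import urllib.request
import uuid
from typing import Any, Dict, Mapping, Optional

FILTER_URL = "https://api.brightdata.com/datasets/filter"


def build_multipart_request(
    token: str,
    dataset_id: str,
    filter_spec: Dict[str, Any],
    files: Mapping[str, bytes],
    records_limit: Optional[int] = None,
    boundary: Optional[str] = None,
) -> urllib.request.Request:
    """Build a multipart request; files maps file name to file content."""
    # dataset_id and records_limit go in the URL, not in the body.
    query = {"dataset_id": dataset_id}
    if records_limit is not None:
        query["records_limit"] = str(records_limit)
    url = FILTER_URL + "?" + urllib.parse.urlencode(query)

    boundary = boundary or uuid.uuid4().hex
    parts = [
        f'--{boundary}\r\nContent-Disposition: form-data; name="filter"\r\n\r\n'.encode()
        + json.dumps(filter_spec).encode("utf-8")
        + b"\r\n"
    ]
    for filename, content in files.items():
        # Assumes file names contain no quotes or line breaks.
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="files[]"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"  # assumed part type
        )
        parts.append(header.encode() + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        url,
        data=b"".join(parts),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

As in the cURL example, the filter refers to the uploaded file by its name (`industries.csv`), so the keys of `files` should match the names used in the filter values.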

Filter Syntax

Operators

The following table shows operators that can be used in the field filter.
Operator | Field Types | Description
--- | --- | ---
= | Any | Equal to
!= | Any | Not equal to
< | Number, Date | Less than
<= | Number, Date | Less than or equal to
> | Number, Date | Greater than
>= | Number, Date | Greater than or equal to
in | Any | Tests if the field value is equal to any of the values provided in the filter’s value
not_in | Any | Tests if the field value is not equal to any of the values provided in the filter’s value
includes | Array, Text | Tests if the field value contains the filter value. If the filter value is a single string, it matches records where the field value contains that string. If the filter value is an array of strings, it matches records where the field value contains at least one string from the array.
not_includes | Array, Text | Tests if the field value does not contain the filter value. If the filter value is a single string, it matches records where the field value does not contain that string. If the filter value is an array of strings, it matches records where the field value does not contain any of the strings from the array.
array_includes | Array | Tests if the filter value is in the field value (exact match)
not_array_includes | Array | Tests if the filter value is not in the field value (exact match)
is_null | Any | Tests if the field value is equal to NULL. The operator does not accept a value.
is_not_null | Any | Tests if the field value is not equal to NULL. The operator does not accept a value.
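
A small client-side helper can catch misspelled operators, or a value passed to `is_null`, before a request is sent. The sketch below encodes only the operator names and the "does not accept a value" rule from the table. `field_filter` is a hypothetical helper, and the API's own validation may be stricter, for example about field types.

```python
from typing import Any, Dict

VALUE_OPERATORS = {
    "=", "!=", "<", "<=", ">", ">=",
    "in", "not_in",
    "includes", "not_includes",
    "array_includes", "not_array_includes",
}
NO_VALUE_OPERATORS = {"is_null", "is_not_null"}

_MISSING = object()  # sentinel, so that None can still be passed as a value


def field_filter(name: str, operator: str, value: Any = _MISSING) -> Dict[str, Any]:
    """Return a field filter dict, rejecting obviously malformed input."""
    if operator in NO_VALUE_OPERATORS:
        if value is not _MISSING:
            raise ValueError(f"operator {operator!r} does not accept a value")
        return {"name": name, "operator": operator}
    if operator not in VALUE_OPERATORS:
        raise ValueError(f"unknown operator {operator!r}")
    if value is _MISSING:
        raise ValueError(f"operator {operator!r} requires a value")
    return {"name": name, "operator": operator, "value": value}
```

For example, `field_filter("name", "=", "John")` produces the filter used in the request examples on this page.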

Combining Multiple Filters

Multiple field filters can be combined into a filter group using one of two logical operators: ‘and’ or ‘or’. The API supports filter groups nested to a maximum depth of 3. Example of a filter group:
{
    // operator can be one of ["and", "or"]
    "operator": "and",
    // an array of field filters
    "filters": [
        {
            "name": "reviews_count",
            "operator": ">",
            "value": "200"
        },
        {
            "name": "rating",
            "operator": ">",
            "value": "4.5"
        }
    ]
}
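
The nesting limit can be checked locally before a request is sent. The sketch below assumes that a single group containing only field filters counts as depth 1 and that each group nested inside another adds 1. This page does not define how depth is counted, so treat that as an assumption. `combine`, `group_depth`, and `check_depth` are hypothetical helper names.

```python
from typing import Any, Dict

MAX_DEPTH = 3  # documented maximum nesting depth


def combine(operator: str, *filters: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap field filters or groups in a filter group."""
    if operator not in ("and", "or"):
        raise ValueError("group operator must be 'and' or 'or'")
    return {"operator": operator, "filters": list(filters)}


def group_depth(node: Dict[str, Any]) -> int:
    """Depth of a filter tree: field filters count 0, each group adds 1."""
    if "filters" not in node:
        return 0
    return 1 + max((group_depth(child) for child in node["filters"]), default=0)


def check_depth(node: Dict[str, Any], max_depth: int = MAX_DEPTH) -> Dict[str, Any]:
    """Return node unchanged, or raise if it nests deeper than max_depth."""
    depth = group_depth(node)
    if depth > max_depth:
        raise ValueError(f"filter nesting depth {depth} exceeds {max_depth}")
    return node
```

Under this counting, the example group above has depth 1.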

Authorizations

Authorization
string
header
required

Use your Bright Data API Key as a Bearer token in the Authorization header.

Get API Key from: https://brightdata.com/cp/setting/users.

Example: Authorization: Bearer b5648e1096c6442f60a6c4bbbe73f8d2234d3d8324554bd6a7ec8f3f251f07df

Query Parameters

dataset_id
string

ID of the dataset to filter (required in multipart/form-data mode)

Example:

"gd_l1viktl72bvl7bjuj0"

records_limit
integer

Limit the number of records to be included in the snapshot

Example:

1000

Body

dataset_id
string
required

ID of the dataset to filter

Example:

"gd_l1viktl72bvl7bjuj0"

filter
object
required
Example:
{
  "name": "name",
  "operator": "=",
  "value": "John"
}
records_limit
integer

Limit the number of records to be included in the snapshot

Example:

1000

Response

The snapshot creation job started successfully.

snapshot_id
string

ID of the snapshot
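
Reading the response amounts to extracting a single string field. Below is a minimal sketch, assuming the response body is the JSON object shown in the example response. `parse_snapshot_id` is an illustrative name.

```python
import json
from typing import Union


def parse_snapshot_id(body: Union[str, bytes]) -> str:
    """Extract snapshot_id from a successful response body."""
    data = json.loads(body)
    snapshot_id = data.get("snapshot_id") if isinstance(data, dict) else None
    if not isinstance(snapshot_id, str) or not snapshot_id:
        raise ValueError("response does not contain a snapshot_id string")
    return snapshot_id
```

Failing loudly on a missing or non-string `snapshot_id` avoids carrying an unusable ID into later steps.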