Bright Data Scraper Studio API quickstart

This guide walks you through sending your first request to the Bright Data Scraper Studio API. By the end, you will trigger a published collector from your own code and receive structured JSON back. The Bright Data Scraper Studio API is built around two HTTP calls:

POST /dca/trigger, which queues one or more inputs and returns a snapshot ID.
GET /dca/dataset?id=<snapshot_id>, which serves the snapshot once it is ready.

If you do not yet have a published collector, build one first with the AI Agent or the IDE.

A run with one to ten inputs typically finishes in 30 to 90 seconds.

Prerequisites

An active Bright Data account with a payment method on file
An API token from Account Settings → API Tokens
A Collector ID from Scraper Studio; the ID starts with c_

Set both values as environment variables once and reuse them across every snippet below:

export BRIGHT_DATA_API_TOKEN="your_api_token_here"
export BRIGHT_DATA_COLLECTOR_ID="c_xxxxxxxxxxxxxxxx"

Make your first request

Authenticate every call

Every Bright Data Scraper Studio API call uses bearer-token authentication. Add this header to every request:

Authorization: Bearer YOUR_BRIGHT_DATA_API_TOKEN

A missing or invalid token returns 401 Unauthorized.
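As a minimal Python sketch of the header above, assuming the token lives in the BRIGHT_DATA_API_TOKEN environment variable set in Prerequisites. The helper name auth_headers is hypothetical and not part of any Bright Data SDK.

```python
import os


def auth_headers(token=None):
    """Return the bearer-token header for a Scraper Studio API call.

    Falls back to the BRIGHT_DATA_API_TOKEN environment variable when no
    token is passed. `auth_headers` is a hypothetical helper name.
    """
    token = token or os.environ.get("BRIGHT_DATA_API_TOKEN", "")
    if not token:
        # Fail locally instead of waiting for a 401 from the API.
        raise ValueError("BRIGHT_DATA_API_TOKEN is not set")
    return {"Authorization": f"Bearer {token}"}
```

Checking for an empty token before sending the request surfaces a configuration mistake earlier than a 401 response would.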

Trigger your collector

Send the inputs you want the collector to process as a JSON array in the request body. Each object in the array must match the input schema you defined when you built the collector. The default schema is a single url field.

curl -X POST \
  "https://api.brightdata.com/dca/trigger?collector=$BRIGHT_DATA_COLLECTOR_ID&queue_next=1" \
  -H "Authorization: Bearer $BRIGHT_DATA_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '[
    {"url": "https://ecommerce-shop-brd.vercel.app/product/echo-portable-speaker"},
    {"url": "https://ecommerce-shop-brd.vercel.app/product/nimbus-cloud-storage"}
  ]'

The Bright Data Scraper Studio API responds with a single snapshot ID:

{
  "collection_id": "j_abc123def456"
}

Keep this ID. You will pass it as the snapshot ID (the id query parameter) when you poll /dca/dataset in the next step.

queue_next=1 runs your inputs immediately. Omit it (or set 0) to enqueue them behind any in-flight work for the same collector.
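The same trigger call can be sketched in Python with only the standard library. This is an illustration under assumptions, not an official client: the function names build_trigger_request and trigger are hypothetical, and the 30-second timeout is an arbitrary choice.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.brightdata.com"


def build_trigger_request(collector_id, token, inputs, queue_next=True):
    """Build (but do not send) the POST /dca/trigger request."""
    params = {"collector": collector_id}
    if queue_next:
        # Mirrors queue_next=1 in the curl example; omitted otherwise.
        params["queue_next"] = "1"
    url = f"{API_BASE}/dca/trigger?{urllib.parse.urlencode(params)}"
    return urllib.request.Request(
        url,
        data=json.dumps(inputs).encode("utf-8"),  # JSON array of input objects
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def trigger(collector_id, token, inputs):
    """Send the request and return the collection_id from the response."""
    request = build_trigger_request(collector_id, token, inputs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["collection_id"]
```

Separating the request builder from the network call keeps the URL, headers and body easy to inspect before anything is sent. A call would look like trigger(collector_id, token, [{"url": "https://ecommerce-shop-brd.vercel.app/product/echo-portable-speaker"}]).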

Poll until ready, then download

The same /dca/dataset endpoint serves both the in-progress and ready responses. Poll it every five seconds until the response is a JSON array.

# SNAPSHOT_ID must hold the collection_id returned by /dca/trigger.
: "${SNAPSHOT_ID:?set SNAPSHOT_ID to the collection_id from the trigger response}"

# Poll every 5 seconds until the response is a JSON array (not an object).
while :; do
  response=$(curl -s \
    "https://api.brightdata.com/dca/dataset?id=$SNAPSHOT_ID" \
    -H "Authorization: Bearer $BRIGHT_DATA_API_TOKEN")
  if [[ "${response:0:1}" == "[" ]]; then
    echo "$response" > results.json
    break
  fi
  sleep 5
done

While the snapshot is building, the response is a status object:

{ "status": "building" }

When the snapshot is ready, the response is a JSON array. By default it contains one row per successful input:

[
  {
    "url": "https://ecommerce-shop-brd.vercel.app/product/echo-portable-speaker",
    "title": "Echo Portable Speaker",
    "price": 49.99,
    "availability": "in stock",
    "input": { "url": "https://ecommerce-shop-brd.vercel.app/product/echo-portable-speaker" }
  },
  {
    "url": "https://ecommerce-shop-brd.vercel.app/product/nimbus-cloud-storage",
    "title": "Nimbus Cloud Storage",
    "price": 12.99,
    "availability": "in stock",
    "input": { "url": "https://ecommerce-shop-brd.vercel.app/product/nimbus-cloud-storage" }
  }
]

The exact field set depends on the output schema you defined when you built the collector.
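The polling logic can also be sketched in Python. The sketch assumes, as described above, that a JSON object means "not ready" and a JSON array means "ready". The names fetch_dataset and wait_for_rows are hypothetical, and the max_attempts cap is an addition here so the loop cannot run forever.

```python
import json
import time
import urllib.request

API_BASE = "https://api.brightdata.com"


def fetch_dataset(snapshot_id, token):
    """GET /dca/dataset once and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{API_BASE}/dca/dataset?id={snapshot_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_rows(fetch, interval=5.0, max_attempts=120, sleep=time.sleep):
    """Call fetch() until it returns a list, then return that list.

    A dict such as {"status": "building"} means the snapshot is not ready.
    max_attempts is an assumption added so the loop always terminates.
    """
    for attempt in range(max_attempts):
        body = fetch()
        if isinstance(body, list):
            return body
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(f"snapshot not ready after {max_attempts} attempts")
```

A typical call would be rows = wait_for_rows(lambda: fetch_dataset(snapshot_id, token)). Passing fetch and sleep as parameters lets the loop be exercised without network access or real delays.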

How long does this take?

The first record usually arrives within a minute, but total time depends on the collector and the target site. Measured against a typical e-commerce product page collector:

Input count	Typical wall-clock time
1 to 10 URLs	30 to 90 seconds
11 to 100 URLs	2 to 5 minutes
More than 100 URLs	More than 5 minutes. Use push delivery instead of polling.

For long-running jobs, swap polling for a push delivery destination (webhook, S3, GCS, Azure, SFTP or email) so Bright Data calls you when the snapshot is ready.

What do the IDs mean?

Three terms recur in Bright Data Scraper Studio. They are easy to confuse because the trigger response uses one name (collection_id) for a value that other endpoints read under a different name (snapshot_id).

Term	Looks like	What it identifies
Collector ID	`c_xxxxxxxxxxxxxxxx`	The published scraper definition. Stable. You pass it as the `collector` query parameter on `/dca/trigger`.
Collection ID (returned as `collection_id`)	`j_xxxxxxxxxxxxxxxx`	One run of the collector. The trigger response field is `collection_id`, but every other endpoint refers to the same value as `snapshot_id`. They are the same string.
Dataset	a JSON array	The result rows produced by one run. The `/dca/dataset` endpoint returns this when the run is finished.

Treat collection_id from the trigger response as your snapshot_id everywhere else. They are the same value under two names.
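The renaming can be made explicit in code. This short Python sketch takes a decoded trigger response and builds the dataset URL from it; dataset_url is a hypothetical helper name.

```python
import urllib.parse

API_BASE = "https://api.brightdata.com"


def dataset_url(trigger_response):
    """Build the /dca/dataset URL from a decoded /dca/trigger response."""
    # The trigger response calls the run ID `collection_id` ...
    snapshot_id = trigger_response["collection_id"]
    # ... and /dca/dataset reads the same string from its `id` parameter.
    query = urllib.parse.urlencode({"id": snapshot_id})
    return f"{API_BASE}/dca/dataset?{query}"
```

For the sample response shown earlier, dataset_url({"collection_id": "j_abc123def456"}) yields https://api.brightdata.com/dca/dataset?id=j_abc123def456.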

What errors might I see?

Status	Meaning	Fix
`401 Unauthorized`	Token missing, malformed or revoked	Re-copy from Account Settings → API Tokens
`404 Not Found`	Collector ID does not exist or your account does not have access	Open the collector in Scraper Studio and re-copy the ID
`422 Unprocessable Entity`	The objects in your request body do not match the collector’s input schema	Confirm field names against the Inputs tab of your collector
`5xx`	Transient Bright Data API error	Retry with exponential backoff, for example 1s, 2s, 4s
`[]` (empty array)	Snapshot has no rows, or the snapshot expired	Snapshots are retained for 90 days by default. See Specifications
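The retry advice in the table can be sketched as a small Python wrapper. It assumes requests are made with urllib, which raises urllib.error.HTTPError for 4xx and 5xx responses. The name with_backoff and the default delays of 1, 2 and 4 seconds follow the example in the table and are not prescribed by the API.

```python
import time
import urllib.error


def with_backoff(call, delays=(1, 2, 4), sleep=time.sleep):
    """Run call(); on an HTTP 5xx error, wait and retry using `delays`.

    4xx errors (401, 404, 422) are re-raised at once because retrying
    cannot fix a bad token, collector ID or request body.
    """
    for delay in delays:
        try:
            return call()
        except urllib.error.HTTPError as error:
            if error.code < 500:
                raise
            sleep(delay)  # transient server error: back off, then retry
    return call()  # final attempt; any error propagates to the caller
```

With the default delays, a call is attempted at most four times before the last error reaches the caller.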

Use a production-grade starter template

These open-source repositories implement exactly the calls above, hardened with environment-variable configuration, retry with backoff for transient failures, library helpers and a complete README. Fork either one and you have a runnable client in 30 seconds.

Node.js starter

Node 18+, ES modules, dotenv, retry/backoff, ~150 LOC

Python starter

Python 3.8+, requests, python-dotenv, retry/backoff, ~150 LOC

Both repositories ship with a CodeSandbox devcontainer so you can fork and run in your browser without any local setup.

Next steps

Choose a delivery type

Skip polling. Have Bright Data push results to a webhook, S3, GCS or email when the snapshot is ready.

Trigger a batch collection

Send hundreds or thousands of inputs in a single request and receive results in batches.

Run a synchronous real-time job

For low-input, latency-sensitive workloads. Trigger and receive results in a single HTTP call.

Build a new collector

Need a collector that does not exist yet? Build one with the AI Agent or the IDE.

Frequently asked questions

What is the difference between the Collection API and the AI Flow API?

The Collection API (/dca/*, this page) runs an existing collector to get data. The AI Flow API runs the AI Agent to create or self-heal a collector. Most developers integrating Bright Data Scraper Studio into a product use the Collection API.

Can I send different input shapes in the same request?

Yes, as long as every object in the array conforms to the collector’s input schema. If your collector accepts both url and keyword as input fields, you can mix them in one request. Fields you do not include are treated as null.
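A quick local check can catch schema mismatches before the API returns 422. The sketch below assumes the hypothetical two-field schema from the answer above (url and keyword); replace ALLOWED_FIELDS with the field names shown in the Inputs tab of your collector. unknown_fields is an illustrative helper name.

```python
# Hypothetical input schema taken from the example in the answer above.
ALLOWED_FIELDS = {"url", "keyword"}


def unknown_fields(inputs, allowed=ALLOWED_FIELDS):
    """Return (index, field) pairs for keys the input schema does not define."""
    return [
        (index, field)
        for index, item in enumerate(inputs)
        for field in item
        if field not in allowed
    ]
```

An empty result means every object uses only known fields. Objects that omit a field are still accepted, matching the null behavior described above.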

How do I retry failed inputs without re-running the successful ones?

Open the snapshot in My Scrapers and click Last errors to see which inputs failed and why. Re-trigger just those inputs in a new POST /dca/trigger call.
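If you prefer to work out the retry list in code, one rough approach is to compare what you submitted with what came back. This Python sketch assumes each result row echoes its originating input under an input key, as in the sample output earlier, and that failed inputs produce no row. It does not report why an input failed; use Last errors for that. inputs_without_rows is a hypothetical helper name.

```python
import json


def inputs_without_rows(submitted, rows):
    """Return the submitted inputs that have no matching row in `rows`."""
    def key(obj):
        # Canonical JSON so key order does not affect the comparison.
        return json.dumps(obj, sort_keys=True)

    succeeded = {key(row["input"]) for row in rows if "input" in row}
    return [item for item in submitted if key(item) not in succeeded]
```

The returned list can be sent as the body of a new POST /dca/trigger call.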

Is there a rate limit?

Yes. Concurrency limits are set per account and enforced per collector. See Specifications for current limits. The starter templates linked above already implement exponential backoff for transient 5xx responses.
