Learn how to use our Archive API for accessing and retrieving data snapshots from Bright Data’s cache, with delivery options to Amazon S3, Azure Blob Storage, or a webhook.
Our Archive API lets you access and retrieve data snapshots from Bright Data’s cached data collections in a seamless and efficient way.
To access this API, you will need a Bright Data API key.
If your query matches data from the last 72 hours, your snapshot starts processing and delivering immediately. If some of the matched data is older than 72 hours, it must first be retrieved from a colder archive before delivery, which can take up to 72 hours.
We recommend using max_age = 1d for initial testing.
To use S3 storage delivery, you will first need to do the following:
Create an AWS role which gives Bright Data access to your system.
During this setup, you will be asked by Amazon for an “external ID” that is used with the role.
Your external ID for S3 is your Bright Data Account ID, which can be found in Account Settings.
Once the role is created, you will need to allow our system delivery role to call AssumeRole on it.
Our system delivery role is: arn:aws:iam::422310177405:role/brd.ec2.zs-dca-delivery
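The steps above can be captured in the role’s trust policy. The sketch below is an assumption about how that policy might look, not an official template: it lets our delivery role assume your role, scoped by the external ID (your Bright Data Account ID, shown as a placeholder).

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::422310177405:role/brd.ec2.zs-dca-delivery"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": { "sts:ExternalId": "<your_brightdata_account_id>" }
      }
    }
  ]
}
```

The role itself also needs permissions to write objects into your target bucket.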
To deliver a specific Snapshot from a specific search_id to S3 storage, use the following /dump endpoint.
Endpoint: POST api.brightdata.com/webarchive/dump
Code Example
```
POST api.brightdata.com/webarchive/dump
{
  search_id: <search_id>,
  max_entries?: 1000000, // (optional) limit how many files you purchase
  delivery: {
    strategy: 's3', // also supports 'azure' and 'webhook'
    settings: {
      bucket: <your_bucket_name>,
      prefix: <your_custom_prefix>, // (optional) customize top-level export folder
      assume_role: {
        role_arn: <role_you_created_above>,
      },
    },
  },
}
```
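As a minimal sketch, the request body above can be assembled programmatically before posting it to the endpoint. All concrete values here (search ID, bucket name, role ARN, prefix) are illustrative placeholders, not real identifiers.

```python
import json

def build_s3_dump_body(search_id, bucket, role_arn, prefix=None, max_entries=None):
    """Build the /dump request body for S3 delivery (placeholder values assumed)."""
    settings = {
        "bucket": bucket,
        "assume_role": {"role_arn": role_arn},  # the role you created above
    }
    if prefix is not None:
        settings["prefix"] = prefix  # optional top-level export folder
    body = {
        "search_id": search_id,
        "delivery": {"strategy": "s3", "settings": settings},
    }
    if max_entries is not None:
        body["max_entries"] = max_entries  # optional cap on purchased files
    return body

body = build_s3_dump_body(
    "example-search-id",
    "my-archive-bucket",
    "arn:aws:iam::111122223333:role/my-brd-delivery-role",
    prefix="exports/",
    max_entries=1000000,
)
print(json.dumps(body, indent=2))
```

Serializing the body this way keeps the optional fields (`max_entries`, `prefix`) out of the payload entirely when they are not set, rather than sending null values.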
To deliver a specific Snapshot from a specific search_id directly into an Azure Blob Storage container, use the same /dump endpoint.
Endpoint: POST api.brightdata.com/webarchive/dump
Code Example
```
POST api.brightdata.com/webarchive/dump
{
  search_id: <search_id>,
  max_entries?: 1000000, // (optional) limit how many files you purchase
  delivery: {
    strategy: 'azure',
    settings: {
      container: <your_container>,
      prefix: <your_custom_prefix>, // (optional) customize top-level export folder
      credentials: {
        account: <your_account_name>,
        key: <your_account_key>, // use a key with write permission to the container
      },
    },
  },
}
```
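The Azure request can be prepared the same way. This sketch only constructs the HTTP request without sending it; the Bearer-token authorization header and all placeholder values are assumptions for illustration.

```python
import json
import urllib.request

# Request body mirroring the Azure delivery example above (placeholders assumed)
body = {
    "search_id": "<search_id>",
    "delivery": {
        "strategy": "azure",
        "settings": {
            "container": "<your_container>",
            "credentials": {
                "account": "<your_account_name>",
                "key": "<your_account_key>",  # key with write permission
            },
        },
    },
}

req = urllib.request.Request(
    "https://api.brightdata.com/webarchive/dump",
    data=json.dumps(body).encode(),
    headers={
        "Authorization": "Bearer <your_api_key>",  # assumed auth scheme
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would submit the dump request
```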