To access this API, you will need a Bright Data API key.
Run a Search
To initiate a search of our Archive, use the following /search endpoint.
Endpoint: POST api.brightdata.com/webarchive/search
If the search takes longer than 30 seconds, the response returns only a search_id, and you should poll the status asynchronously. If the search completes within 30 seconds, the response returns the full search result object (same as GET /webarchive/search/<search_id>).
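As a rough illustration of the two response shapes described above, the sketch below builds the search request and classifies a parsed response. The `https://` scheme, the `Authorization: Bearer` header, the `filters` wrapper in the request body, and the rule that a body holding only `search_id` means the search is still running are assumptions made for illustration, not details confirmed by this page.

```python
import json
import urllib.request

SEARCH_URL = "https://api.brightdata.com/webarchive/search"  # https scheme assumed


def build_search_request(api_key: str, filters: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /search endpoint."""
    # The {"filters": ...} wrapper is an assumed body shape.
    body = json.dumps({"filters": filters}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # auth scheme assumed
            "Content-Type": "application/json",
        },
    )


def needs_polling(response_body: dict) -> bool:
    """True when the response carries only a search_id (search still running)."""
    return set(response_body) == {"search_id"}


request = build_search_request("YOUR_API_KEY", {"max_age": "12h"})
# To send it: json.load(urllib.request.urlopen(request))
```

If `needs_polling` returns True for the parsed response, keep the `search_id` and check it later with GET /webarchive/search/<search_id>.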
You can run up to 100 searches per day without triggering a dump.
Once you trigger a dump, that search no longer counts against your limit.
LIKE vs Regex Filters: Use LIKE filters (domain_like_*, url_like_*) for simple pattern matching with % (any sequence) and _ (single character). LIKE patterns are case-insensitive and often faster than regex for simple prefix/suffix matching like %.com or amazon%. Use regex filters (domain_regex_*, url_regex_*) for complex patterns requiring full regex syntax. LIKE patterns use backslash escaping: \% for literal %, \_ for literal _.
Get Search Status
Check the status of a specific search that you have made.
Endpoint: GET api.brightdata.com/webarchive/search/<search_id>
When successful, it returns:
- The number of entries for your query
- The estimated size and cost of the full Data Snapshot
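For searches that return only a search_id, the status endpoint has to be polled. The following is a minimal polling sketch, not an official client: the HTTP call is injected as `fetch_status` so the loop can be exercised offline, and the `status` field with a `"done"` value used in the demonstration is an assumed name that should be checked against the real response.

```python
import time
from typing import Callable


def wait_for_search(
    fetch_status: Callable[[str], dict],
    search_id: str,
    is_finished: Callable[[dict], bool],
    interval_s: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status(search_id) until is_finished(result) is true."""
    for attempt in range(max_attempts):
        result = fetch_status(search_id)
        if is_finished(result):
            return result
        if attempt < max_attempts - 1:
            sleep(interval_s)  # wait before the next check
    raise TimeoutError(f"search {search_id} not finished after {max_attempts} checks")


# Offline demonstration with a fake fetcher; "status" and "done" are assumed names.
_responses = iter([{"status": "pending"}, {"status": "done", "files_count": 42}])
final = wait_for_search(
    fetch_status=lambda _id: next(_responses),
    search_id="example-search-id",
    is_finished=lambda r: r.get("status") == "done",
    sleep=lambda _s: None,  # skip real waiting in the demo
)
```

In real use, `fetch_status` would issue the GET request for the given search_id and return the parsed JSON body.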
Pricing & size: estimate_batch_size is measured in bytes. dump_cost_usd is an estimated total cost based on files_count and your current cache/archive pricing tier. The cost_breakdown object shows separate costs for cache vs archive pages.
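Because estimate_batch_size is reported in bytes, it can help to convert it before deciding whether to trigger a dump. The helper below is an illustrative sketch: it assumes `files_count`, `estimate_batch_size`, and `dump_cost_usd` appear as top-level numeric fields of the status response, and the sample values are invented for the example.

```python
def summarize_estimate(result: dict) -> str:
    """Format the size and cost estimate from a search status response."""
    size_gib = result["estimate_batch_size"] / 1024**3  # bytes -> GiB
    return (
        f"{result['files_count']} files, "
        f"{size_gib:.2f} GiB, "
        f"about ${result['dump_cost_usd']:.2f}"
    )


# Invented sample values, for illustration only.
sample = {
    "files_count": 1200,
    "estimate_batch_size": 5 * 1024**3,
    "dump_cost_usd": 3.5,
}
summary = summarize_estimate(sample)
print(summary)  # 1200 files, 5.00 GiB, about $3.50
```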
Get All Search Statuses
Check the status of all current searches.
Endpoint: GET api.brightdata.com/webarchive/searches
How data range affects delivery time
If your query matches only data from the last 24 hours, your snapshot will start processing and delivering immediately. If some of the matched data is older than 24 hours, it must be retrieved from the S3 Glacier Deep Archive storage tier before delivery, which may take up to 72 hours.
On February 9, 2026, the hot storage retention period changed from 72 hours to 24 hours.
We recommend using max_age = 12h for initial testing to ensure fast delivery.
Warning: Avoid queries that span the retention boundary (approximately 24 hours before the current time). Requests with max_age or time ranges that fall within ~24h ± 2h of the current time may include files that have already been migrated to the archive storage tier. Attempting a dump for such queries can cause the dump to stall or remain incomplete because of the files' storage class transition.
Recommendations:
- For real-time data needs: use max_age: "12h" or a narrower window to avoid the retention edge.
- For historical data (older than 24h): use explicit min_date/max_date filters rather than max_age.
- If a dump appears stalled: we usually retry automatically; please open a ticket if that does not happen.
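The retention-boundary warning can be turned into a simple pre-flight check. The sketch below is a local helper, not part of the API: it treats the risky band as the interval from 26 to 22 hours before the current time (24h ± 2h) and reports whether a requested time window overlaps it. The function name and the use of timezone-aware `datetime` values are choices made for this example.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(hours=24)  # hot storage retention stated above
MARGIN = timedelta(hours=2)      # the "± 2h" band around the boundary


def spans_retention_boundary(start: datetime, end: datetime, now: datetime) -> bool:
    """True if the window [start, end] overlaps the band around now - 24h."""
    band_start = now - RETENTION - MARGIN  # 26 hours ago
    band_end = now - RETENTION + MARGIN    # 22 hours ago
    return start <= band_end and end >= band_start


now = datetime(2026, 3, 1, 12, 0, tzinfo=timezone.utc)

last_12h = spans_retention_boundary(now - timedelta(hours=12), now, now)
last_24h = spans_retention_boundary(now - timedelta(hours=24), now, now)
historical = spans_retention_boundary(
    now - timedelta(days=5), now - timedelta(days=3), now
)
print(last_12h, last_24h, historical)  # False True False
```

A window that returns True is a candidate for narrowing to 12 hours or for splitting into a recent part and a purely historical part.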
Deliver Snapshot to Amazon S3 Storage
To use S3 storage delivery, you will first need to do the following:
- Create an AWS role which gives Bright Data access to your system.
- During this setup, you will be asked by Amazon for an “external ID” that is used with the role.
- Your external ID for S3 is your Bright Data Account ID, which can be found in Account Settings.
- Once the role is created, you will need to allow our system delivery role to AssumeRole that role.
- Our system delivery role is: arn:aws:iam::422310177405:role/brd.ec2.zs-dca-delivery
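The role setup above corresponds to a standard AWS IAM trust policy. As a sketch, the snippet below generates one that lets the Bright Data delivery role assume your role only when the external ID matches. The function name is hypothetical, the placeholder account ID must be replaced with your own, and the permissions the role needs on the target bucket are not covered here.

```python
import json

BRIGHT_DATA_DELIVERY_ROLE = "arn:aws:iam::422310177405:role/brd.ec2.zs-dca-delivery"


def build_trust_policy(external_id: str) -> dict:
    """Trust policy allowing the delivery role to assume your role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": BRIGHT_DATA_DELIVERY_ROLE},
                "Action": "sts:AssumeRole",
                # The external ID is your Bright Data Account ID.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


policy = build_trust_policy("YOUR_BRIGHT_DATA_ACCOUNT_ID")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the trust relationship of the role in the AWS console.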
To deliver a Snapshot for a specific search_id to S3 storage, use the following /dump endpoint.
Endpoint: POST api.brightdata.com/webarchive/dump
Common dump parameters:
- search_id (required): The search ID from a completed search
- max_entries (optional): Limit the number of files to include in the dump
- delivery (required): Delivery configuration (S3, Azure, or webhook)
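The common dump parameters can be assembled and validated before the request is sent. The builder below is a minimal sketch that enforces only what is listed above (search_id and delivery required, max_entries optional). The contents of the `delivery` object are passed through untouched, and the `strategy`/`settings` keys in the usage example are a hypothetical shape, not a documented schema.

```python
from typing import Optional


def build_dump_body(
    search_id: str, delivery: dict, max_entries: Optional[int] = None
) -> dict:
    """Assemble the JSON body for POST /webarchive/dump."""
    if not search_id:
        raise ValueError("search_id is required")
    if not delivery:
        raise ValueError("delivery is required")
    body = {"search_id": search_id, "delivery": delivery}
    if max_entries is not None:
        if max_entries <= 0:
            raise ValueError("max_entries must be a positive integer")
        body["max_entries"] = max_entries  # only sent when a limit is set
    return body


# Hypothetical delivery shape, for illustration only.
example_delivery = {"strategy": "s3", "settings": {"bucket": "my-bucket"}}
dump_body = build_dump_body("example-search-id", example_delivery, max_entries=100)
```

Capping max_entries on a first run keeps the test dump small before committing to the full snapshot.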
Deliver Snapshot to Azure Blob Storage
Deliver a specific Snapshot from a specific search_id directly into an Azure Blob Storage container using the same /dump endpoint.
Endpoint: POST api.brightdata.com/webarchive/dump
Collect Snapshot via Webhook
Collect a Data Snapshot via webhook from a specific search_id.
Endpoint: POST api.brightdata.com/webarchive/dump
Get Status of Data Snapshot
Check the status of a specific Data Snapshot (dump) using the dump_id.
Endpoint: GET api.brightdata.com/webarchive/dump/<dump_id>
Get the Status of all Data Snapshots
Endpoint: GET api.brightdata.com/webarchive/dumps
High-level process flow diagram
