Installation and setup

Install the package via pip:
pip install brightdata-sdk

Configuration

You must provide your API token; you can find it in your Bright Data Control Panel.

Option 1: Environment variable (recommended)
export BRIGHTDATA_API_TOKEN="your_api_token_here"
Option 2: Direct initialization
from brightdata import SyncBrightDataClient

client = SyncBrightDataClient(token="your_api_token_here")

Basic usage

Use SyncBrightDataClient for simple scripts. Use BrightDataClient with asyncio for high-concurrency workloads.
from brightdata import SyncBrightDataClient

with SyncBrightDataClient() as client:
    # Scrape a URL
    data = client.scrape_url("https://example.com")
    print(f"Result: {data.data}")

    # Search Google
    search = client.search.google(query="Bright Data")
    print(f"Found: {len(search.data)}")

Launch scrapes and web searches

Try these examples to call Bright Data’s SDK functions from your IDE.
from brightdata import BrightDataClient

client = BrightDataClient()

# Google search
results = client.search.google(
    query="best shoes of 2025",
    location="United States",
    language="en",
    num_results=20
)

# Bing search
results = client.search.bing(
    query="python tutorial",
    location="United States"
)

# Yandex search
results = client.search.yandex(
    query="latest news",
    location="Germany"
)

if results.success:
    print(f"Cost: ${results.cost:.4f}")
    print(f"Time: {results.elapsed_ms():.2f}ms")
When you pass multiple queries or URLs, the SDK handles the requests concurrently for better performance.
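The effect of that concurrent batching can be sketched with plain asyncio; fake_search below is a stand-in for a network-bound search call, not part of the SDK.

```python
import asyncio

# Stand-in for a network-bound search call; the SDK issues its real
# requests concurrently in much the same way.
async def fake_search(query: str) -> str:
    await asyncio.sleep(0.01)
    return f"results for {query!r}"

async def run_batch(queries: list[str]) -> list[str]:
    # gather() starts every coroutine before awaiting any of them,
    # so total wall time is roughly one request, not the sum.
    return await asyncio.gather(*(fake_search(q) for q in queries))

results = asyncio.run(run_batch(["best shoes of 2025", "python tutorial"]))
print(results)
```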

Use platform-specific scrapers for structured data

Extract structured data from popular platforms like Amazon, LinkedIn, ChatGPT, Facebook, and Instagram.
from brightdata import BrightDataClient
from brightdata.payloads import AmazonProductPayload

client = BrightDataClient()

# Scrape Amazon product with type-safe payload
payload = AmazonProductPayload(
    url="https://amazon.com/dp/B0CRMZHDG8",
    reviews_count=50
)

result = client.scrape.amazon.products(**payload.to_dict())

if result.success:
    product = result.data[0]
    print(f"Title: {product['title']}")
    print(f"Price: ${product['final_price']}")
    print(f"Rating: {product['rating']}")

# Scrape reviews with filters
result = client.scrape.amazon.reviews(
    url="https://amazon.com/dp/B0CRMZHDG8",
    pastDays=30,
    keyWord="quality",
    numOfReviews=100
)
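Result records are plain dicts whose fields can vary by listing, so a defensive access pattern avoids KeyError on sparse products. This is a sketch using a stand-in record, not something the SDK requires.

```python
# Stand-in for one record from result.data; real records carry many more fields.
product = {"title": "Example Widget", "rating": 4.6}

title = product.get("title", "unknown")
price = product.get("final_price")  # may be absent on some listings
rating = product.get("rating", 0.0)

print(f"Title: {title}")
if price is not None:
    print(f"Price: ${price}")
print(f"Rating: {rating}")
```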

Datasets API

Access pre-collected data snapshots.
from brightdata import SyncBrightDataClient

with SyncBrightDataClient() as client:
    # 1. Request snapshot with filters
    print("Requesting snapshot...")
    snapshot_id = client.datasets.imdb_movies(
        filter={"name": "year", "operator": "=", "value": 2024},
        records_limit=10
    )

    # 2. Download (SDK polls automatically)
    print(f"Snapshot {snapshot_id} ready. Downloading...")
    data = client.datasets.imdb_movies.download(snapshot_id)
    print(f"Downloaded {len(data)} records.")
In your IDE, hover over the BrightDataClient class or any of its methods to view available parameters, type hints, and usage examples. The SDK provides full IntelliSense support!

Use dataclass payloads for type safety

The SDK includes dataclass payloads with runtime validation and helper properties.
from brightdata import BrightDataClient
from brightdata.payloads import (
    AmazonProductPayload,
    LinkedInJobSearchPayload,
    ChatGPTPromptPayload
)

client = BrightDataClient()

# Amazon product with validation
amazon_payload = AmazonProductPayload(
    url="https://amazon.com/dp/B123456789",
    reviews_count=50  # Runtime validated!
)
print(f"ASIN: {amazon_payload.asin}")  # Helper property
print(f"Domain: {amazon_payload.domain}")

# LinkedIn job search
linkedin_payload = LinkedInJobSearchPayload(
    keyword="python developer",
    location="San Francisco",
    remote=True
)
print(f"Remote search: {linkedin_payload.is_remote_search}")

# Use with client
result = client.scrape.amazon.products(**amazon_payload.to_dict())
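Under the hood this pattern is ordinary Python dataclasses. The simplified sketch below is not the real AmazonProductPayload (which validates far more fields), but it shows how runtime validation in __post_init__ and a helper property like .asin can work:

```python
from dataclasses import dataclass

@dataclass
class ProductPayload:
    url: str
    reviews_count: int = 0

    def __post_init__(self):
        # Runtime validation: fail fast on obviously bad input.
        if self.reviews_count < 0:
            raise ValueError("reviews_count must be non-negative")
        if "/dp/" not in self.url:
            raise ValueError("expected an Amazon product URL containing /dp/")

    @property
    def asin(self) -> str:
        # The ASIN is the path segment directly after /dp/.
        return self.url.split("/dp/")[1].split("/")[0].split("?")[0]

payload = ProductPayload(url="https://amazon.com/dp/B0CRMZHDG8", reviews_count=50)
print(payload.asin)
```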

Connect to scraping browser

Use the SDK to connect easily to Bright Data’s scraping browser.
from brightdata import BrightDataClient
from playwright.sync_api import Playwright, sync_playwright

client = BrightDataClient(
    token="your_api_token",
    browser_username="username-zone-browser_zone1",
    browser_password="your_password"
)

def scrape(playwright: Playwright, url='https://example.com'):
    browser = playwright.chromium.connect_over_cdp(client.connect_browser())
    try:
        print(f'Connected! Navigating to {url}...')
        page = browser.new_page()
        page.goto(url, timeout=2*60_000)
        print('Navigated! Scraping page content...')
        data = page.content()
        print(f'Scraped! Data length: {len(data)}')
    finally:
        browser.close()

def main():
    with sync_playwright() as playwright:
        scrape(playwright)

if __name__ == '__main__':
    main()

Use the CLI tool

The SDK includes a powerful command-line interface for terminal usage.
# Search operations
brightdata search google "python tutorial" --location "United States"
brightdata search linkedin jobs --keyword "python developer" --remote

# Scrape operations
brightdata scrape amazon products "https://amazon.com/dp/B123"
brightdata scrape linkedin profiles "https://linkedin.com/in/johndoe"

# Generic web scraping
brightdata scrape generic "https://example.com" --output-format pretty

# Save results to file
brightdata search google "AI news" --output-file results.json
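Files saved with --output-file can be post-processed with standard tools. The snippet below sketches loading such a file in Python, using an inline stand-in for results.json; the exact record schema depends on the search type and SDK version.

```python
import json

# Inline stand-in for the contents of results.json; real records may use
# different field names depending on the search engine and SDK version.
raw = '[{"title": "AI news roundup", "link": "https://example.com/a"}]'

records = json.loads(raw)
for record in records:
    print(f"{record.get('title', '?')} -> {record.get('link', '?')}")
```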

Async usage for better performance

For concurrent operations, use the async API:
import asyncio
from brightdata import BrightDataClient

async def scrape_multiple():
    # Use async context manager
    async with BrightDataClient() as client:
        # Scrape multiple URLs concurrently
        results = await client.scrape.generic.url_async([
            "https://example1.com",
            "https://example2.com",
            "https://example3.com"
        ])
        
        for result in results:
            if result.success:
                print(f"Success: {result.elapsed_ms():.2f}ms")

asyncio.run(scrape_multiple())
When using *_async methods, always use the async context manager (async with BrightDataClient() as client). SyncBrightDataClient handles this automatically.

Resources

GitHub repository

View source code, examples, and contribute

Examples directory

10+ working examples for all features

PyPI page

Package listing and release history

What’s new in v2.0.0

Dataclass payloads
  • Runtime validation with helpful error messages
  • IDE autocomplete support
  • Helper properties (.asin, .is_remote_search, .domain)
  • Consistent with result models

CLI tool
  • brightdata command for terminal usage
  • Scrape and search operations
  • Multiple output formats (JSON, pretty, minimal)
  • File output support

Example notebooks
  • 5 comprehensive notebooks
  • Pandas integration examples
  • Data analysis workflows
  • Batch processing guides

New scrapers
  • Facebook scraper (posts, comments, reels)
  • Instagram scraper (profiles, posts, comments, reels)
  • Instagram search (posts and reels discovery)

Performance and testing
  • Single shared AsyncEngine (8x efficiency)
  • Reduced memory footprint
  • Better resource management
  • 502+ comprehensive tests