Below are examples of Browser API usage in various scenarios and libraries.
Please make sure to install the required libraries before continuing.

Make your first request in minutes

Test the Browser API in minutes with these ready-to-use code examples.
Simple scraping of a targeted page

Select your preferred tech stack

#!/usr/bin/env node
const playwright = require('playwright');
const {
    AUTH = 'SBR_ZONE_FULL_USERNAME:SBR_ZONE_PASSWORD',
    TARGET_URL = 'https://example.com',
} = process.env;

async function scrape(url = TARGET_URL) {
    if (AUTH === 'SBR_ZONE_FULL_USERNAME:SBR_ZONE_PASSWORD') {
        throw new Error(`Provide Browser API credentials in AUTH`
            + ` environment variable or update the script.`);
    }
    console.log(`Connecting to Browser...`);
    const endpointURL = `wss://${AUTH}@brd.superproxy.io:9222`;
    const browser = await playwright.chromium.connectOverCDP(endpointURL);
    try {
        console.log(`Connected! Navigating to ${url}...`);
        const page = await browser.newPage();
        await page.goto(url, { timeout: 2 * 60 * 1000 });
        console.log(`Navigated! Scraping page content...`);
        const data = await page.content();
        console.log(`Scraped! Data: ${data}`);
    } finally {
        await browser.close();
    }
}

if (require.main === module) {
    scrape().catch(error => {
        console.error(error.stack || error.message || error);
        process.exit(1);
    });
}
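If you prefer Puppeteer, the same first request can be made with puppeteer-core. Below is a minimal sketch, assuming the same AUTH credential format and WebSocket endpoint as in the Playwright example above:

#!/usr/bin/env node
const puppeteer = require('puppeteer-core');
const {
    AUTH = 'SBR_ZONE_FULL_USERNAME:SBR_ZONE_PASSWORD',
    TARGET_URL = 'https://example.com',
} = process.env;

async function scrape(url = TARGET_URL) {
    console.log('Connecting to Browser...');
    // Connect to the remote browser over its WebSocket endpoint
    const browser = await puppeteer.connect({
        browserWSEndpoint: `wss://${AUTH}@brd.superproxy.io:9222`,
    });
    try {
        console.log(`Connected! Navigating to ${url}...`);
        const page = await browser.newPage();
        await page.goto(url, { timeout: 2 * 60 * 1000 });
        console.log('Navigated! Scraping page content...');
        const data = await page.content();
        console.log(`Scraped! Data: ${data}`);
    } finally {
        await browser.close();
    }
}

scrape().catch(error => {
    console.error(error.stack || error.message || error);
    process.exit(1);
});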

Optimizing Bandwidth Usage with Browser API

When optimizing your web scraping projects, conserving bandwidth is key. Explore our tips and guidelines below to utilize bandwidth-saving techniques within your script and ensure efficient, resource-friendly scraping.

Avoid Unnecessary Media Content

Downloading unnecessary media (images, videos) is a common bandwidth drain. You can block these resources directly within your script.
Resource-blocking can occasionally impact page loading due to anti-bot expectations. If you see issues after blocking resources, revert your blocking logic before contacting support.
const page = await browser.newPage();

// Enable request interception
await page.setRequestInterception(true);

// Listen for requests
page.on('request', (request) => {
  if (request.resourceType() === 'image') {
    // If the request is for an image, block it
    request.abort();
  } else {
    // If it's not an image request, allow it to continue
    request.continue();
  }
});

Block Unnecessary Network Requests

Blocking media type requests alone may not always reduce your bandwidth usage. Some websites have ad spaces that continuously refresh ads, and others use live bidding mechanisms that constantly search for new ads if one fails to load properly. In such cases, it’s important to identify and block these specific network requests. Doing so will decrease the number of network requests and, consequently, lower your bandwidth usage.
Example
 const blocked_resources = [
     "image",
     "stylesheet",
     "font",
     "media",
     "svg"
 ];
 
 const blocked_urls = [
     'www.googletagmanager.com/gtm.js',
     'cdn.adapex.io/hb',
     'pagead2.googlesyndication.com/',
 ];
 
 let counter = 0; // optional: counts how many requests were intercepted
 
 await page.setRequestInterception(true);
 
 page.on('request', request => {
     counter++;
     const is_url_blocked = blocked_urls.some(p => request.url().includes(p));
     const is_resource_blocked = blocked_resources.includes(request.resourceType());
     if (is_url_blocked || is_resource_blocked) {
         request.abort();
     } else {
         request.continue();
     }
 });

Use Cached Pages Efficiently

One common inefficiency in scraping jobs is repeatedly downloading the same page during a single session. Leveraging cached pages - versions of previously scraped pages - can significantly increase your scraping efficiency: it avoids redundant network requests to the same domain, saving bandwidth, and it makes interactions with the already-loaded content faster and more responsive.
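As a minimal sketch of the idea (assuming `page` is a Puppeteer page connected as in the examples below; the pageCache Map and getPageContent helper are illustrative names, not part of Puppeteer or the Browser API):

const pageCache = new Map(); // URL -> HTML already fetched in this session

async function getPageContent(page, url) {
    // Reuse the copy fetched earlier in this session, if any
    if (pageCache.has(url)) {
        return pageCache.get(url);
    }
    // Otherwise navigate once, store the HTML, and return it
    await page.goto(url, { timeout: 2 * 60 * 1000 });
    const html = await page.content();
    pageCache.set(url, html);
    return html;
}

Repeated calls for the same URL then read from memory instead of triggering another navigation.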

Code Example

The selectors used in this example (.product-name, .product-price, .product-link, .apply-coupon-button) are generic placeholders. Please update these to match the actual HTML structure of the website you are scraping. Also, make sure to replace https://example.com with your target URL.
Puppeteer
const puppeteer = require('puppeteer-core');
const AUTH = 'USER:PASS';
const SBR_WS_ENDPOINT = `wss://${AUTH}@brd.superproxy.io:9222`;

async function scrapeProductDetails(link) {
    console.log('Connecting to Scraping Browser...');
    const browser = await puppeteer.connect({
        browserWSEndpoint: SBR_WS_ENDPOINT,
    });
    try {
        console.log(`Connected! Navigating to: ${link}`);
        const page = await browser.newPage();
        await page.goto(link, { timeout: 2 * 60 * 1000 });

        // Wait for and extract product name
        await page.waitForSelector('.product-name', { timeout: 30000 });
        const productName = await page.$eval('.product-name', el => el.textContent.trim());

        // Try to apply coupon if button exists
        const couponButton = await page.$('.apply-coupon-button');
        if (couponButton) {
            await couponButton.click();
        }

        // Extract price
        await page.waitForSelector('.product-price', { timeout: 30000 });
        const productPrice = await page.$eval('.product-price', el => el.textContent.trim());

        return { productName, productPrice, link };
    } catch (error) {
        console.error(`Error scraping ${link}:`, error.message);
        return null;
    } finally {
        await browser.close();
    }
}

async function main() {
    console.log('Connecting to Scraping Browser...');
    const browser = await puppeteer.connect({
        browserWSEndpoint: SBR_WS_ENDPOINT,
    });

    try {
        console.log('Connected! Navigating to listing page...');
        const page = await browser.newPage();
        await page.goto('https://example.com', {
            timeout: 2 * 60 * 1000
        });

        await page.waitForSelector('.product-link', { timeout: 30000 });

        // Extract product links from the listing page
        const productLinks = await page.$$eval('.product-link', links =>
            links.map(link => link.href).slice(0, 10) // Limit to first 10 for testing
        );

        console.log(`Found ${productLinks.length} products`);
        await browser.close();

        // Scrape product details in parallel
        const productDetailsPromises = productLinks.map(link => scrapeProductDetails(link));
        const productDetails = await Promise.all(productDetailsPromises);

        // Filter out any null results from failed scrapes
        const validProductDetails = productDetails.filter(details => details !== null);

        console.log('Scraped product details:', validProductDetails);
    } catch (error) {
        console.error('Error during the main process:', error);
    }
}

main();

Other Strategies

  • Limit Your Requests: Scrape only the data you need.
  • Concurrency Control: Avoid opening too many concurrent pages; this can overload resources (see the sketch after this list).
  • Session Management: Properly close sessions to save resources and bandwidth.
  • Opt for APIs: Use official APIs when available—they’re often less bandwidth-intense.
  • Fetch Incremental Data: Only scrape new/updated content, not the entire dataset every time.
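
To illustrate the concurrency-control point above, here is a minimal sketch that processes links in small batches rather than opening every page at once. The batchSize value is arbitrary, and scrapeProductDetails refers to the earlier example; adjust both to your own setup:

// Scrape links in batches so only `batchSize` pages are open at a time
async function scrapeInBatches(links, batchSize = 3) {
    const results = [];
    for (let i = 0; i < links.length; i += batchSize) {
        const batch = links.slice(i, i + batchSize);
        // Wait for the current batch to finish before starting the next one
        const batchResults = await Promise.all(
            batch.map(link => scrapeProductDetails(link))
        );
        results.push(...batchResults);
    }
    // Drop null results from failed scrapes, as in the earlier example
    return results.filter(result => result !== null);
}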