Web scraping projects often require intricate interactions with target websites, and debugging is vital for identifying and resolving issues that surface during development.
The Scraping Browser Debugger serves as a valuable resource, enabling you to inspect, analyze, and fine-tune your code alongside Chrome Dev Tools, resulting in better control, visibility, and efficiency.
The Scraping Browser Debugger can be launched in two ways: manually via the Control Panel, or remotely via your script.
The Scraping Browser Debugger can be easily accessed within your Bright Data Control Panel. Follow these steps:
- Within the control panel, go to the My Proxies view
- Click on your Scraping Browser proxy
- Click on the Access parameters tab
- On the right side, click the “Chrome Dev Tools Debugger” button
Getting Started with the Debugger & Chrome Dev Tools
Open a Scraping Browser Session
- Ensure you have an active Scraping Browser session
- If you don’t yet know how to launch a Scraping Browser session, see our Quick Start guide or the minimal connection sketch below.
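For quick reference, here is a minimal Node.js Puppeteer sketch for opening a session. The USER:PASS credentials are placeholders for your own zone’s access parameters, and the full connection example appears further down this page.
// node.js puppeteer - minimal sketch: open a Scraping Browser session (placeholder credentials)
const puppeteer = require('puppeteer-core');
const SBR_WS_ENDPOINT = 'wss://USER:PASS@brd.superproxy.io:9222';
const browser = await puppeteer.connect({ browserWSEndpoint: SBR_WS_ENDPOINT });
const page = await browser.newPage();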
Launch the Debugger
- Once your session is up and running, you can launch the Debugger.
- Click the Debugger button within your Access parameters tab to launch the Scraping Browser Debugger interface.
Connect with your live browser sessions
- Within the Debugger interface, you will find a list of your live Scraping Browser sessions.
- Select the preferred session that you wish to debug
- Click the session link, or copy and paste it into your browser of choice; this establishes a connection between the Debugger and your selected session.
Leveraging Chrome Dev Tools
- With the Scraping Browser Debugger now connected to your live session, you gain access to the powerful features of Chrome Dev Tools.
- Utilize the Dev Tools interface to inspect HTML elements, analyze network requests, debug JavaScript code, and monitor performance. Leverage breakpoints, console logging, and other debugging techniques to identify and resolve issues within your code.
If you would like to automatically launch Dev Tools on every session so you can view your live browser session, integrate the following code snippet:
// Node.js Puppeteer - launch devtools locally
const { exec } = require('child_process');

const chromeExecutable = 'google-chrome';
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

const openDevtools = async (page, client) => {
    // get current frameId
    const frameId = page.mainFrame()._id;
    // get URL for devtools from scraping browser
    const { url: inspectUrl } = await client.send('Page.inspect', { frameId });
    // open devtools URL in local chrome
    exec(`"${chromeExecutable}" "${inspectUrl}"`, error => {
        if (error)
            throw new Error('Unable to open devtools: ' + error);
    });
    // wait for devtools ui to load
    await delay(5000);
};

const page = await browser.newPage();
const client = await page.target().createCDPSession();
await openDevtools(page, client);
await page.goto('http://example.com');
Debugger Walkthrough
Check out the Scraping Browser Debugger in action below
[Video: Scraping Browser Debugger walkthrough]
You can easily trigger a screenshot of the browser at any time by adding the following to your code:
// node.js puppeteer - Taking screenshot to file screenshot.png
await page.screenshot({ path: 'screenshot.png', fullPage: true });
To take screenshots in Python and C#, see here.
There is a lot of “behind the scenes” work that goes into unlocking your targeted site.
Some sites take just a few seconds to navigate, while others might take up to a minute or two, as they require more complex unlocking procedures. As such, we recommend setting your navigation timeout to 2 minutes to give the navigation enough time to succeed if needed.
You can set your navigation timeout to 2 minutes by passing a timeout option to your “page.goto” call, as shown below.
// node.js puppeteer - Navigate to site with 2 min timeout
await page.goto('https://example.com', { timeout: 2*60*1000 });
| Error Code | Meaning | What can you do about it? |
|---|---|---|
| Unexpected server response: 407 | An issue with the remote browser’s port | Check your remote browser’s port. The correct port for Scraping Browser is 9222. |
| Unexpected server response: 403 | Authentication error | Check your authentication credentials (username, password) and verify that you are using the correct “Browser API” zone from the Bright Data control panel. |
| Unexpected server response: 503 | Service unavailable | We are likely scaling browsers right now to meet demand. Try to reconnect in 1 minute. |
You can check your connection with the following curl:
curl -v -u USER:PASS https://brd.superproxy.io:9222/json/protocol
For any other issues please see our Troubleshooting guide or contact support.
When optimizing your web scraping projects, conserving bandwidth is key.
Explore our tips and guidelines below on effective bandwidth-saving techniques that you can utilize within your script to ensure efficient and resource-friendly scraping.
A typical inefficiency when scraping with browsers is unnecessarily downloading media content, such as images and videos, from your target domains. Learn below how to avoid this by excluding it right from within your script.
Given that anti-bot systems expect specific resources to load for particular domains, approach resource blocking cautiously, as it can directly impact Scraping Browser’s ability to successfully load your target domains. If you encounter issues after applying resource blocks, verify that they persist even after your blocking logic is reverted before contacting our support team.
// node.js puppeteer - block image requests via request interception
const page = await browser.newPage();

// Enable request interception
await page.setRequestInterception(true);

// Listen for requests
page.on('request', (request) => {
    if (request.resourceType() === 'image') {
        // If the request is for an image, block it
        request.abort();
    } else {
        // If it's not an image request, allow it to continue
        request.continue();
    }
});
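If other media is also irrelevant to your scraping job, the same interception hook can cover additional resource types. The following is a hedged sketch that extends the snippet above to also block video/audio media and fonts; the exact list of blocked types is an assumption you should tune per target site, keeping the anti-bot caution above in mind.
// node.js puppeteer - sketch: block several resource types (adjust the list per target site)
const BLOCKED_RESOURCE_TYPES = ['image', 'media', 'font'];

const page = await browser.newPage();
await page.setRequestInterception(true);
page.on('request', (request) => {
    if (BLOCKED_RESOURCE_TYPES.includes(request.resourceType())) {
        // Block bandwidth-heavy resources we don't need
        request.abort();
    } else {
        // Let everything else through
        request.continue();
    }
});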
One common inefficiency in scraping jobs is the repeated downloading of the same page during a single session.
Leveraging cached pages - a version of a previously scraped page - can significantly increase your scraping efficiency, as it can be used to avoid repeated network requests to the same domain. Not only does it save on bandwidth by avoiding redundant fetches, but it also ensures faster and more responsive interactions with the preloaded content.
A single Scraping Browser session can persist for up to 20 minutes. This duration allows you ample opportunity to revisit and re-navigate the page as needed within the same session, eliminating the need for redundant sessions on identical pages during your scraping job.
Example: In a multi-step web scraping workflow, you often gather links from a page and then dive into each link for more detailed data extraction.
You’ll often need to revisit the initial page for cross-referencing or validation. By leveraging caching, these revisits don’t trigger new network requests as the data is simply loaded from the cache.
const puppeteer = require('puppeteer-core');

const AUTH = 'USER:PASS';
const SBR_WS_ENDPOINT = `wss://${AUTH}@brd.superproxy.io:9222`;

async function main() {
    console.log('Connecting to Scraping Browser...');
    const browser = await puppeteer.connect({
        browserWSEndpoint: SBR_WS_ENDPOINT,
    });
    try {
        console.log('Connected! Navigating...');
        const page = await browser.newPage();
        await page.goto('https://example.com', { timeout: 2 * 60 * 1000 });

        // Extract product links from the listing page
        const productLinks = await page.$$eval('.product-link', links => links.map(link => link.href));

        const productDetails = [];
        // Navigate to each individual product page
        for (let link of productLinks) {
            await page.goto(link);
            // Extract the product's name
            const productName = await page.$eval('.product-name', el => el.textContent);
            // Apply a coupon (assuming it doesn't navigate away)
            await page.click('.apply-coupon-button');
            // Extract the discounted product's price from the cached product detail page
            const productPrice = await page.$eval('.product-price', el => el.textContent);
            // Store product details
            productDetails.push({ productName, productPrice });
        }
        console.log(productDetails);
    } finally {
        await browser.close();
    }
}

main().catch(err => {
    console.error(err.stack || err);
    process.exit(1);
});
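To illustrate the revisit pattern described above, here is a hedged sketch that returns to the listing page within the same session. The LISTING_URL constant and the .product-link selector are illustrative assumptions; because the page was already loaded earlier in the session, the browser can serve much of it from cache on the second navigation instead of re-fetching every resource over the network.
// node.js puppeteer - sketch: revisit the listing page within the same session (illustrative)
const LISTING_URL = 'https://example.com';

// First visit: the page and its resources are fetched over the network
await page.goto(LISTING_URL, { timeout: 2 * 60 * 1000 });
const productLinks = await page.$$eval('.product-link', links => links.map(link => link.href));

// ... navigate into individual product pages here ...

// Revisit for cross-referencing: cached resources can be reused
// instead of being re-downloaded, saving bandwidth
await page.goto(LISTING_URL, { timeout: 2 * 60 * 1000 });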
- Limit Your Requests: Only scrape what you need, rather than downloading entire webpages or sites.
- Concurrency Control: Limit the number of concurrent pages or browsers you open; too many parallel processes can exhaust resources (see the sketch after this list).
- Session Management: Ensure you properly manage and close sessions after scraping. This prevents resource and memory leaks.
- Opt for APIs: If the target website offers an API, use it instead of direct scraping. APIs are typically more efficient and less bandwidth-intensive than scraping full web pages.
- Fetch Incremental Data: If scraping periodically, try to fetch only new or updated data rather than re-fetching everything.
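As a minimal concurrency-control sketch, the helper below processes a list of URLs with only a fixed number of pages open at a time. The batch size of 3 and the title extraction are illustrative assumptions rather than Scraping Browser requirements; adapt them to your own workload.
// node.js puppeteer - sketch: limit the number of concurrently open pages (illustrative)
async function scrapeInBatches(browser, urls, limit = 3) {
    const results = [];
    // Process URLs in chunks of `limit`, so at most `limit` pages are open at once
    for (let i = 0; i < urls.length; i += limit) {
        const batch = urls.slice(i, i + limit);
        const batchResults = await Promise.all(batch.map(async (url) => {
            const page = await browser.newPage();
            try {
                await page.goto(url, { timeout: 2 * 60 * 1000 });
                // Extract only what you need (here: the page title)
                return await page.title();
            } finally {
                // Always close the page to avoid resource and memory leaks
                await page.close();
            }
        }));
        results.push(...batchResults);
    }
    return results;
}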