Scraping Browser FAQs
Find answers to common questions about Bright Data’s Scraping Browser, including supported languages, debugging tips, and integration guidelines.
Bright Data’s Scraping Browser supports a wide range of programming languages and libraries. We currently have full native support for Node.js and Python using puppeteer, playwright, and selenium. Other languages can be integrated using the libraries listed below, so you can fit Scraping Browser into your current tech stack.
Language/Platform | puppeteer | playwright | selenium |
---|---|---|---|
Python | N/A | playwright-python | Selenium WebDriver |
JS / Node | Native support | Native support | WebDriverJS |
Ruby | Puppeteer-Ruby | playwright-ruby-client | Selenium WebDriver for Ruby |
C# | .NET: Puppeteer Sharp | Playwright for .NET | Selenium WebDriver for .NET |
Java | Puppeteer Java | Playwright for Java | Native support |
Go | chromedp | playwright-go | Selenium WebDriver for Go |
You can see in real time what the Scraping Browser is doing, right on your local machine. This is similar to setting the headless option to ‘FALSE’ in Puppeteer.
Web scraping projects often require intricate interactions with target websites and debugging is vital for identifying and resolving issues found during the development process.
The Scraping Browser Debugger serves as a valuable resource, enabling you to inspect, analyze, and fine-tune your code alongside Chrome Dev Tools, resulting in better control, visibility, and efficiency.
The Scraping Browser Debugger can be launched in two ways: manually via the Control Panel, or remotely via your script.
The Scraping Browser Debugger can be easily accessed within your Bright Data Control Panel. Follow these steps:
- Within the Control Panel, go to the My Proxies view
- Click on your Scraping Browser proxy
- Click on the Access parameters or Overview tab
- On the right side, click the “Chrome Dev Tools Debugger” button
Getting Started with the Debugger & Chrome Dev Tools
Open a Scraping Browser Session
- Ensure you have an active Scraping Browser session
- If you don’t yet know how to launch a Scraping Browser session, see our Quick Start guide.
Launch the Debugger
- Once your session is up and running, you can launch the Debugger.
- Click on the Debugger button within your Access parameters tab to launch the Scraping Browser Debugger interface (see the screenshot above)
Connect with your live browser sessions
- Within the Debugger interface, you will find a list of your live Scraping Browser sessions.
- Select the preferred session that you wish to debug
- Click on the session link or copy/paste it into your browser of choice, and this will establish a connection between the Debugger and your selected session.
Leveraging Chrome Dev Tools
- With the Scraping Browser Debugger now connected to your live session, you gain access to the powerful features of Chrome Dev Tools.
- Utilize the Dev Tools interface to inspect HTML elements, analyze network requests, debug JavaScript code, and monitor performance. Leverage breakpoints, console logging, and other debugging techniques to identify and resolve issues within your code.
If you would like to automatically launch devtools on every session to view your live browser session, you can integrate the following code snippet:
// Node.js Puppeteer - launch devtools locally
const { exec } = require('child_process');

const chromeExecutable = 'google-chrome';
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

const openDevtools = async (page, client) => {
    // get current frameId
    const frameId = page.mainFrame()._id;
    // get URL for devtools from scraping browser
    const { url: inspectUrl } = await client.send('Page.inspect', { frameId });
    // open devtools URL in local chrome
    exec(`"${chromeExecutable}" "${inspectUrl}"`, error => {
        if (error)
            throw new Error('Unable to open devtools: ' + error);
    });
    // wait for devtools ui to load
    await delay(5000);
};

// `browser` is assumed to be a Puppeteer Browser already connected to
// Scraping Browser; run the lines below inside an async function
const page = await browser.newPage();
const client = await page.target().createCDPSession();
await openDevtools(page, client);
await page.goto('http://example.com');
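The snippet above hard-codes `google-chrome`, which is the usual command name on Linux. As a hedged sketch, the executable could instead be chosen per platform. The macOS and Windows paths below are common default install locations and are assumptions, not values from Bright Data's documentation, and `chromeExecutableFor` is a hypothetical helper name.

```javascript
// Hypothetical lookup table: Chrome executable per Node.js platform name.
// The paths are common default install locations (assumptions); adjust as needed.
const CHROME_BY_PLATFORM = {
    linux: 'google-chrome',
    darwin: '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
    win32: 'C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe',
};

// Fall back to the Linux command name for platforms not listed above.
const chromeExecutableFor = (platform = process.platform) =>
    CHROME_BY_PLATFORM[platform] || 'google-chrome';

const chromeExecutable = chromeExecutableFor();
```

The existing `exec` call already wraps the executable in double quotes, so a path containing spaces is passed as a single argument.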
You can easily trigger a screenshot of the browser at any time by adding the following to your code:
// node.js puppeteer - Taking screenshot to file screenshot.png
await page.screenshot({ path: 'screenshot.png', fullPage: true });
To take screenshots on Python and C# see here.
There is a lot of “behind the scenes” work that goes into unlocking your targeted site.
Some sites take just a few seconds to navigate, while others can take up to a minute or two because they require more complex unlocking procedures. We therefore recommend setting your navigation timeout to 2 minutes, so that navigation has enough time to succeed.
Set the navigation timeout in your script before your “page.goto” call.
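A minimal sketch of such a setting, assuming a Puppeteer `page` object from an existing session. `gotoWithLongTimeout` and `NAVIGATION_TIMEOUT_MS` are hypothetical names introduced here for illustration; `page.setDefaultNavigationTimeout` is Puppeteer's own method and takes milliseconds.

```javascript
// Two minutes, expressed in milliseconds as Puppeteer expects.
const NAVIGATION_TIMEOUT_MS = 2 * 60 * 1000;

// Hypothetical helper: raise the navigation timeout, then navigate.
// `page` is assumed to be a Puppeteer Page from an existing session.
const gotoWithLongTimeout = (page, url) => {
    // Applies to goto, reload, waitForNavigation, and similar calls on this page.
    page.setDefaultNavigationTimeout(NAVIGATION_TIMEOUT_MS);
    return page.goto(url);
};
```

Puppeteer also accepts the timeout per call, for example `page.goto(url, { timeout: NAVIGATION_TIMEOUT_MS })`, which leaves the default for other navigations unchanged.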
Error Code | Meaning | What can you do about it? |
---|---|---|
Unexpected server response: 407 | An issue with the remote browser’s port | Check your remote browser’s port. The correct port for Scraping Browser is 9222. |
Unexpected server response: 403 | Authentication Error | Check your authentication credentials (username, password) and check that you are using the correct “Browser API” zone from the Bright Data control panel. |
Unexpected server response: 503 | Service Unavailable | We are likely scaling browsers right now to meet demand. Try to reconnect in 1 minute. |
You can check your connection with the following curl:
curl -v -u USER:PASS https://brd.superproxy.io:9222/json/protocol
For any other issues please contact support.
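The error table above can be folded into a small lookup so that a script logs an actionable hint when a connection fails. This is a sketch only: `explainConnectionError` and `CONNECTION_HINTS` are hypothetical names, the hints paraphrase the table, and the code assumes the error message contains the text "Unexpected server response: NNN".

```javascript
// Hints paraphrased from the error table, keyed by HTTP status code.
// retryAfterMs is set only where the table suggests reconnecting later.
const CONNECTION_HINTS = {
    407: { hint: 'Check the remote browser port; Scraping Browser uses 9222.', retryAfterMs: null },
    403: { hint: 'Check username, password, and the zone you are using.', retryAfterMs: null },
    503: { hint: 'Browsers are likely scaling; reconnect in 1 minute.', retryAfterMs: 60 * 1000 },
};

// Hypothetical helper: map a connection error to a hint.
// Returns null when the message does not carry a server response code.
const explainConnectionError = (error) => {
    const match = /Unexpected server response: (\d{3})/.exec(String(error && error.message));
    if (!match) return null;
    const status = Number(match[1]);
    const known = CONNECTION_HINTS[status];
    // Codes not listed in the table fall through to the support advice.
    return known
        ? { status, ...known }
        : { status, hint: 'Contact support.', retryAfterMs: null };
};
```

A caller could wrap its connect call in `try`/`catch`, pass the caught error to `explainConnectionError`, and wait `retryAfterMs` before reconnecting when that field is set.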
Integrating Scraping Browser with C# requires adding WebSocket authentication support to PuppeteerSharp. This can be done by supplying a custom WebSocket factory, as follows:
using System;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using PuppeteerSharp;

// Set the authentication credentials
var auth = "USER:PASS";
// Construct the WebSocket URL with authentication
var ws = $"wss://{auth}@zproxy.lum-superproxy.io:9222";

// Custom WebSocket factory function
async Task<WebSocket> ws_factory(Uri url, IConnectionOptions options, CancellationToken cancellationToken)
{
    // Create a new ClientWebSocket instance
    var socket = new ClientWebSocket();
    // Extract the user information (username and password) from the URL
    var user_info = url.UserInfo;
    if (user_info != "")
    {
        // Encode the user information in Base64 format
        // (named `encoded` so it does not shadow the outer `auth` variable)
        var encoded = Convert.ToBase64String(Encoding.Default.GetBytes(user_info));
        // Set the "Authorization" header of the WebSocket options with the encoded credentials
        socket.Options.SetRequestHeader("Authorization", $"Basic {encoded}");
    }
    // Disable the WebSocket keep-alive interval
    socket.Options.KeepAliveInterval = TimeSpan.Zero;
    // Connect to the WebSocket endpoint
    await socket.ConnectAsync(url, cancellationToken);
    return socket;
}

// Create ConnectOptions and configure the options
var options = new ConnectOptions()
{
    // Set the BrowserWSEndpoint to the WebSocket URL
    BrowserWSEndpoint = ws,
    // Set the WebSocketFactory to the custom factory function
    WebSocketFactory = ws_factory,
};

// Connect to the browser using PuppeteerSharp
Console.WriteLine("Connecting to browser...");
using (var browser = await Puppeteer.ConnectAsync(options))
{
    Console.WriteLine("Connected! Navigating...");
    // Create a new page instance
    var page = await browser.NewPageAsync();
    // Navigate to the specified URL
    await page.GoToAsync("https://example.com");
    Console.WriteLine("Navigated! Scraping data...");
    // Get the content of the page
    var content = await page.GetContentAsync();
    Console.WriteLine("Done!");
    Console.WriteLine(content);
}
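The custom factory above turns the `USER:PASS` portion of the endpoint URL into an HTTP Basic `Authorization` header. For illustration, the same derivation is sketched below in Node.js. `basicAuthHeaderFor` is a hypothetical helper name, and the sketch only shows how the header value is built; it does not open a connection.

```javascript
// Hypothetical helper mirroring the C# factory: derive a Basic auth header
// from the user info embedded in a WebSocket endpoint URL.
const basicAuthHeaderFor = (endpoint) => {
    const { username, password } = new URL(endpoint);
    // No credentials embedded in the URL, so no header is needed.
    if (!username) return null;
    // URL keeps user info percent-encoded, so decode before Base64-encoding.
    const userInfo = `${decodeURIComponent(username)}:${decodeURIComponent(password)}`;
    return `Basic ${Buffer.from(userInfo, 'utf8').toString('base64')}`;
};
```

Passing the endpoint from the C# sample with real credentials substituted for `USER:PASS` yields the value the factory would place in the `Authorization` header.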