Learn how to set up and configure Bright Data’s Scraping Browser with your credentials, run sample scripts, and navigate live browser sessions. Ensure efficient web scraping with our detailed instructions.
To get started, grab your credentials: the Username and Password you will use with your web automation tool. You can find them in the Scraping Browser zone you just created, under the "Overview" tab. We assume you already have your preferred web automation tool installed; if not, install it before continuing.
Run these basic examples to check that your Scraping Browser is working (remember to swap in your credentials and target URL):
```js
const puppeteer = require('puppeteer-core');

// Enter your zone name and password below
const AUTH = 'USER:PASS';
const SBR_WS_ENDPOINT = `wss://${AUTH}@brd.superproxy.io:9222`;

async function main() {
    console.log('Connecting to Scraping Browser...');
    const browser = await puppeteer.connect({
        browserWSEndpoint: SBR_WS_ENDPOINT,
    });
    try {
        console.log('Connected! Navigating...');
        const page = await browser.newPage();
        // Enter your test URL below
        await page.goto('https://example.com', { timeout: 2 * 60 * 1000 });
        console.log('Taking screenshot to page.png');
        await page.screenshot({ path: './page.png', fullPage: true });
        console.log('Navigated! Scraping page content...');
        const html = await page.content();
        console.log(html);
        // CAPTCHA solving: If you know you are likely to encounter a CAPTCHA on your
        // target page, add the following few lines of code to get the status of
        // Scraping Browser's automatic CAPTCHA solver.
        // Note 1: If no CAPTCHA was found it will return a not_detected status after detectTimeout
        // Note 2: Once a CAPTCHA is solved, if there is a form to submit, it will be submitted by default
        // const client = await page.target().createCDPSession();
        // const { status } = await client.send('Captcha.solve', { detectTimeout: 30 * 1000 });
        // console.log(`Captcha solve status: ${status}`);
    } finally {
        await browser.close();
    }
}

if (require.main === module) {
    main().catch(err => {
        console.error(err.stack || err);
        process.exit(1);
    });
}
```
```python
import asyncio
from playwright.async_api import async_playwright

AUTH = 'USER:PASS'
SBR_WS_CDP = f'wss://{AUTH}@brd.superproxy.io:9222'

async def run(pw):
    print('Connecting to Scraping Browser...')
    browser = await pw.chromium.connect_over_cdp(SBR_WS_CDP)
    try:
        print('Connected! Navigating...')
        page = await browser.new_page()
        await page.goto('https://example.com', timeout=2*60*1000)
        print('Taking page screenshot to file page.png')
        await page.screenshot(path='./page.png', full_page=True)
        print('Navigated! Scraping page content...')
        html = await page.content()
        print(html)
        # CAPTCHA solving: If you know you are likely to encounter a CAPTCHA on your
        # target page, add the following few lines of code to get the status of
        # Scraping Browser's automatic CAPTCHA solver.
        # Note 1: If no CAPTCHA was found it will return a not_detected status after detectTimeout
        # Note 2: Once a CAPTCHA is solved, if there is a form to submit, it will be submitted by default
        # client = await page.context.new_cdp_session(page)
        # solve_result = await client.send('Captcha.solve', {'detectTimeout': 30*1000})
        # status = solve_result['status']
        # print(f'Captcha solve status: {status}')
    finally:
        await browser.close()

async def main():
    async with async_playwright() as playwright:
        await run(playwright)

if __name__ == '__main__':
    asyncio.run(main())
```
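If you have not installed the client libraries yet, `npm install puppeteer-core` (Node.js) or `pip3 install playwright` (Python) will fetch them; then run the scripts with `node script.js` or `python3 script.py` (the file names here are just placeholders).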
The Scraping Browser Debugger lets developers inspect, analyze, and fine-tune their code alongside Chrome DevTools, for better control, visibility, and efficiency. You can integrate the following code snippet to launch DevTools automatically for every session:
```js
// Node.js Puppeteer - launch devtools locally

const { exec } = require('child_process');
const chromeExecutable = 'google-chrome';

const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

const openDevtools = async (page, client) => {
    // get current frameId
    const frameId = page.mainFrame()._id;
    // get URL for devtools from scraping browser
    const { url: inspectUrl } = await client.send('Page.inspect', { frameId });
    // open devtools URL in local chrome
    exec(`"${chromeExecutable}" "${inspectUrl}"`, error => {
        if (error) throw new Error('Unable to open devtools: ' + error);
    });
    // wait for devtools ui to load
    await delay(5000);
};

const page = await browser.newPage();
const client = await page.target().createCDPSession();
await openDevtools(page, client);
await page.goto('http://example.com');
```
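Note that this snippet assumes a `browser` already connected via `puppeteer.connect`, as in the quickstart example above, and that Chrome is available locally as `google-chrome`; adjust `chromeExecutable` to match your platform (for example, the full path to your Chrome binary on Windows or macOS).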
Scraping Browser sessions are structured to allow one initial navigation per session. This initial navigation is the first load of the target site from which data will be extracted. After that, you are free to navigate the site using clicks, scrolls, and other interactive actions within the same session. To start a new scraping job from the initial navigation stage, whether on the same site or a different one, you must open a new session, as sketched below.
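As a minimal sketch of this model, the hypothetical `scrapeJob` helper below gives each job its own session: it connects, performs the single initial navigation, interacts within that session, and closes before the next job begins. The endpoint credentials and the selectors are placeholders, not part of the product API:

```js
const puppeteer = require('puppeteer-core');

// Placeholder credentials, as in the quickstart example above
const SBR_WS_ENDPOINT = 'wss://USER:PASS@brd.superproxy.io:9222';

// Hypothetical helper: one session per scraping job
async function scrapeJob(url) {
    const browser = await puppeteer.connect({ browserWSEndpoint: SBR_WS_ENDPOINT });
    try {
        const page = await browser.newPage();
        // The single initial navigation for this session
        await page.goto(url, { timeout: 2 * 60 * 1000 });
        // Further interaction within the same session is fine, e.g.:
        // await page.click('a.next-page'); // hypothetical selector
        return await page.content();
    } finally {
        await browser.close(); // ends the session
    }
}

async function main() {
    // A second job starts from the initial navigation stage,
    // so it gets a fresh session rather than reusing the old one.
    await scrapeJob('https://example.com');
    await scrapeJob('https://example.com/other-page'); // hypothetical second target
}

main().catch(err => { console.error(err.stack || err); process.exit(1); });
```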
Scraping Browser has two kinds of timeouts that safeguard customers from uncontrolled usage.
Idle Session Timeout: if a browser session is kept open for 5 minutes or more in an idle state, meaning no usage is going through it, Scraping Browser will automatically time out the session.
Maximum Session Length Timeout: a Scraping Browser session can last up to 30 minutes. Once the maximum session length is reached, the session will automatically time out.
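Since both timeouts end the session from the server side, it can help to detect the disconnect and budget your work to finish inside the 30-minute cap. The sketch below is one way to do that using Puppeteer's standard `disconnected` event; the endpoint and the safety margin are assumptions, not part of the product API:

```js
const puppeteer = require('puppeteer-core');

const SBR_WS_ENDPOINT = 'wss://USER:PASS@brd.superproxy.io:9222'; // placeholder
const MAX_SESSION_MS = 30 * 60 * 1000; // maximum session length (30 minutes)
const SAFETY_MARGIN_MS = 60 * 1000;    // arbitrary margin to close cleanly

async function main() {
    const browser = await puppeteer.connect({ browserWSEndpoint: SBR_WS_ENDPOINT });
    const startedAt = Date.now();

    // Puppeteer emits 'disconnected' when the remote session ends,
    // including when Scraping Browser times it out.
    browser.on('disconnected', () => console.log('Session ended (closed or timed out)'));

    const page = await browser.newPage();
    await page.goto('https://example.com', { timeout: 2 * 60 * 1000 });

    // ...do scraping work, checking the session budget between steps:
    if (Date.now() - startedAt > MAX_SESSION_MS - SAFETY_MARGIN_MS) {
        console.log('Approaching the 30-minute cap; wrap up and reconnect in a new session.');
    }

    await browser.close();
}

main().catch(err => { console.error(err.stack || err); process.exit(1); });
```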