To get started, grab your credentials - the Username and Password you will use with your web automation tool. You can find them in the Scraping Browser zone you just created, in the “Overview” tab. We assume that you aleady have your preferred web automation tool installed. If not, please instal it.

Sample Code

Run these basic examples to check that your Scraping Browser is working (remember to swap in your credentials and target URL):

Run the Script

Save the above code as script.js (don’t forget to enter your credentials!) and run it using this command:

node script.js

View live browser session

The Scraping Browser Debugger enables developers to inspect, analyze, and fine-tune their code alongside Chrome Dev Tools, resulting in better control, visibility, and efficiency. You can integrate the following code snippet to launch devtools automatically for every session:

// Node.js Puppeteer - launch devtools locally  
  
const { exec } = require('child_process');  
const chromeExecutable = 'google-chrome';  
  
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));  
const openDevtools = async (page, client) => {  
    // get current frameId  
    const frameId = page.mainFrame()._id;  
    // get URL for devtools from scraping browser  
    const { url: inspectUrl } = await client.send('Page.inspect', { frameId });  
    // open devtools URL in local chrome  
    exec(`"${chromeExecutable}" "${inspectUrl}"`, error => {  
        if (error)  
            throw new Error('Unable to open devtools: ' + error);  
    });  
    // wait for devtools ui to load  
    await delay(5000);  
};  
  
const page = await browser.newPage();  
const client = await page.target().createCDPSession();  
await openDevtools(page, client);  
await page.goto('http://example.com');

Single Navigation Per Session

Scraping Browser sessions are structured to allow one initial navigation per session. This initial navigation refers to the first instance of loading the target site from which data is to be extracted. Following this, users are free to navigate the site using clicks, scrolls, and other interactive actions within the same session. However, to start a new scraping job, either on the same site or a different one, from the initial navigation stage, it is necessary to begin a new session.

Session Time Limits

Scraping Browser has 2 kinds of timeouts aimed to safeguard our customers from uncontrolled usage.

  1. Idle Session Timeout: in case a browser session is kept open for 5 minutes and above in an idle mode, meaning no usage going through it, Scraping Browser will automatically timeout the session.
  2. Maximum Session Length Timeout: Scraping Browser session can last up to 30 minutes. Once the maximum session time is reached the session will automatically timeout.