Webmaster Console
Gain better control over data collection from your website
What is the Bright Data Webmaster Console?
Webmasters can configure a collectors.txt file to inform Bright Data about information important to data collectors, such as the presence of personal information and interactive endpoints on their websites.
The Webmaster Console offers a practical and informative solution for managing Bright Data traffic on your website.
- User-friendly control panel
- Round-trip time (RTT) statistics for website health tracking
What is a collectors.txt?
BrightBot enforces the robots.txt guidelines; however, it’s important to note that robots.txt was initially designed to guide search engine crawlers, not public web data collectors. There is a wealth of additional information that responsible data collectors should be aware of to ensure proper and respectful data interaction with your website.
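For context, a robots.txt file uses simple `User-agent` and `Disallow` directives. The stanza below is only an illustration; the exact user-agent token for Bright Data's crawler is an assumption here, so verify it against Bright Data's own documentation before relying on it:

```text
# Illustrative robots.txt stanza -- the user-agent token is an
# assumption, not confirmed Bright Data syntax.
User-agent: Brightbot
Disallow: /private/
```

Note that robots.txt can only allow or disallow paths; it has no way to express the richer signals (personal data, load limits, peak hours) that collectors.txt is designed to carry.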
Key considerations include the presence of personal information, which should be handled in compliance with applicable privacy laws. In addition, many public endpoints on your website may have limited resources; by communicating these limits, you can help prevent unintentional overloading.
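To make the load-limit idea concrete, here is a minimal sketch of how a collector could honor a webmaster-declared request rate using a sliding-window limiter. The class and its parameters are illustrative, not part of any Bright Data API:

```python
import time

class RateLimiter:
    """Sliding-window limiter a collector might use to honor a
    webmaster-declared request rate (names are illustrative)."""

    def __init__(self, max_requests, per_seconds):
        self.max_requests = max_requests
        self.per_seconds = per_seconds
        self.timestamps = []

    def allow(self, now=None):
        """Return True if one more request fits in the window."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        self.timestamps = [t for t in self.timestamps
                           if now - t < self.per_seconds]
        if len(self.timestamps) < self.max_requests:
            self.timestamps.append(now)
            return True
        return False
```

A declared limit of, say, 2 requests per 60 seconds would let the first two calls through and reject the third until the window slides forward.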
Bright Data will review collectors.txt information before applying it, with the exception of authentication tokens from partner cybersecurity companies. Whether to accept a given webpage and its collectors.txt is at Bright Data's discretion; Bright Data is not obligated to accept any request, nor will it be liable for any consequences arising from unapproved requests.
- Enhance transparency by monitoring how Bright Data interacts with your website.
- Utilize a collectors.txt file to fine-tune access to specific sections of your website.
Webmasters can guide BrightBot, Bright Data's crawler, toward more efficient access to their website by providing access guidelines in a collectors.txt file via the Webmaster Console. This file may contain the following information:
| Category | Description | Applicable Fields |
|---|---|---|
| Personal Information | Endpoints containing information related to an identified or identifiable natural person. | URL / Document Object |
| Disallow | List interactive endpoint patterns such as ad links, likes, reviews, and posts. This instruction enables BrightBot to block these endpoints, aligning with our guidelines that prohibit data collection from these areas. | URL / Document Object |
| Load | Report your organic traffic load on specific domains or subdomains during specific time frames. BrightBot will use this information instead of public load statistics when deciding how to rate-limit itself. | URL / Document Object, Rate, Time-frame |
| Traffic peak time | Define time slots of peak organic traffic, reducing data collection during these times. | URL / Document Object, Date / Weekday / Any, Start time / End time |
How it works
- Create a Webmaster Console
- Authenticate your websites
- Build a collectors.txt for each site
What is BrightBot?
BrightBot is Bright Data's crawler layer. It monitors the health of every domain it targets and enforces ethical usage: it prevents access to non-public information and blocks interactive endpoints that could be abused, such as ad clicks, reviews, likes, and account management. After you join the Bright Data Webmaster Console and submit a collectors.txt file, BrightBot enforces the data collection rules for your website as approved by Bright Data.
Examples & Format
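As a purely hypothetical sketch, the categories in the table above might be expressed along these lines. The field names and syntax below are illustrative assumptions, not Bright Data's actual collectors.txt format, which is defined in the Webmaster Console:

```text
# Hypothetical collectors.txt -- field names and syntax are
# illustrative, not Bright Data's actual format.
Personal-Information: /users/*/profile
Disallow: /reviews/*/vote
Load: /search rate=10/min
Traffic-peak-time: /checkout weekday=Mon-Fri start=09:00 end=17:00
```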