LLM-friendly documentation references

Overview

Bright Data publishes two machine-readable documentation files following the llms.txt standard:

File	URL	Best for
`llms.txt`	`docs.brightdata.com/llms.txt`	Agent awareness, navigation, retrieval routing
`llms-full.txt`	`docs.brightdata.com/llms-full.txt`	RAG pipelines, context injection, fine-tuning

llms.txt - Documentation Index

llms.txt is a structured, markdown-formatted index of all Bright Data documentation pages - one entry per page with a short description and a direct link. Format:

# Bright Data Docs

## Docs

- [Agent Web Access](/ai/agents.md): Complete web infrastructure for AI agents
- [SERP API Introduction](/scraping-automation/serp-api/introduction.md): Real-time search results
- [Web Unlocker](/scraping-automation/web-unlocker/introduction.md): Bypass bot detection
...

Note that every link points to the .md version of the page - clean markdown, no HTML. Use it when:

Loading into an agent’s system prompt for full product awareness
Feeding a retrieval system to decide which doc pages to fetch
Giving a coding agent a map of available products before it starts a task

# Quick preview
curl https://docs.brightdata.com/llms.txt | head -40

# Download for offline use
curl -o brightdata-llms.txt https://docs.brightdata.com/llms.txt

llms-full.txt - Complete Documentation

llms-full.txt contains the complete text of all Bright Data documentation in a single file - clean markdown, no HTML, no navigation chrome. Use it when:

Building a RAG pipeline over Bright Data docs
Injecting full product knowledge into a long-context model (Gemini 1.5 Pro, Claude, etc.)
Creating a fine-tuning or evaluation dataset
Giving an agent complete offline reference

# Download
curl -o brightdata-llms-full.txt https://docs.brightdata.com/llms-full.txt

llms-full.txt is large. For real-time agent sessions, loading llms.txt first and fetching specific pages on demand is more token-efficient.

Loading into your agent

Claude Code
Cursor / Windsurf
RAG Pipeline
OpenAI / Custom LLM

Reference the file directly in a prompt - Claude Code will fetch and read it:

Please read https://docs.brightdata.com/llms.txt to understand the available
Bright Data products, then help me choose the right API for scraping Amazon product pages.

Or save it as a project context file:

mkdir -p .claude
curl -o .claude/brightdata-docs.txt https://docs.brightdata.com/llms.txt

Then reference it in your CLAUDE.md or system prompt:

# Project context
See .claude/brightdata-docs.txt for the full Bright Data product reference.

Add as a project rules file so your agent has it in context automatically:

# Save to project rules directory
curl -o .cursor/rules/brightdata.md https://docs.brightdata.com/llms.txt
# or for Windsurf:
curl -o .windsurf/rules/brightdata.md https://docs.brightdata.com/llms.txt

Now every Cursor Composer or Windsurf Cascade session has Bright Data product awareness built in.

import httpx
from langchain.text_splitter import MarkdownTextSplitter

# Fetch the full docs
response = httpx.get("https://docs.brightdata.com/llms-full.txt")
docs_content = response.text

# Split into chunks
splitter = MarkdownTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.create_documents([docs_content])

# Add to your vector store
vectorstore.add_documents(chunks)

import httpx

# Fetch the index
llms_txt = httpx.get("https://docs.brightdata.com/llms.txt").text

messages = [
    {
        "role": "system",
        "content": f"""You are a helpful assistant with expertise in Bright Data's web data APIs.

Here is the full index of available documentation:
{llms_txt}

Use this to understand which products and APIs are available, then fetch
specific pages when you need full details on a product."""
    },
    {"role": "user", "content": "How do I scrape Amazon product pages?"}
]

Per-page markdown access

Every Bright Data documentation page is also available as clean markdown. Append .md to any page URL:

Page	Markdown URL
`docs.brightdata.com/ai/agents`	`docs.brightdata.com/ai/agents.md`
`docs.brightdata.com/scraping-automation/web-unlocker/introduction`	`...web-unlocker/introduction.md`
`docs.brightdata.com/ai/mcp-server/overview`	`...mcp-server/overview.md`

This lets agents fetch specific pages on demand without parsing any HTML.

Recommended pattern for agents: Load llms.txt to understand what’s available → identify the relevant page → fetch that page’s .md URL for full details. This keeps token usage efficient while giving the agent complete information when needed.

Introduction

Quick Start

For AI Agents

Use Cases

Integrations

LLM-friendly documentation references

Overview

llms.txt - Documentation Index

llms-full.txt - Complete Documentation

Loading into your agent

Per-page markdown access

Next steps

Install a Skill

Connect the Docs MCP

​Overview

​llms.txt - Documentation Index

​llms-full.txt - Complete Documentation

​Loading into your agent

​Per-page markdown access

​Next steps

Install a Skill

Connect the Docs MCP

Overview

llms.txt - Documentation Index

llms-full.txt - Complete Documentation

Loading into your agent

Per-page markdown access

Next steps