Core Concepts
Scraping
Scraping is the process of fetching and extracting content from URLs. Reader handles:
- Loading pages in a real browser
- Waiting for dynamic content
- Extracting main content
- Converting HTML to clean markdown
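Reader's converter itself isn't shown in this section, but the final step — turning extracted HTML into clean markdown — can be illustrated with a minimal stdlib sketch. The `MarkdownConverter` class and its small tag coverage are illustrative assumptions, not Reader's actual implementation:

```python
from html.parser import HTMLParser

class MarkdownConverter(HTMLParser):
    """Minimal HTML-to-markdown converter (illustrative only;
    a real pipeline handles many more tags and edge cases)."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self.out.append("\n" + "#" * int(tag[1]) + " ")  # heading level from tag name
        elif tag == "p":
            self.out.append("\n")
        elif tag == "li":
            self.out.append("\n- ")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag == "a":
            self.out.append(f"]({self.href})")
            self.href = None

    def handle_data(self, data):
        self.out.append(data)

def html_to_markdown(html: str) -> str:
    conv = MarkdownConverter()
    conv.feed(html)
    return "".join(conv.out).strip()

print(html_to_markdown("<h1>Title</h1><p>See <a href='https://example.com'>docs</a>.</p>"))
```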
Crawling
Crawling is the process of discovering pages on a website. Reader uses breadth-first search to find links and can optionally scrape the content of discovered pages.
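The breadth-first discovery described above can be sketched as follows. The `LINKS` table is a hypothetical stand-in for loading a page and extracting its links:

```python
from collections import deque

# Stand-in for fetching a page and extracting its links; a real crawler
# would load each URL in a browser and parse the <a href> targets.
LINKS = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/b", "https://example.com/c"],
    "https://example.com/b": [],
    "https://example.com/c": [],
}

def crawl(start: str, max_depth: int = 2) -> list[str]:
    """Breadth-first discovery: visit pages level by level, never twice."""
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)  # here a crawler could also scrape the page
        if depth == max_depth:
            continue       # frontier limit reached; do not expand further
        for link in LINKS.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return order

print(crawl("https://example.com/"))
```

The `seen` set guarantees each page is visited once even when pages link to each other, and the depth counter bounds how far the crawl wanders from the start URL.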
Content Extraction
Reader automatically extracts the main content from web pages, removing navigation, headers, footers, ads, and other non-content elements. Learn more about content extraction →
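A rough illustration of the stripping idea, assuming a purely tag-based heuristic (real readability-style extraction also weighs signals such as text length and link density):

```python
from html.parser import HTMLParser

# Regions commonly treated as page chrome rather than content.
CHROME_TAGS = {"nav", "header", "footer", "aside", "script", "style"}

class MainContentExtractor(HTMLParser):
    """Keeps text that sits outside boilerplate regions."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside nav/header/footer/...
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in CHROME_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in CHROME_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())

def extract_main(html: str) -> str:
    parser = MainContentExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)

page = "<nav>Home | About</nav><main><p>The article body.</p></main><footer>Legal</footer>"
print(extract_main(page))  # nav and footer text are dropped
```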
Browser Pool
For high-volume scraping, Reader manages a pool of browser instances with automatic recycling and health monitoring. Learn more about browser pool →
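One way to picture the pool: a fixed set of instances checked out per request and replaced after a use threshold. The `Browser` class, `MAX_USES` limit, and pool API below are illustrative assumptions, not Reader's implementation:

```python
import itertools
from collections import deque

class Browser:
    """Hypothetical stand-in for a real browser instance (e.g. one
    Chromium process); only its lifecycle matters for this sketch."""
    _ids = itertools.count(1)

    def __init__(self):
        self.id = next(self._ids)
        self.pages_served = 0

class BrowserPool:
    """Fixed-size pool that replaces a browser after MAX_USES pages,
    mirroring the automatic-recycling idea described above."""
    MAX_USES = 3

    def __init__(self, size: int):
        self.idle = deque(Browser() for _ in range(size))

    def scrape(self, url: str) -> str:
        browser = self.idle.popleft()      # check an instance out
        browser.pages_served += 1
        result = f"{url} via browser #{browser.id}"
        if browser.pages_served >= self.MAX_USES:
            browser = Browser()            # recycle: swap in a fresh instance
        self.idle.append(browser)          # return it to the pool
        return result

pool = BrowserPool(size=2)
for n in range(7):
    print(pool.scrape(f"https://example.com/page{n}"))
```

Recycling long-lived browsers caps memory growth from leaky pages; a production pool would also health-check instances before reuse.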
Architecture
Guides
- Basic Scraping: Scrape single and multiple URLs
- Batch Scraping: High-volume concurrent scraping
- Website Crawling: Discover and scrape entire websites
- Proxy Configuration: Set up proxy rotation
- CLI Usage: Use Reader from the command line
- Deployment: Deploy Reader in production

