## Installation
The CLI is included with the Reader package; no separate installation is required.

## Scrape Command
### Basic Usage
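A minimal invocation might look like the following; the `reader` binary name and example URL are placeholders for illustration:

```bash
# Scrape a single page; markdown is printed to stdout by default
reader scrape https://example.com
```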
### Output Formats
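A sketch of requesting more than one format with the documented `--format` flag; `html` is shown as a hypothetical second format:

```bash
# Formats are comma-separated; check your Reader version for the supported list
reader scrape https://example.com --format markdown,html
```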
### Concurrency
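Assuming the scrape command accepts multiple URL arguments (suggested by the `--concurrency` and `--batch-timeout` options), a parallel batch might look like:

```bash
# Fetch up to 3 pages at a time
reader scrape https://a.example.com https://b.example.com https://c.example.com -c 3
```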
### Timeouts
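A representative example combining the per-page and batch timeouts (binary name and URLs illustrative):

```bash
# 60 s per page, 10 minutes for the whole batch
reader scrape https://a.example.com https://b.example.com -t 60000 --batch-timeout 600000
```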
### Content Extraction
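A sketch of steering extraction with the documented selector flags; the selector values and the comma-separated syntax are assumptions:

```bash
# Keep the article body, drop navigation and footer
reader scrape https://example.com --include-tags "article,main" --exclude-tags "nav,footer"

# Or skip main-content extraction and return the full page
reader scrape https://example.com --no-main-content
```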
### Proxy
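A possible proxy setup; the proxy URL and user-agent string are placeholders:

```bash
reader scrape https://example.com \
  --proxy http://user:pass@proxy.example.com:8080 \
  --user-agent "Mozilla/5.0 (compatible; MyScraper/1.0)"
```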
### Debugging
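For troubleshooting, the two documented debugging flags can be combined (binary name illustrative):

```bash
# Verbose logs plus a visible browser window
reader scrape https://example.com --verbose --show-chrome
```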
### All Options
| Option | Short | Default | Description |
|---|---|---|---|
| `--format` | `-f` | `markdown` | Output formats (comma-separated) |
| `--output` | `-o` | stdout | Output file path |
| `--concurrency` | `-c` | `1` | Parallel requests |
| `--timeout` | `-t` | `30000` | Per-page timeout (ms) |
| `--batch-timeout` | | `300000` | Total batch timeout (ms) |
| `--proxy` | | | Proxy URL |
| `--user-agent` | | | Custom user agent |
| `--no-main-content` | | | Include the full page |
| `--include-tags` | | | CSS selectors to include |
| `--exclude-tags` | | | CSS selectors to exclude |
| `--show-chrome` | | | Show the browser window |
| `--verbose` | `-v` | | Enable logging |
| `--standalone` | | | Bypass the daemon |
## Crawl Command
### Basic Usage
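A minimal crawl might look like this (binary name and URL illustrative):

```bash
# Discover links from the starting page, using the defaults (depth 1, 20 pages)
reader crawl https://example.com
```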
### Depth and Limits
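A sketch using the documented depth and page-limit flags:

```bash
# Follow links up to 3 levels deep, stopping after 100 pages
reader crawl https://example.com -d 3 -m 100
```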
### Scrape Content
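With the documented `--scrape` flag, each discovered page can be scraped as it is found (example URL illustrative):

```bash
# Scrape every discovered page and emit markdown
reader crawl https://example.com -s -f markdown
```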
### URL Filtering
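A possible filter setup; the glob-style pattern syntax shown here is an assumption, so check your Reader version:

```bash
# Only crawl blog posts, skipping tag index pages
reader crawl https://example.com --include "/blog/*" --exclude "/blog/tag/*"
```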
### Rate Limiting
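The documented `--delay` flag spaces out requests (binary name illustrative):

```bash
# Wait 2 seconds between requests
reader crawl https://example.com --delay 2000
```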
### All Options
| Option | Short | Default | Description |
|---|---|---|---|
| `--depth` | `-d` | `1` | Maximum crawl depth |
| `--max-pages` | `-m` | `20` | Maximum pages to discover |
| `--scrape` | `-s` | | Scrape content |
| `--format` | `-f` | `markdown` | Output formats |
| `--output` | `-o` | stdout | Output file path |
| `--delay` | | `1000` | Delay between requests (ms) |
| `--timeout` | `-t` | | Total crawl timeout (ms) |
| `--include` | | | URL patterns to include |
| `--exclude` | | | URL patterns to exclude |
| `--proxy` | | | Proxy URL |
| `--user-agent` | | | Custom user agent |
| `--show-chrome` | | | Show the browser window |
| `--verbose` | `-v` | | Enable logging |
## Daemon Mode
For multiple requests, use daemon mode to keep the browser pool warm.

### Start Daemon
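The daemon subcommand name below is illustrative; consult `reader --help` for the exact spelling in your version:

```bash
reader daemon start
```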
### Check Status
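A plausible status check (subcommand name illustrative):

```bash
reader daemon status
```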
### Stop Daemon
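And a plausible shutdown (subcommand name illustrative):

```bash
reader daemon stop
```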
### Auto-Connect
When a daemon is running, CLI commands automatically connect to it.

## Output Format
CLI output is always JSON.

### Scrape Output
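A sketch of what a scrape result could look like; the field names here are illustrative, not a guaranteed schema:

```json
{
  "success": true,
  "data": {
    "markdown": "# Example Domain\n...",
    "metadata": {
      "url": "https://example.com",
      "title": "Example Domain"
    }
  }
}
```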
### Crawl Output
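A hypothetical crawl result shape (field names illustrative):

```json
{
  "success": true,
  "total": 12,
  "pages": [
    { "url": "https://example.com/", "depth": 0 },
    { "url": "https://example.com/about", "depth": 1 }
  ]
}
```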
## Examples
### Scrape and process with jq
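Because the CLI emits JSON, results pipe straight into `jq`; the `.data.markdown` path is an assumption, so adjust it to your actual output shape:

```bash
reader scrape https://example.com | jq -r '.data.markdown'
```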
### Save crawl results
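A sketch combining the documented depth, scrape, and output flags (binary name illustrative):

```bash
# Crawl two levels deep, scrape each page, and write the JSON to a file
reader crawl https://example.com -d 2 -s -o results.json
```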
### Batch scrape from file
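Assuming the scrape command accepts multiple URL arguments, `xargs` can feed it a list from a file with one URL per line:

```bash
xargs reader scrape -c 4 < urls.txt
```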
## Next Steps
- **Basic Scraping**: Learn programmatic scraping
- **Deployment**: Deploy Reader in production

