/examples. Clone the repo to follow along and run any example with tsx.
## Basics

- **Basic scrape**: Scrape a single URL and print the markdown result.
- **All formats**: Request markdown and HTML in one call.
- **Batch scrape**: Scrape multiple URLs in parallel with progress tracking.
- **Large batch scrape**: Scale up concurrency and tune the pool for hundreds of URLs.
- **Crawl a website**: Discover and scrape every page on a domain.
- **Browser pool config**: Tune pool size, recycling, and health checks.
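The batch examples above share one underlying pattern: run many scrapes in parallel while capping how many are in flight at once. A self-contained sketch of that pattern follows; `mapWithConcurrency` and its worker are illustrative stand-ins, not Reader APIs — in the real examples the worker would be a Reader scrape call.

```typescript
// Run an async worker over many inputs with at most `limit` in flight.
// Runners share a single index; JavaScript's single-threaded event loop
// makes the `next++` claim safe (no await between read and increment).
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function run(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claim the next unprocessed item
      results[i] = await worker(items[i]);
    }
  }
  // Start `limit` runners that drain the shared work list.
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, run),
  );
  return results;
}
```

Results come back in input order regardless of which runner finished each item, which is what makes per-URL progress tracking straightforward.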
## Proxies & bypass

- **Single proxy**: Pass a single proxy per request.
- **Proxy pool rotation**: Rotate through a pool with round-robin or random strategy.
- **Cloudflare bypass**: Scrape sites behind Cloudflare challenges.
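The rotation strategies named above are simple to sketch in isolation. This is a generic picker, not Reader's own API — the proxy URLs are placeholders, and each request would take `pool.next()` as its proxy setting.

```typescript
// Pick a proxy per request: round-robin cycles in order, random picks
// uniformly. Proxy URLs below are placeholders.
type Strategy = "round-robin" | "random";

function createProxyPool(proxies: string[], strategy: Strategy = "round-robin") {
  let index = 0;
  return {
    next(): string {
      if (strategy === "random") {
        return proxies[Math.floor(Math.random() * proxies.length)];
      }
      const proxy = proxies[index];
      index = (index + 1) % proxies.length; // wrap around the pool
      return proxy;
    },
  };
}

const pool = createProxyPool([
  "http://proxy-1:8080",
  "http://proxy-2:8080",
  "http://proxy-3:8080",
]);
```

Round-robin spreads load evenly and is easy to reason about; random avoids the lock-step access pattern that some rate limiters key on.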
## AI & LLM integrations

- **Anthropic summary**: Pipe scraped markdown to Claude for summarization.
- **OpenAI summary**: Same pattern with GPT-4 or GPT-4o-mini.
- **Vercel AI streaming**: Stream scrape + LLM output with the Vercel AI SDK.
- **LangChain loader**: Use Reader as a LangChain document loader.
- **LlamaIndex loader**: Use Reader as a LlamaIndex data connector.
- **Pinecone ingest**: Scrape, chunk, embed, and upsert to Pinecone.
- **Qdrant ingest**: Same pattern with Qdrant as the vector store.
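The two ingest examples share a scrape → chunk → embed → upsert pipeline. The chunking step can be sketched on its own; the sizes here are illustrative defaults, and the embedding and upsert calls (which differ per vector store) are omitted.

```typescript
// Split scraped text into fixed-size chunks with overlap, so content
// near a boundary appears in two adjacent embeddings instead of being
// split across them. Sizes are illustrative, not Reader defaults.
function chunkText(text: string, size = 800, overlap = 100): string[] {
  if (overlap >= size) throw new Error("overlap must be smaller than size");
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // final chunk reached the end
  }
  return chunks;
}
```

Each chunk would then be embedded and upserted with metadata (source URL, chunk index) so retrieval results can link back to the page they came from.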
## Production patterns

- **Express server**: Embed Reader in an HTTP API with a persistent client.
- **BullMQ job queue**: Process scrape jobs from a Redis-backed queue with retries.
- **Browser pool scaling**: Patterns for scaling the pool under real load.
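The retry behavior the queue example relies on boils down to exponential backoff between attempts. A dependency-free sketch of that policy (the helper name and defaults are illustrative, not BullMQ's API — BullMQ configures the same idea declaratively via its job options):

```typescript
// Retry an async job with exponential backoff between failed attempts,
// the same policy a Redis-backed queue applies before dead-lettering.
async function withRetries<T>(
  job: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await job();
    } catch (err) {
      lastError = err;
      if (attempt < attempts - 1) {
        // Waits 500 ms, 1 s, 2 s, ... with the default base delay.
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError; // all attempts exhausted
}
```

Backoff matters for scraping in particular: immediate retries against a site that just rate-limited you tend to extend the block rather than clear it.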
## Deployment

- **Docker Compose**: Production Dockerfile + compose with correct shm_size and capabilities.
- **AWS Lambda**: Deploy Reader as a serverless function handler.
- **Vercel Functions**: Run Reader from a Vercel Edge/Node function.
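The shm_size and capabilities mentioned for Docker Compose are the two settings headless Chromium commonly trips over: Docker's default 64 MB /dev/shm is too small for browser rendering, and sandboxing may need extra privileges. A minimal compose sketch — service name, image, and port are placeholders, not the example's actual config:

```yaml
# Illustrative compose fragment for a containerized browser workload.
services:
  reader:
    build: .               # placeholder; the real example ships a Dockerfile
    shm_size: "1gb"        # Docker's 64 MB default /dev/shm crashes Chromium
    cap_add:
      - SYS_ADMIN          # only if the browser sandbox requires it
    ports:
      - "3000:3000"        # placeholder port
```

An alternative to raising shm_size is launching Chromium with `--disable-dev-shm-usage`, which writes shared memory to /tmp instead.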
## Where to go next

- **Quickstart**: Walk through your first scrape in more depth.
- **API Reference**: Full type reference for every option.

