# Reader

## Docs

- [Get credit balance](https://docs.reader.dev/api-reference/account/credits.md)
- [Usage history](https://docs.reader.dev/api-reference/account/history.md)
- [Authentication](https://docs.reader.dev/api-reference/authentication.md): Every Reader API request is authenticated with an API key passed in the x-api-key header.
- [Cancel a job](https://docs.reader.dev/api-reference/jobs/cancel.md)
- [Events stream (SSE)](https://docs.reader.dev/api-reference/jobs/events.md): Long-lived Server-Sent Events connection that emits real-time progress, completion, and failure events for every job in the workspace. Use this for dashboards and realtime UIs. For a single job, use `GET /v1/jobs/{id}/stream` instead.
- [Get a job](https://docs.reader.dev/api-reference/jobs/get.md)
- [Retry failed URLs](https://docs.reader.dev/api-reference/jobs/retry.md)
- [Per-job events stream (SSE)](https://docs.reader.dev/api-reference/jobs/stream.md): Server-Sent Events stream scoped to a single job. Emits `progress`, `page`, `error`, and `done` events as the job makes progress. Stream closes automatically when the job reaches a terminal state.
- [Scrape, batch, or crawl](https://docs.reader.dev/api-reference/read.md): Unified endpoint for all read operations. Pass `url` for a single scrape, `urls` for a batch, or `url` + `maxPages` to crawl a site.
- [CDP WebSocket Proxy](https://docs.reader.dev/api-reference/sessions/cdp.md): Connect Playwright/Puppeteer to a browser session via CDP
- [Create Session](https://docs.reader.dev/api-reference/sessions/create.md): Create a browser session with a CDP WebSocket endpoint
- [Get Session](https://docs.reader.dev/api-reference/sessions/get.md): Get the status of a browser session
- [List Sessions](https://docs.reader.dev/api-reference/sessions/list.md): List active browser sessions for your workspace
- [Stop Session](https://docs.reader.dev/api-reference/sessions/stop.md): Stop a browser session and get billing summary
- [Create a webhook](https://docs.reader.dev/api-reference/webhooks/create.md)
- [Delete a webhook](https://docs.reader.dev/api-reference/webhooks/delete.md)
- [List webhooks](https://docs.reader.dev/api-reference/webhooks/list.md)
- [Update a webhook](https://docs.reader.dev/api-reference/webhooks/update.md)
- [Async jobs](https://docs.reader.dev/home/concepts/async-jobs.md): Batch and crawl return jobs. Poll them, stream them, or let a webhook call you when they're done.
- [Browser Sessions](https://docs.reader.dev/home/concepts/browser-sessions.md): Launch stealthed browsers for full automation with Playwright and Puppeteer
- [Caching](https://docs.reader.dev/home/concepts/caching.md): Free cache hits, 24-hour default TTL, and how to opt out.
- [Credits and billing](https://docs.reader.dev/home/concepts/credits-and-billing.md): What each Reader operation costs and how to check your balance.
- [Errors](https://docs.reader.dev/home/concepts/errors.md): Eleven stable error codes. Branch on code, not on message.
- [Events](https://docs.reader.dev/home/concepts/events.md): Server-sent streams for real-time updates on one job or across your whole workspace.
- [Formats and extraction](https://docs.reader.dev/home/concepts/formats-and-extraction.md): What Reader returns, how to shape it, and how to trim boilerplate.
- [Proxy modes](https://docs.reader.dev/home/concepts/proxy-modes.md): Three ways Reader can fetch a page: standard, stealth, and auto. Pick one, or let Reader decide.
- [Rate limits and concurrency](https://docs.reader.dev/home/concepts/rate-limits.md): Two limits: requests per minute and concurrent async jobs. How Reader enforces them and how to back off.
- [The read primitive](https://docs.reader.dev/home/concepts/read-primitive.md): One endpoint, three shapes of input. Scrape, batch, and crawl all flow through POST /v1/read.
- [Scrape vs crawl](https://docs.reader.dev/home/concepts/scrape-vs-crawl.md): Scrape fetches URLs you already know. Crawl discovers URLs by following links. Both return the same page shape.
- [Webhooks](https://docs.reader.dev/home/concepts/webhooks.md): Push notifications when jobs finish. Signed, retried, and verifiable.
- [Auth walls and hostile sites](https://docs.reader.dev/home/guides/advanced/auth-walls.md): When to force stealth, what Reader can't do, and how to spot a site that won't work.
- [Batch scraping](https://docs.reader.dev/home/guides/advanced/batch-scraping.md): Submit many URLs in one call, handle results efficiently, and stay within your limits.
- [Choosing a proxy mode](https://docs.reader.dev/home/guides/advanced/choosing-a-proxy-mode.md): A practical decision guide: when to use standard, stealth, or auto for different sites.
- [Crawl + scrape in one call](https://docs.reader.dev/home/guides/advanced/crawl-with-scrape.md): Discover links and fetch full page content in a single job.
- [Crawling a website](https://docs.reader.dev/home/guides/advanced/crawling.md): Start with one URL, let Reader discover the rest.
- [Waiting for dynamic content](https://docs.reader.dev/home/guides/advanced/dynamic-content.md): Use waitForSelector when content is hydrated client-side after initial load.
- [Polling a job to completion](https://docs.reader.dev/home/guides/async-workflows/polling-jobs.md): The simplest async pattern: fetch status on an interval until terminal.
- [Polling vs SSE vs webhooks](https://docs.reader.dev/home/guides/async-workflows/polling-vs-sse-vs-webhooks.md): The same job can be watched three ways. Which one fits your use case?
- [Building a progress bar](https://docs.reader.dev/home/guides/async-workflows/progress-bars.md): Wire SSE events to a CLI or UI progress indicator.
- [Reliable batch processing](https://docs.reader.dev/home/guides/async-workflows/reliable-batch.md): Submit thousands of URLs, survive reconnects, and never lose a result.
- [Streaming jobs with SSE](https://docs.reader.dev/home/guides/async-workflows/sse-streaming.md): Real-time progress from a single open HTTP connection.
- [Verifying webhook signatures](https://docs.reader.dev/home/guides/async-workflows/verifying-webhooks.md): HMAC-SHA256 verification for Express, FastAPI, and Next.js.
- [Webhook retries and idempotency](https://docs.reader.dev/home/guides/async-workflows/webhook-retries.md): Reader retries failed deliveries. Your handler needs to be idempotent.
- [Webhook-driven workflows](https://docs.reader.dev/home/guides/async-workflows/webhook-workflows.md): Trigger downstream work when a job completes: database writes, emails, re-indexing.
- [Your first scrape](https://docs.reader.dev/home/guides/getting-started/first-scrape.md): Make a single-URL request and understand every field in the response.
- [Main content extraction](https://docs.reader.dev/home/guides/getting-started/main-content.md): How Reader strips boilerplate and when to turn it off.
- [Markdown and HTML together](https://docs.reader.dev/home/guides/getting-started/markdown-and-html.md): Request both formats in one call and pick what you need downstream.
- [Include and exclude selectors](https://docs.reader.dev/home/guides/getting-started/selectors.md): Precise control over what Reader keeps and drops using CSS selectors.
- [Reader as an agent tool](https://docs.reader.dev/home/guides/llm/agent-tool.md): Expose Reader via LLM tool calling so your agent can fetch pages on demand.
- [Feeding Reader output to an LLM](https://docs.reader.dev/home/guides/llm/llm-pipeline.md): From URL to model prompt: the minimal pipeline and what to watch out for.
- [RAG: scrape, chunk, embed](https://docs.reader.dev/home/guides/llm/rag.md): Build a retrieval-augmented pipeline from URL to vector database.
- [Structured data extraction](https://docs.reader.dev/home/guides/llm/structured-data.md): From markdown to JSON. Extract fields with an LLM on top of Reader.
- [Cost estimation](https://docs.reader.dev/home/guides/production/cost-estimation.md): Predict what a batch or crawl will cost before you run it.
- [Handling credit exhaustion](https://docs.reader.dev/home/guides/production/credit-exhaustion.md): Detect low balance before you run out and degrade gracefully when you do.
- [Monitoring Reader in production](https://docs.reader.dev/home/guides/production/monitoring.md): The metrics worth tracking, where to get them, and what to alert on.
- [Handling rate limits](https://docs.reader.dev/home/guides/production/rate-limits.md): Design for your tier's RPM, back off gracefully, and avoid the common footguns.
- [Retry and error handling](https://docs.reader.dev/home/guides/production/retry-error-handling.md): Which error codes to retry, how to back off, and when to give up.
- [Cache not hitting](https://docs.reader.dev/home/guides/troubleshooting/cache-miss.md): Why repeated requests for the same URL still cost credits, and how to fix it.
- [Cloudflare and bot walls](https://docs.reader.dev/home/guides/troubleshooting/cloudflare.md): When Reader gets through automatically, when to force stealth, and when a site is out of reach.
- [URL blocked errors](https://docs.reader.dev/home/guides/troubleshooting/ssrf-blocks.md): Why Reader refuses some URLs and what to do about it.
- [Stuck jobs](https://docs.reader.dev/home/guides/troubleshooting/stuck-jobs.md): What to do when a job sits in `processing` longer than you expect.
- [Introduction](https://docs.reader.dev/home/introduction.md): The web infrastructure platform for AI. Scrape, crawl, and automate the web from a single API.
- [Quickstart](https://docs.reader.dev/home/quickstart.md): Make your first Reader request in under a minute.
- [JavaScript SDK](https://docs.reader.dev/sdk/javascript.md): Install and use @vakra-dev/reader-js
- [SDKs](https://docs.reader.dev/sdk/overview.md): Official SDKs for the Reader API in JavaScript/TypeScript and Python.
- [Python SDK](https://docs.reader.dev/sdk/python.md): Install and use reader-py
- [Browser Session](https://docs.reader.dev/self-hosted/api-reference/browser-session.md): API reference for browser() method, BrowserOptions, and BrowserSession
- [crawl()](https://docs.reader.dev/self-hosted/api-reference/crawl.md): Discover and optionally scrape pages on a website via BFS.
- [CrawlOptions](https://docs.reader.dev/self-hosted/api-reference/crawl-options.md): Every field accepted by crawl() with type and default.
- [CrawlResult](https://docs.reader.dev/self-hosted/api-reference/crawl-result.md): Return type for crawl() - discovered URLs, optional scrape data, and metadata.
- [Errors](https://docs.reader.dev/self-hosted/api-reference/errors.md): Full reference of every error class thrown by Reader.
- [ReaderClient](https://docs.reader.dev/self-hosted/api-reference/reader-client.md): Constructor, options, methods, and lifecycle for the main Reader API.
- [scrape()](https://docs.reader.dev/self-hosted/api-reference/scrape.md): Scrape one or more URLs and return cleaned content.
- [ScrapeOptions](https://docs.reader.dev/self-hosted/api-reference/scrape-options.md): Every field accepted by scrape() with type and default.
- [ScrapeResult](https://docs.reader.dev/self-hosted/api-reference/scrape-result.md): Return type for scrape() - data array plus batch metadata.
- [Browser Pool](https://docs.reader.dev/self-hosted/concepts/browser-pool.md): How Reader manages long-lived browser instances with recycling and health checks.
- [Browser Sessions](https://docs.reader.dev/self-hosted/concepts/browser-sessions.md): Launch stealthed Chrome for Playwright/Puppeteer automation
- [Content Extraction](https://docs.reader.dev/self-hosted/concepts/content-extraction.md): How Reader identifies main content and strips noise from scraped pages.
- [Crawling](https://docs.reader.dev/self-hosted/concepts/crawling.md): How Reader discovers and optionally scrapes every page on a site.
- [Scraping Engine](https://docs.reader.dev/self-hosted/concepts/engine-waterfall.md): Hero browser engine with proxy escalation for reliable scraping.
- [Error Handling](https://docs.reader.dev/self-hosted/concepts/error-handling.md): Typed errors, proxy escalation, and what happens when a scrape fails.
- [Proxy Tiers](https://docs.reader.dev/self-hosted/concepts/proxy-tiers.md): Datacenter vs residential proxies, rotation, and automatic escalation.
- [Scraping](https://docs.reader.dev/self-hosted/concepts/scraping.md): How Reader turns a URL into clean, LLM-ready content.
- [Examples](https://docs.reader.dev/self-hosted/getting-started/examples.md): Runnable code examples for every feature of self-hosted Reader.
- [Installation](https://docs.reader.dev/self-hosted/getting-started/installation.md): Install @vakra-dev/reader and its system dependencies.
- [Introduction](https://docs.reader.dev/self-hosted/getting-started/introduction.md): Run the Reader scraping engine yourself - open-source Node library, CLI, and deployment scripts.
- [Quickstart](https://docs.reader.dev/self-hosted/getting-started/quickstart.md): Make your first scrape with self-hosted Reader in 60 seconds.
- [Basic Scraping](https://docs.reader.dev/self-hosted/guides/basic-scraping.md): Practical recipes for single-URL and small-batch scrapes.
- [Batch Scraping](https://docs.reader.dev/self-hosted/guides/batch-scraping.md): Scrape many URLs in parallel with concurrency, progress tracking, and partial failure handling.
- [Browser Sessions](https://docs.reader.dev/self-hosted/guides/browser-sessions.md): Full examples for Playwright, Puppeteer, and raw CDP browser automation
- [Using the CLI](https://docs.reader.dev/self-hosted/guides/cli.md): One-off scrapes and crawls from the terminal.
- [Daemon Mode](https://docs.reader.dev/self-hosted/guides/daemon-mode.md): Run Reader as a long-lived background process with a shared browser pool.
- [Deployment](https://docs.reader.dev/self-hosted/guides/deployment.md): Run Reader in production with Docker or bare metal.
- [Proxy Configuration](https://docs.reader.dev/self-hosted/guides/proxy-configuration.md): Set up single proxies, rotation pools, and multi-tier escalation.
- [Website Crawling](https://docs.reader.dev/self-hosted/guides/website-crawling.md): Discover and scrape every page on a site with BFS link discovery.

## OpenAPI Specs

- [openapi](https://docs.reader.dev/openapi.json)

## Optional

- [GitHub](https://github.com/vakra-dev/reader)
- [Discord](https://discord.gg/6tjkq7J5WV)