# Reader

## Docs

- [Get credit balance](https://docs.reader.dev/api-reference/account/credits.md)
- [Usage history](https://docs.reader.dev/api-reference/account/history.md)
- [Authentication](https://docs.reader.dev/api-reference/authentication.md): Every Reader API request is authenticated with an API key passed in the `x-api-key` header.
- [Cancel a job](https://docs.reader.dev/api-reference/jobs/cancel.md)
- [Events stream (SSE)](https://docs.reader.dev/api-reference/jobs/events.md): Long-lived Server-Sent Events connection that emits real-time progress, completion, and failure events for every job in the workspace. Use this for dashboards and realtime UIs. For a single job, use `GET /v1/jobs/{id}/stream` instead.
- [Get a job](https://docs.reader.dev/api-reference/jobs/get.md)
- [Retry failed URLs](https://docs.reader.dev/api-reference/jobs/retry.md)
- [Per-job events stream (SSE)](https://docs.reader.dev/api-reference/jobs/stream.md): Server-Sent Events stream scoped to a single job. Emits `progress`, `page`, `error`, and `done` events as the job makes progress. Stream closes automatically when the job reaches a terminal state.
- [Scrape, batch, or crawl](https://docs.reader.dev/api-reference/read.md): Unified endpoint for all read operations. Pass `url` for a single scrape, `urls` for a batch, or `url` + `maxPages` to crawl a site.
- [CDP WebSocket Proxy](https://docs.reader.dev/api-reference/sessions/cdp.md): Connect Playwright/Puppeteer to a browser session via CDP.
- [Create Session](https://docs.reader.dev/api-reference/sessions/create.md): Create a browser session with a CDP WebSocket endpoint.
- [Get Session](https://docs.reader.dev/api-reference/sessions/get.md): Get the status of a browser session.
- [List Sessions](https://docs.reader.dev/api-reference/sessions/list.md): List active browser sessions for your workspace.
- [Stop Session](https://docs.reader.dev/api-reference/sessions/stop.md): Stop a browser session and get a billing summary.
- [Create a webhook](https://docs.reader.dev/api-reference/webhooks/create.md)
- [Delete a webhook](https://docs.reader.dev/api-reference/webhooks/delete.md)
- [List webhooks](https://docs.reader.dev/api-reference/webhooks/list.md)
- [Update a webhook](https://docs.reader.dev/api-reference/webhooks/update.md)
- [Async jobs](https://docs.reader.dev/home/concepts/async-jobs.md): Batch and crawl return jobs. Poll them, stream them, or let a webhook call you when they're done.
- [Browser Sessions](https://docs.reader.dev/home/concepts/browser-sessions.md): Launch stealthed browsers for full automation with Playwright and Puppeteer.
- [Caching](https://docs.reader.dev/home/concepts/caching.md): Free cache hits, 24-hour default TTL, and how to opt out.
- [Credits and billing](https://docs.reader.dev/home/concepts/credits-and-billing.md): What each Reader operation costs and how to check your balance.
- [Errors](https://docs.reader.dev/home/concepts/errors.md): Eleven stable error codes. Branch on code, not on message.
- [Events](https://docs.reader.dev/home/concepts/events.md): Server-sent streams for real-time updates on one job or across your whole workspace.
- [Formats and extraction](https://docs.reader.dev/home/concepts/formats-and-extraction.md): What Reader returns, how to shape it, and how to trim boilerplate.
- [Proxy modes](https://docs.reader.dev/home/concepts/proxy-modes.md): Three ways Reader can fetch a page: standard, stealth, and auto. Pick one, or let Reader decide.
- [Rate limits and concurrency](https://docs.reader.dev/home/concepts/rate-limits.md): Two limits: requests per minute and concurrent async jobs. How Reader enforces them and how to back off.
- [The read primitive](https://docs.reader.dev/home/concepts/read-primitive.md): One endpoint, three shapes of input. Scrape, batch, and crawl all flow through `POST /v1/read`.
- [Scrape vs crawl](https://docs.reader.dev/home/concepts/scrape-vs-crawl.md): Scrape fetches URLs you already know. Crawl discovers URLs by following links. Both return the same page shape.
- [Webhooks](https://docs.reader.dev/home/concepts/webhooks.md): Push notifications when jobs finish. Signed, retried, and verifiable.
- [Auth walls and hostile sites](https://docs.reader.dev/home/guides/advanced/auth-walls.md): When to force stealth, what Reader can't do, and how to spot a site that won't work.
- [Batch scraping](https://docs.reader.dev/home/guides/advanced/batch-scraping.md): Submit many URLs in one call, handle results efficiently, and stay within your limits.
- [Choosing a proxy mode](https://docs.reader.dev/home/guides/advanced/choosing-a-proxy-mode.md): A practical decision guide: when to use standard, stealth, or auto for different sites.
- [Crawl + scrape in one call](https://docs.reader.dev/home/guides/advanced/crawl-with-scrape.md): Discover links and fetch full page content in a single job.
- [Crawling a website](https://docs.reader.dev/home/guides/advanced/crawling.md): Start with one URL, let Reader discover the rest.
- [Waiting for dynamic content](https://docs.reader.dev/home/guides/advanced/dynamic-content.md): Use `waitForSelector` when content is hydrated client-side after initial load.
- [Polling a job to completion](https://docs.reader.dev/home/guides/async-workflows/polling-jobs.md): The simplest async pattern: fetch status on an interval until terminal.
- [Polling vs SSE vs webhooks](https://docs.reader.dev/home/guides/async-workflows/polling-vs-sse-vs-webhooks.md): The same job can be watched three ways. Which one fits your use case?
- [Building a progress bar](https://docs.reader.dev/home/guides/async-workflows/progress-bars.md): Wire SSE events to a CLI or UI progress indicator.
- [Reliable batch processing](https://docs.reader.dev/home/guides/async-workflows/reliable-batch.md): Submit thousands of URLs, survive reconnects, and never lose a result.
- [Streaming jobs with SSE](https://docs.reader.dev/home/guides/async-workflows/sse-streaming.md): Real-time progress from a single open HTTP connection.
- [Verifying webhook signatures](https://docs.reader.dev/home/guides/async-workflows/verifying-webhooks.md): HMAC-SHA256 verification for Express, FastAPI, and Next.js.
- [Webhook retries and idempotency](https://docs.reader.dev/home/guides/async-workflows/webhook-retries.md): Reader retries failed deliveries. Your handler needs to be idempotent.
- [Webhook-driven workflows](https://docs.reader.dev/home/guides/async-workflows/webhook-workflows.md): Trigger downstream work when a job completes: database writes, emails, re-indexing.
- [Your first scrape](https://docs.reader.dev/home/guides/getting-started/first-scrape.md): Make a single-URL request and understand every field in the response.
- [Main content extraction](https://docs.reader.dev/home/guides/getting-started/main-content.md): How Reader strips boilerplate and when to turn it off.
- [Markdown and HTML together](https://docs.reader.dev/home/guides/getting-started/markdown-and-html.md): Request both formats in one call and pick what you need downstream.
- [Include and exclude selectors](https://docs.reader.dev/home/guides/getting-started/selectors.md): Precise control over what Reader keeps and drops using CSS selectors.
- [Reader as an agent tool](https://docs.reader.dev/home/guides/llm/agent-tool.md): Expose Reader via LLM tool calling so your agent can fetch pages on demand.
- [Feeding Reader output to an LLM](https://docs.reader.dev/home/guides/llm/llm-pipeline.md): From URL to model prompt: the minimal pipeline and what to watch out for.
- [RAG: scrape, chunk, embed](https://docs.reader.dev/home/guides/llm/rag.md): Build a retrieval-augmented pipeline from URL to vector database.
- [Structured data extraction](https://docs.reader.dev/home/guides/llm/structured-data.md): From markdown to JSON. Extract fields with an LLM on top of Reader.
- [Cost estimation](https://docs.reader.dev/home/guides/production/cost-estimation.md): Predict what a batch or crawl will cost before you run it.
- [Handling credit exhaustion](https://docs.reader.dev/home/guides/production/credit-exhaustion.md): Detect low balance before you run out and degrade gracefully when you do.
- [Monitoring Reader in production](https://docs.reader.dev/home/guides/production/monitoring.md): The metrics worth tracking, where to get them, and what to alert on.
- [Handling rate limits](https://docs.reader.dev/home/guides/production/rate-limits.md): Design for your tier's RPM, back off gracefully, and avoid the common footguns.
- [Retry and error handling](https://docs.reader.dev/home/guides/production/retry-error-handling.md): Which error codes to retry, how to back off, and when to give up.
- [Cache not hitting](https://docs.reader.dev/home/guides/troubleshooting/cache-miss.md): Why repeated requests for the same URL still cost credits, and how to fix it.
- [Cloudflare and bot walls](https://docs.reader.dev/home/guides/troubleshooting/cloudflare.md): When Reader gets through automatically, when to force stealth, and when a site is out of reach.
- [URL blocked errors](https://docs.reader.dev/home/guides/troubleshooting/ssrf-blocks.md): Why Reader refuses some URLs and what to do about it.
- [Stuck jobs](https://docs.reader.dev/home/guides/troubleshooting/stuck-jobs.md): What to do when a job sits in `processing` longer than you expect.
- [Introduction](https://docs.reader.dev/home/introduction.md): The web infrastructure platform for AI. Scrape, crawl, and automate the web from a single API.
- [Quickstart](https://docs.reader.dev/home/quickstart.md): Make your first Reader request in under a minute.
- [JavaScript SDK](https://docs.reader.dev/sdk/javascript.md): Install and use `@vakra-dev/reader-js`.
- [SDKs](https://docs.reader.dev/sdk/overview.md): Official SDKs for the Reader API in JavaScript/TypeScript and Python.
- [Python SDK](https://docs.reader.dev/sdk/python.md): Install and use `reader-py`.
- [Browser Session](https://docs.reader.dev/self-hosted/api-reference/browser-session.md): API reference for the `browser()` method, `BrowserOptions`, and `BrowserSession`.
- [crawl()](https://docs.reader.dev/self-hosted/api-reference/crawl.md): Discover and optionally scrape pages on a website via BFS.
- [CrawlOptions](https://docs.reader.dev/self-hosted/api-reference/crawl-options.md): Every field accepted by `crawl()` with type and default.
- [CrawlResult](https://docs.reader.dev/self-hosted/api-reference/crawl-result.md): Return type for `crawl()`: discovered URLs, optional scrape data, and metadata.
- [Errors](https://docs.reader.dev/self-hosted/api-reference/errors.md): Full reference of every error class thrown by Reader.
- [ReaderClient](https://docs.reader.dev/self-hosted/api-reference/reader-client.md): Constructor, options, methods, and lifecycle for the main Reader API.
- [scrape()](https://docs.reader.dev/self-hosted/api-reference/scrape.md): Scrape one or more URLs and return cleaned content.
- [ScrapeOptions](https://docs.reader.dev/self-hosted/api-reference/scrape-options.md): Every field accepted by `scrape()` with type and default.
- [ScrapeResult](https://docs.reader.dev/self-hosted/api-reference/scrape-result.md): Return type for `scrape()`: data array plus batch metadata.
- [Browser Pool](https://docs.reader.dev/self-hosted/concepts/browser-pool.md): How Reader manages long-lived browser instances with recycling and health checks.
- [Browser Sessions](https://docs.reader.dev/self-hosted/concepts/browser-sessions.md): Launch stealthed Chrome for Playwright/Puppeteer automation.
- [Content Extraction](https://docs.reader.dev/self-hosted/concepts/content-extraction.md): How Reader identifies main content and strips noise from scraped pages.
- [Crawling](https://docs.reader.dev/self-hosted/concepts/crawling.md): How Reader discovers and optionally scrapes every page on a site.
- [Scraping Engine](https://docs.reader.dev/self-hosted/concepts/engine-waterfall.md): Hero browser engine with proxy escalation for reliable scraping.
- [Error Handling](https://docs.reader.dev/self-hosted/concepts/error-handling.md): Typed errors, proxy escalation, and what happens when a scrape fails.
- [Proxy Tiers](https://docs.reader.dev/self-hosted/concepts/proxy-tiers.md): Datacenter vs residential proxies, rotation, and automatic escalation.
- [Scraping](https://docs.reader.dev/self-hosted/concepts/scraping.md): How Reader turns a URL into clean, LLM-ready content.
- [Examples](https://docs.reader.dev/self-hosted/getting-started/examples.md): Runnable code examples for every feature of self-hosted Reader.
- [Installation](https://docs.reader.dev/self-hosted/getting-started/installation.md): Install `@vakra-dev/reader` and its system dependencies.
- [Introduction](https://docs.reader.dev/self-hosted/getting-started/introduction.md): Run the Reader scraping engine yourself: open-source Node library, CLI, and deployment scripts.
- [Quickstart](https://docs.reader.dev/self-hosted/getting-started/quickstart.md): Make your first scrape with self-hosted Reader in 60 seconds.
- [Basic Scraping](https://docs.reader.dev/self-hosted/guides/basic-scraping.md): Practical recipes for single-URL and small-batch scrapes.
- [Batch Scraping](https://docs.reader.dev/self-hosted/guides/batch-scraping.md): Scrape many URLs in parallel with concurrency, progress tracking, and partial failure handling.
- [Browser Sessions](https://docs.reader.dev/self-hosted/guides/browser-sessions.md): Full examples for Playwright, Puppeteer, and raw CDP browser automation.
- [Using the CLI](https://docs.reader.dev/self-hosted/guides/cli.md): One-off scrapes and crawls from the terminal.
- [Daemon Mode](https://docs.reader.dev/self-hosted/guides/daemon-mode.md): Run Reader as a long-lived background process with a shared browser pool.
- [Deployment](https://docs.reader.dev/self-hosted/guides/deployment.md): Run Reader in production with Docker or bare metal.
- [Proxy Configuration](https://docs.reader.dev/self-hosted/guides/proxy-configuration.md): Set up single proxies, rotation pools, and multi-tier escalation.
- [Website Crawling](https://docs.reader.dev/self-hosted/guides/website-crawling.md): Discover and scrape every page on a site with BFS link discovery.

## OpenAPI Specs

- [openapi](https://docs.reader.dev/openapi.json)

## Optional

- [GitHub](https://github.com/vakra-dev/reader)
- [Discord](https://discord.gg/6tjkq7J5WV)