reader-py is the official Python SDK for the Reader API. It wraps the HTTP contract, parses responses into Pydantic models, polls async jobs to completion, raises typed exceptions, and retries transient failures.
Installation
Quick start
Async client
Every ReaderClient method has an awaitable equivalent on AsyncReaderClient.
Configuration
Scraping
Single URL, synchronous
Single-URL requests return immediately with ReadResult(kind="scrape", data=ScrapeResult).
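A minimal sketch of how a caller might branch on the kind field of a result. The ReadResult dataclass below is a stand-in defined only for this example (the SDK's real models are Pydantic classes), and the describe helper is hypothetical. The "job" variant covers batches and crawls.

```python
from dataclasses import dataclass
from typing import Any, Literal


# Stand-in for illustration only; not the SDK's real model.
@dataclass
class ReadResult:
    kind: Literal["scrape", "job"]
    data: Any


def describe(result: ReadResult) -> str:
    """Return a short label depending on which variant came back."""
    if result.kind == "scrape":
        # Single-URL call: data holds one scrape result.
        return "single scrape"
    if result.kind == "job":
        # Batch or crawl: data holds a finished job.
        return "finished job"
    raise ValueError(f"unexpected kind: {result.kind!r}")


print(describe(ReadResult(kind="scrape", data={"url": "https://example.com"})))
```

Checking kind before touching data keeps the single-URL and multi-URL code paths explicit.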
Multiple URLs (batch)
Passing urls creates an async job. The SDK auto-polls until the job terminates and returns ReadResult(kind="job", data=Job) with all results collected across pagination.
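The auto-polling behavior can be pictured roughly as follows. This is a sketch of the general pattern, not the SDK's actual implementation: fetch_page, the status names, and the "results"/"total" keys are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, List

# Hypothetical terminal statuses; the real API's status names may differ.
TERMINAL = {"completed", "failed", "cancelled"}


def poll_and_collect(
    fetch_page: Callable[[int], Dict],
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict]:
    """Poll until the job is terminal, then walk every page of results.

    fetch_page(skip) is a hypothetical callable returning a dict with
    "status", "results" (one page), and "total" keys.
    """
    # Phase 1: wait for a terminal status.
    while fetch_page(0)["status"] not in TERMINAL:
        sleep(interval)

    # Phase 2: collect results across pages, using the count so far as offset.
    results: List[Dict] = []
    while True:
        page = fetch_page(len(results))
        results.extend(page["results"])
        if not page["results"] or len(results) >= page["total"]:
            return results


def make_fake_job(items: List[Dict], page_size: int = 2, ready_after: int = 2):
    """Build an in-memory fetch_page that reports 'running' for a few calls."""
    calls = {"n": 0}

    def fetch_page(skip: int) -> Dict:
        calls["n"] += 1
        status = "completed" if calls["n"] > ready_after else "running"
        page = items[skip:skip + page_size] if status == "completed" else []
        return {"status": status, "results": page, "total": len(items)}

    return fetch_page


items = [{"url": f"https://example.com/{i}"} for i in range(5)]
collected = poll_and_collect(make_fake_job(items), sleep=lambda _: None)
print(len(collected))
```

Injecting sleep makes the loop easy to test without real delays.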
Crawl
Same shape as batch, but with max_depth or max_pages:
Proxy mode
Control how aggressively Reader bypasses bot walls with proxy_mode:
Job management
The SDK’s read() method auto-polls batches and crawls, so most callers never need to touch job APIs directly. When you do:
Streaming
For real-time progress updates on a job, use client.stream(job_id), a generator that yields StreamEvent instances as the job makes progress.
AsyncReaderClient.stream() returns an async generator. Use async for with it.
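To illustrate the difference between the two consumption styles, here is a self-contained sketch using stand-in generators. The StreamEvent fields (type, completed) and the fake_stream helpers are hypothetical. In real code the events would come from client.stream(job_id).

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Iterator, List


@dataclass
class StreamEvent:
    # Stand-in model; these field names are assumptions for the example.
    type: str
    completed: int


def fake_stream() -> Iterator[StreamEvent]:
    """Plays the role of a synchronous event stream."""
    for done in range(1, 4):
        yield StreamEvent(type="progress", completed=done)
    yield StreamEvent(type="done", completed=3)


async def fake_async_stream() -> AsyncIterator[StreamEvent]:
    """Plays the role of an asynchronous event stream."""
    for event in fake_stream():
        await asyncio.sleep(0)  # yield control, as real network I/O would
        yield event


def consume_sync() -> List[str]:
    # A plain generator is consumed with an ordinary for loop.
    return [f"{e.type}:{e.completed}" for e in fake_stream()]


async def consume_async() -> List[str]:
    # An async generator must be consumed with async for.
    seen: List[str] = []
    async for e in fake_async_stream():
        seen.append(f"{e.type}:{e.completed}")
    return seen


print(consume_sync())
print(asyncio.run(consume_async()))
```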
Credits
Error handling
Every error response from the API is parsed into a specific ReaderApiError subclass. Catch the specific class rather than checking HTTP status codes.
- code: one of 11 stable codes (e.g. "insufficient_credits", "rate_limited")
- http_status: the HTTP status code
- details: dict with error-specific fields
- docs_url: deep link to the error’s documentation
- request_id: the x-request-id header from the response, for support tickets
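One way such a mapping from error code to exception class could work is sketched below. Everything here is illustrative: the subclass names, the response body shape, and the error_from_response helper are assumptions, not the SDK's actual internals. Only the five attribute names come from the list above.

```python
from typing import Any, Dict, Mapping, Optional, Type


class ReaderApiError(Exception):
    """Stand-in base error carrying the fields listed above."""

    def __init__(
        self,
        code: str,
        http_status: int,
        message: str = "",
        details: Optional[Dict[str, Any]] = None,
        docs_url: Optional[str] = None,
        request_id: Optional[str] = None,
    ) -> None:
        super().__init__(message or code)
        self.code = code
        self.http_status = http_status
        self.details = details or {}
        self.docs_url = docs_url
        self.request_id = request_id


# Hypothetical subclass names, one per error code.
class InsufficientCreditsError(ReaderApiError):
    pass


class RateLimitedError(ReaderApiError):
    pass


ERROR_CLASSES: Dict[str, Type[ReaderApiError]] = {
    "insufficient_credits": InsufficientCreditsError,
    "rate_limited": RateLimitedError,
}


def error_from_response(
    status: int, body: Mapping[str, Any], headers: Mapping[str, str]
) -> ReaderApiError:
    """Pick the subclass for the body's code, falling back to the base class."""
    code = str(body.get("code", "unknown"))
    cls = ERROR_CLASSES.get(code, ReaderApiError)
    return cls(
        code=code,
        http_status=status,
        message=str(body.get("message", "")),
        details=body.get("details"),
        docs_url=body.get("docsUrl"),  # assumed camelCase key on the wire
        request_id=headers.get("x-request-id"),
    )


try:
    raise error_from_response(
        429, {"code": "rate_limited", "message": "slow down"}, {"x-request-id": "req_123"}
    )
except RateLimitedError as err:
    # Catching the specific class avoids comparing status codes by hand.
    print(err.code, err.http_status, err.request_id)
```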
Backwards compatibility
ReaderError is re-exported as an alias for ReaderApiError so code written against the 0.1 SDK continues to work. New code should use ReaderApiError directly.
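As a sketch of why the alias keeps old handlers working: an alias is a second name bound to the same class object, so an except clause written against either name catches the same exceptions. The class below is a stand-in for the example, not the SDK's definition.

```python
class ReaderApiError(Exception):
    """Stand-in base class for this example."""


# The alias points at the very same class object.
ReaderError = ReaderApiError

caught = None
try:
    raise ReaderApiError("boom")
except ReaderError as exc:  # legacy-style handler still matches
    caught = exc

print(ReaderError is ReaderApiError, type(caught).__name__)
```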
Automatic retries
The SDK retries these codes automatically with exponential backoff before raising: rate_limited (honors Retry-After), concurrency_limited, internal_error, upstream_unavailable, scrape_timeout. All other codes raise immediately.
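The retry policy described above might look roughly like this. It is a sketch of the general technique, not the SDK's code: the ApiError class, the max_retries and base_delay defaults, and the call_with_retries helper are assumptions, and jitter is left out to keep the example deterministic.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Retryable codes, taken from the list above.
RETRYABLE = {
    "rate_limited",
    "concurrency_limited",
    "internal_error",
    "upstream_unavailable",
    "scrape_timeout",
}


class ApiError(Exception):
    """Stand-in error with a code and an optional Retry-After value (seconds)."""

    def __init__(self, code: str, retry_after: Optional[float] = None) -> None:
        super().__init__(code)
        self.code = code
        self.retry_after = retry_after


def call_with_retries(
    fn: Callable[[], T],
    max_retries: int = 3,       # hypothetical default
    base_delay: float = 0.5,    # hypothetical default
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    attempt = 0
    while True:
        try:
            return fn()
        except ApiError as err:
            # Non-retryable codes, or exhausted budget: re-raise as is.
            if err.code not in RETRYABLE or attempt >= max_retries:
                raise
            if err.code == "rate_limited" and err.retry_after is not None:
                delay = err.retry_after  # honor the server's hint
            else:
                delay = base_delay * (2 ** attempt)  # 0.5, 1.0, 2.0, ...
            sleep(delay)
            attempt += 1


delays = []
calls = {"n": 0}


def flaky() -> str:
    """Fails twice with a retryable code, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ApiError("internal_error")
    return "ok"


print(call_with_retries(flaky, sleep=delays.append), delays)
```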
Webhooks per request
Every read() call can include an inline webhook config that fires on job lifecycle events, useful for fire-and-forget batches.
Types
All public types are re-exported from the package root. They are Pydantic BaseModel subclasses with snake_case field names. The SDK internally translates to and from the API’s camelCase.
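The snake_case to camelCase translation can be sketched with a pair of small helpers. This is a standard-library illustration of the idea, not how the SDK does it (a Pydantic-based SDK would more likely rely on field aliases). The function names are hypothetical.

```python
import re
from typing import Any, Callable


def to_camel(name: str) -> str:
    """max_depth -> maxDepth"""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def to_snake(name: str) -> str:
    """requestId -> request_id (runs of capitals are split letter by letter)"""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def convert_keys(value: Any, convert: Callable[[str], str]) -> Any:
    """Recursively rename dict keys, leaving values untouched."""
    if isinstance(value, dict):
        return {convert(k): convert_keys(v, convert) for k, v in value.items()}
    if isinstance(value, list):
        return [convert_keys(v, convert) for v in value]
    return value


# Outgoing request: Python-style names become wire-style names.
print(convert_keys({"max_depth": 2, "max_pages": 50}, to_camel))
# Incoming response: wire-style names become Python-style names.
print(convert_keys({"httpStatus": 429, "requestId": "req_1"}, to_snake))
```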
Next
- JavaScript SDK: same features, TypeScript API
- Errors: full error code catalog
- Proxy modes: when to override the default
- API reference: the raw HTTP contract

