Skip to main content
POST
/
v1
/
read
Scrape, batch, or crawl
curl --request POST \
  --url https://api.reader.dev/v1/read \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --data '
{
  "url": "https://example.com",
  "urls": [
    "<string>"
  ],
  "formats": [
    "markdown"
  ],
  "onlyMainContent": true,
  "includeTags": [
    "<string>"
  ],
  "excludeTags": [
    "<string>"
  ],
  "waitForSelector": "<string>",
  "timeoutMs": 60500,
  "proxyMode": "auto",
  "batchConcurrency": 10,
  "maxDepth": 5,
  "maxPages": 5000,
  "cache": true
}
'
{
  "success": true,
  "data": {
    "url": "https://example.com",
    "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.\n\n[More information...](https://www.iana.org/domains/example)",
    "metadata": {
      "title": "Example Domain",
      "description": null,
      "statusCode": 200,
      "duration": 487,
      "cached": false,
      "proxyMode": "standard",
      "proxyEscalated": false,
      "scrapedAt": "2026-04-04T12:00:00Z"
    }
  }
}
POST /v1/read is the unified content-extraction endpoint. It auto-detects the operation from the body:
  • Single url → synchronous scrape, returned immediately in the response
  • Multiple urls → async batch job
  • url + maxDepth or maxPages → async crawl job
See The read primitive for a narrative overview.

Proxy mode

Set proxyMode to "standard" (1 credit, fast), "stealth" (3 credits, bypasses bot walls), or "auto" (default, which tries standard first and escalates to stealth on block). The response metadata tells you which mode actually ran. See Proxy modes.

Idempotency

Pass an x-idempotency-key header to deduplicate retried POSTs. Reader caches the original response for 24 hours and returns it verbatim on any subsequent request with the same key.
curl -X POST https://api.reader.dev/v1/read \
  -H "x-api-key: $READER_KEY" \
  -H "x-idempotency-key: batch-2026-04-04-run-1" \
  -H "Content-Type: application/json" \
  -d '{ "urls": ["https://example.com/a", "https://example.com/b"] }'

Authorizations

x-api-key
string
header
required

Body

application/json
url
string<uri>
Example:

"https://example.com"

urls
string<uri>[]
Required array length: 1 - 1000 elements
formats
enum<string>[]

Content formats to include in the response.

Available options:
markdown,
html
Example:
["markdown"]
onlyMainContent
boolean

Strip navigation, footers, and boilerplate. Default: true.

includeTags
string[]
excludeTags
string[]
waitForSelector
string
timeoutMs
integer
Required range: 1000 <= x <= 120000
proxyMode
enum<string>

Proxy mode for the scrape. standard is fast and affordable; stealth bypasses aggressive bot walls; auto picks automatically and only escalates to stealth when a block is detected. Default: auto.

Available options:
standard,
stealth,
auto
Example:

"auto"

batchConcurrency
integer
Required range: 1 <= x <= 20
maxDepth
integer

Crawl depth (when crawling). Omit for single-URL scrape.

Required range: 1 <= x <= 10
maxPages
integer

Maximum pages to discover during crawl.

Required range: 1 <= x <= 10000
cache
boolean

Reuse cached content within TTL. Default: true.

webhook
object

Response

Sync scrape completed

success
enum<boolean>
required
Available options:
true
data
object
required