The basic request
A crawl starts at a seed URL, follows links up to `maxDepth` levels deep, and stops when it hits `maxPages`.
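A minimal sketch of what starting a crawl might look like. The endpoint path, auth header, and response shape here are assumptions for illustration, not Reader's documented API:

```ts
// Hypothetical sketch: start a crawl from a seed URL.
// The /crawl path, Bearer auth, and crawlId field are assumptions.
const res = await fetch("https://api.example.com/crawl", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.READER_API_KEY}`,
  },
  body: JSON.stringify({
    url: "https://docs.example.com", // seed URL
    maxDepth: 3,   // link-hops away from the seed
    maxPages: 200, // hard cap on total pages scraped
  }),
});
const { crawlId } = await res.json(); // assumed: an id you can poll for status
console.log("started crawl", crawlId);
```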
Depth vs pages
Both limits matter:

- `maxDepth`: how many link-hops away from the seed Reader will explore. `maxDepth: 1` means "seed + all pages linked directly from the seed"; `maxDepth: 2` includes pages linked from those, and so on.
- `maxPages`: total pages to scrape, regardless of depth. Acts as a safety cap.
`maxDepth: 3` with `maxPages: 200` is a reasonable starting point.
Same-host only
Crawls stay on the seed URL's host. A seed of `https://docs.example.com` follows links within `docs.example.com` but ignores links to `example.com`, `blog.example.com`, or anywhere else. If you need cross-host discovery, you'll have to seed each host separately.
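To illustrate the rule (this is not Reader's actual implementation, just the behavior it describes), a host comparison like this captures which links get followed:

```ts
// Illustration only: how a same-host filter behaves.
const seed = new URL("https://docs.example.com");

function sameHost(link: string): boolean {
  // Subdomains count as different hosts: blog.example.com !== docs.example.com
  return new URL(link, seed).host === seed.host;
}

sameHost("https://docs.example.com/guide"); // true  — followed
sameHost("https://example.com/pricing");    // false — ignored
sameHost("https://blog.example.com/post");  // false — ignored
```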
Watching progress
Crawls can take a while. Use SSE or polling to track progress: `total` changes as Reader discovers more pages, and the final number is only known once the crawl finishes.
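A polling sketch, assuming a hypothetical `GET /crawl/{id}` status endpoint that returns `status`, `completed`, and `total` fields; your actual endpoint and field names may differ:

```ts
// Hypothetical polling loop; endpoint and field names are assumptions.
async function waitForCrawl(crawlId: string): Promise<void> {
  while (true) {
    const res = await fetch(`https://api.example.com/crawl/${crawlId}`, {
      headers: { Authorization: `Bearer ${process.env.READER_API_KEY}` },
    });
    const { status, completed, total } = await res.json();
    // `total` keeps growing while discovery is still in progress.
    console.log(`${completed}/${total} pages (${status})`);
    if (status === "completed" || status === "failed") return;
    await new Promise((r) => setTimeout(r, 5000)); // poll every 5s
  }
}
```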
When to crawl vs batch-scrape
- Crawl when the site doesn’t expose a sitemap, or you want every reachable page from a starting point.
- Batch-scrape when you can get a URL list some other way (sitemap.xml, API, RSS). It's cheaper, faster, and gives you exact control over what gets fetched; see the sketch after this list.
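For the batch-scrape path, here's a sketch of pulling URLs from sitemap.xml and submitting them as an explicit list. The `POST /scrape/batch` endpoint is an assumption, and the regex parse is naive (a real XML parser is safer), but it shows the idea:

```ts
// Hypothetical sketch: batch-scrape from a sitemap instead of crawling.
const sitemap = await (await fetch("https://docs.example.com/sitemap.xml")).text();

// Naive <loc> extraction, good enough for illustration.
const urls = [...sitemap.matchAll(/<loc>(.*?)<\/loc>/g)].map((m) => m[1]);

await fetch("https://api.example.com/scrape/batch", { // assumed endpoint
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.READER_API_KEY}`,
  },
  body: JSON.stringify({ urls }), // exact control over what gets fetched
});
```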
Cost
Crawls bill a flat 1 credit per page discovered and scraped. A 500-page crawl = 500 credits.
Next
- Crawl with scrape: content extraction on every crawled page
- Scrape vs crawl

