By the end of this page you’ll have scraped a real page with the self-hosted Reader library and seen clean markdown come back.

Install

npm install @vakra-dev/reader
Make sure you’re on Node 22.12.0 or later and, if you’re on Linux, that Chrome’s system dependencies are installed. See Installation for details.
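If you want your script to verify the Node requirement itself, here is a minimal sketch using only Node's built-in process object (nothing from the Reader library):

```javascript
// Startup guard: check that the running Node meets the 22.12.0 minimum.
const [major, minor] = process.versions.node.split(".").map(Number);
const ok = major > 22 || (major === 22 && minor >= 12);
if (!ok) {
  console.error(`Node 22.12.0+ required, found ${process.versions.node}`);
}
```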

Your first scrape

Create a file and paste this:
import { ReaderClient } from "@vakra-dev/reader";

const reader = new ReaderClient({ verbose: true });

try {
  const result = await reader.scrape({
    urls: ["https://example.com"],
    formats: ["markdown"],
  });

  const page = result.data[0];
  console.log(`Title:    ${page.metadata.website.title}`);
  console.log(`Engine:   ${page.metadata.engine}`);
  console.log(`Duration: ${page.metadata.duration}ms`);
  console.log();
  console.log(page.markdown);
} finally {
  await reader.close();
}
Run it:
node your-file.js
You’ll see Reader initialize, render the page in a headless browser, extract the main content, convert it to markdown, and print the result.

What just happened

  1. new ReaderClient(...) - creates a client. No browser is launched yet.
  2. reader.scrape(...) - on the first call, Reader spins up a browser pool behind the scenes and runs the scrape. The pool stays alive for subsequent calls.
  3. Hero engine - Reader renders the page in headless Chrome with JavaScript execution, TLS fingerprinting, and proxy routing. If the first attempt fails, it escalates to a residential proxy automatically.
  4. Content cleaning - by default, Reader extracts only the main content, strips ads and navigation, and converts to clean markdown.
  5. reader.close() - shuts down the browser pool. Optional - Reader also cleans up automatically on SIGTERM / SIGINT.
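If you'd rather wire up shutdown explicitly than rely on the automatic signal handling, one common pattern is an idempotent helper so close() runs exactly once whether the script exits normally or receives a signal. This is a sketch, not part of the library; the stub client below stands in for a ReaderClient, assuming only the close() method shown above:

```javascript
// Idempotent shutdown: repeated signals (or a signal after a normal exit
// path already closed the client) won't trigger a second close().
function makeShutdown(client) {
  let closed = false;
  return async function shutdown() {
    if (closed) return;   // already closed - make later calls no-ops
    closed = true;
    await client.close();
  };
}

// Wiring it up (stub client shown; pass your ReaderClient instance instead):
const shutdown = makeShutdown({ close: async () => {} });
process.on("SIGINT", shutdown);
process.on("SIGTERM", shutdown);
```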

Try something more interesting

Scrape multiple URLs in parallel with progress tracking:
import { ReaderClient } from "@vakra-dev/reader";

const reader = new ReaderClient({
  browserPool: { size: 3 },
});

const result = await reader.scrape({
  urls: [
    "https://example.com",
    "https://example.org",
    "https://example.net",
  ],
  formats: ["markdown"],
  batchConcurrency: 2,
  onProgress: ({ completed, total, currentUrl }) => {
    console.log(`[${completed}/${total}] ${currentUrl}`);
  },
});

console.log(`
  Success: ${result.batchMetadata.successfulUrls}/${result.batchMetadata.totalUrls}
  Total:   ${result.batchMetadata.totalDuration}ms
`);

await reader.close();
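batchConcurrency caps how many URLs are being scraped at any moment. Conceptually it behaves like a shared work queue drained by a fixed number of workers - a sketch of that pattern (illustrative only, not the library's actual internals):

```javascript
// Process `items` with at most `limit` calls to `fn` in flight at once.
async function mapWithConcurrency(items, limit, fn) {
  const results = new Array(items.length);
  let next = 0;
  async function worker() {
    while (next < items.length) {
      const i = next++;   // claim an index (synchronous, so no double-claims)
      results[i] = await fn(items[i]);
    }
  }
  // Spawn up to `limit` workers that share the single queue of indexes.
  const workers = Array.from({ length: Math.min(limit, items.length) }, worker);
  await Promise.all(workers);
  return results;
}

// Example: three "scrapes", at most two running at once;
// results come back in input order regardless of completion order.
mapWithConcurrency(["a", "b", "c"], 2, async (x) => x.toUpperCase())
  .then((out) => console.log(out));
```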

Where to go next

Examples

Crawling, proxy rotation, dynamic content, and more.

Concepts: Scraping Engine

How the Hero engine and proxy escalation work.

Guides: Batch Scraping

Tune concurrency, handle errors, track progress.

API Reference

Full type reference for every option.