For high-volume scraping, Reader manages a pool of browser instances with automatic recycling and health monitoring.

How It Works

Instead of launching a new browser for each request, Reader maintains a pool of browser instances that are reused across requests. This provides:
  • Faster responses - No browser startup overhead
  • Memory efficiency - Controlled resource usage
  • Reliability - Automatic health checks and recycling
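
The reuse idea above can be illustrated with a minimal sketch. It is not Reader's implementation: `SimplePool` and `PooledBrowser` are hypothetical names, and a plain object stands in for a real browser. It only shows why sequential requests do not pay the startup cost more than once.

```typescript
// Hypothetical stand-in for a real browser instance (not part of Reader's API).
interface PooledBrowser {
  id: number;
  pagesLoaded: number;
}

// Illustrative pool: hands out idle browsers first, launches only when needed.
class SimplePool {
  private idle: PooledBrowser[] = [];
  private launched = 0;
  private readonly size: number;

  constructor(size: number) {
    this.size = size;
  }

  // Returns an idle browser, launches a new one if under the size limit,
  // or returns undefined when every browser is busy.
  acquire(): PooledBrowser | undefined {
    const reused = this.idle.pop();
    if (reused) return reused;
    if (this.launched < this.size) {
      this.launched += 1;
      return { id: this.launched, pagesLoaded: 0 };
    }
    return undefined;
  }

  // Puts a browser back so the next request can reuse it.
  release(browser: PooledBrowser): void {
    this.idle.push(browser);
  }

  get launchCount(): number {
    return this.launched;
  }
}

const pool = new SimplePool(2);
for (let i = 0; i < 5; i++) {
  const browser = pool.acquire();
  if (!browser) throw new Error("pool exhausted");
  browser.pagesLoaded += 1; // simulate one page load
  pool.release(browser);
}
console.log(pool.launchCount); // 1: five sequential requests shared one browser
```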

Default Behavior

When you use ReaderClient, it creates and manages a browser pool automatically:
import { ReaderClient } from "@vakra-dev/reader";

const reader = new ReaderClient();

// Browser pool is created on first scrape
await reader.scrape({ urls: ["https://example.com"] });

// Pool stays warm for subsequent requests
await reader.scrape({ urls: ["https://example.org"] });

// Close when done
await reader.close();

Configuration

Configure the browser pool when creating ReaderClient:
const reader = new ReaderClient({
  browserPool: {
    size: 5,              // Number of browser instances
    retireAfterPages: 50, // Recycle after N page loads
    retireAfterMinutes: 15, // Recycle after N minutes
    maxQueueSize: 100,    // Max pending requests
  },
});

Pool Options

Option               Default   Description
size                 2         Number of browser instances in pool
retireAfterPages     100       Recycle browser after N page loads
retireAfterMinutes   30        Recycle browser after N minutes
maxQueueSize         100       Maximum pending requests in queue
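
One way to picture how partial configuration combines with these defaults is the sketch below. It assumes unspecified options simply fall back to the documented default values. `BrowserPoolOptions`, `DEFAULT_POOL_OPTIONS`, and `resolvePoolOptions` are hypothetical names used for illustration, not exports of the library.

```typescript
// Hypothetical option shape mirroring the documented pool options.
interface BrowserPoolOptions {
  size: number;
  retireAfterPages: number;
  retireAfterMinutes: number;
  maxQueueSize: number;
}

// Default values as documented for the pool options.
const DEFAULT_POOL_OPTIONS: BrowserPoolOptions = {
  size: 2,
  retireAfterPages: 100,
  retireAfterMinutes: 30,
  maxQueueSize: 100,
};

// Assumption: options the caller leaves out fall back to the defaults.
function resolvePoolOptions(
  overrides: Partial<BrowserPoolOptions> = {},
): BrowserPoolOptions {
  return { ...DEFAULT_POOL_OPTIONS, ...overrides };
}

const resolved = resolvePoolOptions({ size: 5, retireAfterPages: 50 });
console.log(resolved);
// { size: 5, retireAfterPages: 50, retireAfterMinutes: 30, maxQueueSize: 100 }
```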

How Recycling Works

Browsers are automatically retired and replaced when any of the following occurs:
  1. Page limit - The browser has loaded retireAfterPages pages
  2. Age limit - The browser has been running for retireAfterMinutes minutes
  3. Health check failure - The browser has become unresponsive
Recycling keeps leaked memory from accumulating and keeps performance consistent.
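
The three recycling conditions can be sketched as a single decision function. This is an illustration under stated assumptions, not the library's logic: `BrowserStats`, `RetirePolicy`, and `retirementReason` are hypothetical names, and the sketch assumes a browser is retired as soon as it reaches a limit (a `>=` comparison).

```typescript
// Hypothetical per-browser bookkeeping (not part of Reader's API).
interface BrowserStats {
  pagesLoaded: number;
  startedAtMs: number; // launch time, in milliseconds
  responsive: boolean; // result of the most recent health check
}

interface RetirePolicy {
  retireAfterPages: number;
  retireAfterMinutes: number;
}

type RetireReason = "page-limit" | "age-limit" | "unhealthy" | null;

// Returns why a browser should be recycled, or null if it can keep serving.
function retirementReason(
  stats: BrowserStats,
  policy: RetirePolicy,
  nowMs: number,
): RetireReason {
  if (!stats.responsive) return "unhealthy";
  if (stats.pagesLoaded >= policy.retireAfterPages) return "page-limit";
  const ageMinutes = (nowMs - stats.startedAtMs) / 60000;
  if (ageMinutes >= policy.retireAfterMinutes) return "age-limit";
  return null;
}

const policy: RetirePolicy = { retireAfterPages: 100, retireAfterMinutes: 30 };
const nowMs = 3600000; // pretend "now" is one hour after time zero

console.log(retirementReason({ pagesLoaded: 100, startedAtMs: nowMs, responsive: true }, policy, nowMs)); // "page-limit"
console.log(retirementReason({ pagesLoaded: 3, startedAtMs: 0, responsive: true }, policy, nowMs)); // "age-limit"
console.log(retirementReason({ pagesLoaded: 3, startedAtMs: nowMs, responsive: true }, policy, nowMs)); // null
```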

Queue Management

When all browsers are busy, requests are queued:
Browser 1: [Request A] ────────────────────────>
Browser 2: [Request B] ──────────>
                                 [Request C] ───>
Queue:     [Request D, Request E, ...]
If the queue exceeds maxQueueSize, new requests will fail with an error.
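
The overflow behavior can be pictured with a small bounded queue. This is a sketch only: `BoundedQueue` is a hypothetical class, and it assumes a request is rejected when adding it would push the queue past maxQueueSize. The error type and message Reader actually produces are not shown here.

```typescript
// Illustrative bounded FIFO queue (hypothetical, not Reader's internal queue).
class BoundedQueue<T> {
  private items: T[] = [];
  private readonly maxQueueSize: number;

  constructor(maxQueueSize: number) {
    this.maxQueueSize = maxQueueSize;
  }

  // Adds a pending request, or throws if the queue is already full.
  enqueue(item: T): void {
    if (this.items.length >= this.maxQueueSize) {
      throw new Error(`Queue full (${this.maxQueueSize} pending requests)`);
    }
    this.items.push(item);
  }

  // Removes and returns the oldest pending request, if any.
  dequeue(): T | undefined {
    return this.items.shift();
  }

  get pending(): number {
    return this.items.length;
  }
}

const queue = new BoundedQueue<string>(2);
queue.enqueue("Request D");
queue.enqueue("Request E");

let rejected = false;
try {
  queue.enqueue("Request F"); // queue already holds 2 requests
} catch {
  rejected = true;
}
console.log(rejected, queue.pending); // true 2
```

Callers that expect bursts can either raise maxQueueSize or catch the failure and retry after a delay.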

Daemon Mode

For CLI usage, run a daemon to keep the browser pool warm across multiple commands:
# Start daemon with custom pool size
npx reader start --pool-size 5

# All subsequent commands use the warm pool
npx reader scrape https://example.com
npx reader scrape https://example.org

# Check daemon status
npx reader status

# Stop daemon
npx reader stop

Advanced: Direct Pool Usage

For advanced use cases, you can use the browser pool directly:
import { BrowserPool } from "@vakra-dev/reader";

const pool = new BrowserPool({ size: 5 });
await pool.initialize();

// Use withBrowser for automatic acquire/release
const title = await pool.withBrowser(async (hero) => {
  await hero.goto("https://example.com");
  return await hero.document.title;
});

// Check pool health
const health = await pool.healthCheck();
console.log(`Pool healthy: ${health.healthy}`);

await pool.shutdown();
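
The automatic acquire/release that withBrowser provides is commonly built on a try/finally wrapper. The sketch below shows that general pattern with a hypothetical `withResource` helper and a plain object in place of a browser. It is an assumption about the technique, not a description of how BrowserPool implements it.

```typescript
// Generic acquire/use/release helper (hypothetical, for illustration only).
async function withResource<R, T>(
  acquire: () => Promise<R>,
  release: (resource: R) => Promise<void>,
  fn: (resource: R) => Promise<T>,
): Promise<T> {
  const resource = await acquire();
  try {
    return await fn(resource);
  } finally {
    // Runs whether fn resolved or threw, so the resource is never leaked.
    await release(resource);
  }
}

// Demonstration: the callback fails, yet release still runs before the error surfaces.
const events: string[] = [];
const demo = withResource(
  async () => {
    events.push("acquire");
    return { id: 1 };
  },
  async () => {
    events.push("release");
  },
  async (res) => {
    events.push(`use ${res.id}`);
    throw new Error("navigation failed");
  },
).catch((err: Error) => {
  events.push(`error: ${err.message}`);
});

demo.then(() => console.log(events.join(" -> ")));
// acquire -> use 1 -> release -> error: navigation failed
```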

Best Practices

For High Volume

const reader = new ReaderClient({
  browserPool: {
    size: 10,             // More browsers for concurrency
    retireAfterPages: 25, // Recycle more frequently
    retireAfterMinutes: 10,
  },
});

For Long-Running Services

const reader = new ReaderClient({
  browserPool: {
    size: 3,
    retireAfterPages: 100,
    retireAfterMinutes: 30,
    maxQueueSize: 200, // Larger queue for bursts
  },
});

For Memory-Constrained Environments

const reader = new ReaderClient({
  browserPool: {
    size: 1,              // Single browser
    retireAfterPages: 10, // Frequent recycling
    retireAfterMinutes: 5,
  },
});
