Browser Pool

For high-volume scraping, Reader manages a pool of browser instances with automatic recycling and health monitoring.

How It Works

Instead of launching a new browser for each request, Reader maintains a pool of browser instances that are reused across requests. This provides:

Faster responses - No browser startup overhead on each request
Memory efficiency - The pool size caps how many browsers run at once
Reliability - Automatic health checks and recycling
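
The reuse idea can be illustrated with a small self-contained sketch. It is not Reader's implementation: `FakeBrowser` and `TinyPool` are hypothetical stand-ins that only count how often a "browser" gets launched when requests share a pool.

```typescript
// Illustrative stand-in for a real browser; not Reader's actual class.
class FakeBrowser {
  static launches = 0;
  pagesLoaded = 0;

  constructor() {
    FakeBrowser.launches += 1; // count each (expensive) startup
  }

  load(url: string): string {
    this.pagesLoaded += 1;
    return `content of ${url}`;
  }
}

// Hypothetical minimal pool: hand out an idle instance, launch only if none is idle.
class TinyPool {
  private idle: FakeBrowser[] = [];

  acquire(): FakeBrowser {
    return this.idle.pop() ?? new FakeBrowser();
  }

  release(browser: FakeBrowser): void {
    this.idle.push(browser);
  }
}

const tinyPool = new TinyPool();
const sampleUrls = ["https://example.com", "https://example.org", "https://example.net"];

for (const url of sampleUrls) {
  const browser = tinyPool.acquire();
  browser.load(url);
  tinyPool.release(browser);
}

console.log(FakeBrowser.launches); // 1: three sequential requests shared one instance
```

Without the pool, the same loop would construct three instances, which is the startup overhead the pool avoids.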

Default Behavior

ReaderClient creates and manages a browser pool automatically:

import { ReaderClient } from "@vakra-dev/reader";

const reader = new ReaderClient();

// Browser pool is created on first scrape
await reader.scrape({ urls: ["https://example.com"] });

// Pool stays warm for subsequent requests
await reader.scrape({ urls: ["https://example.org"] });

// Close when done
await reader.close();

Configuration

Configure the browser pool when creating ReaderClient:

const reader = new ReaderClient({
  browserPool: {
    size: 5,              // Number of browser instances
    retireAfterPages: 50, // Recycle after N page loads
    retireAfterMinutes: 15, // Recycle after N minutes
    maxQueueSize: 100,    // Max pending requests
  },
});

Pool Options

Option	Default	Description
`size`	`2`	Number of browser instances in pool
`retireAfterPages`	`100`	Recycle browser after N page loads
`retireAfterMinutes`	`30`	Recycle browser after N minutes
`maxQueueSize`	`100`	Maximum pending requests in queue
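
One way to picture how partial configuration combines with these defaults is the sketch below. The `BrowserPoolOptions` interface and `resolvePoolOptions` helper are assumptions made for illustration, not names exported by the library; only the option names and default values come from the table.

```typescript
// Hypothetical type mirroring the options table.
interface BrowserPoolOptions {
  size: number;
  retireAfterPages: number;
  retireAfterMinutes: number;
  maxQueueSize: number;
}

// Default values copied from the table above.
const DEFAULT_POOL_OPTIONS: BrowserPoolOptions = {
  size: 2,
  retireAfterPages: 100,
  retireAfterMinutes: 30,
  maxQueueSize: 100,
};

// Hypothetical helper: any option the caller omits falls back to its default.
function resolvePoolOptions(
  overrides: Partial<BrowserPoolOptions> = {},
): BrowserPoolOptions {
  return { ...DEFAULT_POOL_OPTIONS, ...overrides };
}

const resolvedOptions = resolvePoolOptions({ size: 5, retireAfterPages: 50 });
// size and retireAfterPages come from the overrides;
// retireAfterMinutes and maxQueueSize keep their defaults.
console.log(resolvedOptions);
```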

How Recycling Works

A browser is automatically retired and replaced when any of the following happens:

Page limit - The browser has loaded retireAfterPages pages
Age limit - The browser has been running for retireAfterMinutes minutes
Health check failure - The browser has become unresponsive

Regular recycling keeps slow memory leaks from accumulating in long-lived browsers and keeps performance consistent.
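
The three retirement conditions can be sketched as a single decision function. This is an assumption about the shape of the check, not the library's code: the `BrowserStats` and `RetirePolicy` types, the `retireReason` name, and the use of `>=` at the boundaries are all hypothetical.

```typescript
// Hypothetical per-browser bookkeeping.
interface BrowserStats {
  pagesLoaded: number;
  launchedAtMs: number; // launch time in epoch milliseconds
  responsive: boolean; // outcome of the most recent health check
}

interface RetirePolicy {
  retireAfterPages: number;
  retireAfterMinutes: number;
}

type RetireReason = "page-limit" | "age-limit" | "unhealthy" | null;

// Returns why a browser should be retired, or null if it can keep serving.
function retireReason(
  stats: BrowserStats,
  policy: RetirePolicy,
  nowMs: number,
): RetireReason {
  if (!stats.responsive) return "unhealthy";
  if (stats.pagesLoaded >= policy.retireAfterPages) return "page-limit";
  const ageMinutes = (nowMs - stats.launchedAtMs) / 60000;
  if (ageMinutes >= policy.retireAfterMinutes) return "age-limit";
  return null;
}

const defaultPolicy: RetirePolicy = { retireAfterPages: 100, retireAfterMinutes: 30 };
const minute = 60000;

// Launched at t=0 in every case below.
console.log(retireReason({ pagesLoaded: 100, launchedAtMs: 0, responsive: true }, defaultPolicy, 1 * minute)); // page-limit
console.log(retireReason({ pagesLoaded: 3, launchedAtMs: 0, responsive: true }, defaultPolicy, 31 * minute)); // age-limit
console.log(retireReason({ pagesLoaded: 3, launchedAtMs: 0, responsive: false }, defaultPolicy, 1 * minute)); // unhealthy
console.log(retireReason({ pagesLoaded: 3, launchedAtMs: 0, responsive: true }, defaultPolicy, 5 * minute)); // null
```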

Queue Management

When all browsers are busy, requests are queued:

Browser 1: [Request A] ────────────────────────>
Browser 2: [Request B] ──────────>
                                 [Request C] ───>
Queue:     [Request D, Request E, ...]

If the queue is already full (maxQueueSize pending requests), new requests fail with an error instead of waiting.
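
The queueing behavior in the diagram can be modeled with a small synchronous sketch. `BoundedScheduler` is a hypothetical class written for this illustration; the real pool's error type, message, and exact boundary behavior are not specified here, so a plain `Error` is used.

```typescript
// Hypothetical model: `size` browser slots plus a waiting line capped at `maxQueueSize`.
class BoundedScheduler {
  private size: number;
  private maxQueueSize: number;
  private running = 0;
  private waiting: string[] = [];

  constructor(size: number, maxQueueSize: number) {
    this.size = size;
    this.maxQueueSize = maxQueueSize;
  }

  // Start the job if a slot is free, queue it if there is room, otherwise fail.
  submit(job: string): "running" | "queued" {
    if (this.running < this.size) {
      this.running += 1;
      return "running";
    }
    if (this.waiting.length >= this.maxQueueSize) {
      throw new Error(`queue is full, rejected ${job}`);
    }
    this.waiting.push(job);
    return "queued";
  }

  // A browser finished its job; the oldest queued job (if any) takes over the slot.
  finish(): string | undefined {
    const next = this.waiting.shift();
    if (next === undefined) this.running -= 1;
    return next;
  }
}

const scheduler = new BoundedScheduler(2, 2);
const outcomes = ["A", "B", "C", "D"].map((job) => scheduler.submit(job));
console.log(outcomes.join(", ")); // running, running, queued, queued

let rejectedMessage = "";
try {
  scheduler.submit("E"); // both slots busy and the queue already holds 2 jobs
} catch (err) {
  rejectedMessage = err instanceof Error ? err.message : String(err);
}
console.log(rejectedMessage); // queue is full, rejected E

const promoted = scheduler.finish();
console.log(promoted); // C
```

In application code, the equivalent of the `try`/`catch` above is where a caller would back off or retry once the queue has drained.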

Daemon Mode

For CLI usage, run a daemon to keep the browser pool warm across multiple commands:

# Start daemon with custom pool size
npx reader start --pool-size 5

# All subsequent commands use the warm pool
npx reader scrape https://example.com
npx reader scrape https://example.org

# Check daemon status
npx reader status

# Stop daemon
npx reader stop

Advanced: Direct Pool Usage

For advanced use cases, you can use the browser pool directly:

import { BrowserPool } from "@vakra-dev/reader";

const pool = new BrowserPool({ size: 5 });
await pool.initialize();

// Use withBrowser for automatic acquire/release
const title = await pool.withBrowser(async (hero) => {
  await hero.goto("https://example.com");
  return await hero.document.title;
});

// Check pool health
const health = await pool.healthCheck();
console.log(`Pool healthy: ${health.healthy}`);

await pool.shutdown();
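
The "automatic acquire/release" behavior of `withBrowser` is typically built on `try`/`finally`. The sketch below shows that pattern with a hypothetical `SketchPool`; it is not the library's `BrowserPool`, it omits queueing and recycling, and `PooledBrowser` is a placeholder type.

```typescript
// Placeholder for a real browser handle.
interface PooledBrowser {
  id: number;
}

// Hypothetical pool that only demonstrates release-on-exit semantics.
class SketchPool {
  private idle: PooledBrowser[];

  constructor(size: number) {
    this.idle = Array.from({ length: size }, (_, i) => ({ id: i + 1 }));
  }

  get idleCount(): number {
    return this.idle.length;
  }

  async withBrowser<T>(task: (browser: PooledBrowser) => Promise<T>): Promise<T> {
    const browser = this.idle.pop();
    if (browser === undefined) {
      throw new Error("no idle browser (queueing is omitted in this sketch)");
    }
    try {
      return await task(browser);
    } finally {
      this.idle.push(browser); // runs whether the task resolved or threw
    }
  }
}

async function demoWithBrowser(): Promise<number[]> {
  const sketchPool = new SketchPool(1);
  const idleCounts: number[] = [];

  // While the task runs, the only browser is checked out.
  await sketchPool.withBrowser(async () => {
    idleCounts.push(sketchPool.idleCount); // 0
  });

  // A failing task still returns its browser to the pool.
  await sketchPool
    .withBrowser(async () => {
      throw new Error("navigation failed");
    })
    .catch(() => undefined);
  idleCounts.push(sketchPool.idleCount); // 1

  return idleCounts;
}

demoWithBrowser().then((idleCounts) => console.log(idleCounts.join(", "))); // 0, 1
```

Releasing in `finally` is what makes the callback style safer than separate acquire and release calls: an exception inside the task cannot leak a browser.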

Best Practices

For High Volume

const reader = new ReaderClient({
  browserPool: {
    size: 10,             // More browsers for concurrency
    retireAfterPages: 25, // Recycle more frequently
    retireAfterMinutes: 10,
  },
});

For Long-Running Services

const reader = new ReaderClient({
  browserPool: {
    size: 3,
    retireAfterPages: 100,
    retireAfterMinutes: 30,
    maxQueueSize: 200, // Larger queue for bursts
  },
});

For Memory-Constrained Environments

const reader = new ReaderClient({
  browserPool: {
    size: 1,              // Single browser
    retireAfterPages: 10, // Frequent recycling
    retireAfterMinutes: 5,
  },
});

Next Steps

Batch Scraping

Deployment
