Browser Sessions Guide
This guide shows how to use Reader’s browser() primitive for full browser automation with anti-bot stealth.
Playwright
import { ReaderClient } from "@vakra-dev/reader";
import { chromium } from "playwright-core";
const reader = new ReaderClient();
const session = await reader.browser();
const browser = await chromium.connectOverCDP(session.wsEndpoint);
const context = await browser.newContext();
const page = await context.newPage();
// Navigate and extract data
await page.goto("https://news.ycombinator.com/");
const stories = await page.evaluate(() =>
  Array.from(document.querySelectorAll(".athing")).slice(0, 5).map((row) => ({
    title: row.querySelector(".titleline > a")?.textContent,
  }))
);
console.log(stories);
// Capture a full-page screenshot
await page.screenshot({ path: "page.png", fullPage: true });
// Cleanup
await browser.close();
await session.close();
await reader.close();
Install: npm install playwright-core
Puppeteer
import { ReaderClient } from "@vakra-dev/reader";
import { connect } from "puppeteer-core";
const reader = new ReaderClient();
const session = await reader.browser();
const browser = await connect({
  browserWSEndpoint: session.wsEndpoint,
  defaultViewport: null,
});
const page = await browser.newPage();
await page.goto("https://example.com");
console.log(await page.title());
await browser.close();
await session.close();
await reader.close();
Install: npm install puppeteer-core
Raw CDP (Any Language)
For tools that speak the Chrome DevTools Protocol directly:
import { ReaderClient } from "@vakra-dev/reader";
import WebSocket from "ws";
const reader = new ReaderClient();
const session = await reader.browser();
const ws = new WebSocket(session.wsEndpoint);
await new Promise((resolve) => ws.on("open", resolve));
// Send a CDP command and settle with the reply that carries the same id
let cmdId = 0;
function send(method, params = {}, sessionId) {
  const id = ++cmdId;
  return new Promise((resolve, reject) => {
    const handler = (data) => {
      const msg = JSON.parse(data.toString());
      if (msg.id !== id) return; // an event, or a reply to another command
      ws.off("message", handler);
      if (msg.error) reject(new Error(msg.error.message));
      else resolve(msg.result);
    };
    ws.on("message", handler);
    ws.send(JSON.stringify({ id, method, params, ...(sessionId && { sessionId }) }));
  });
}
// Create a page and navigate
const target = await send("Target.createTarget", { url: "about:blank" });
const attached = await send("Target.attachToTarget", {
  targetId: target.targetId,
  flatten: true,
});
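await send("Page.enable", {}, attached.sessionId);
// Start listening for the load event before navigating so it cannot be missed
const loaded = new Promise((resolve) => {
  const onEvent = (data) => {
    const msg = JSON.parse(data.toString());
    if (msg.method === "Page.loadEventFired" && msg.sessionId === attached.sessionId) {
      ws.off("message", onEvent);
      resolve();
    }
  };
  ws.on("message", onEvent);
});
await send("Page.navigate", { url: "https://example.com" }, attached.sessionId);
await loaded;
// The page has loaded, so read its title
const title = await send("Runtime.evaluate", {
  expression: "document.title",
}, attached.sessionId);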
console.log(title.result.value);
ws.close();
await session.close();
await reader.close();
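The send helper above registers one message listener per in-flight command. For longer scripts, a single listener that dispatches replies by id and events by method name may be easier to manage. The following is a minimal sketch under that assumption; createCdpClient and its once() method are hypothetical names, not part of Reader or the ws package, and the socket only needs on("message", fn) and send(text).

```javascript
// Hypothetical helper: one message listener routes replies and events.
function createCdpClient(socket) {
  let nextId = 0;
  const pending = new Map(); // command id -> { resolve, reject }
  const waiters = [];        // one-shot event waiters

  socket.on("message", (data) => {
    const msg = JSON.parse(data.toString());
    if (msg.id !== undefined) {
      // A reply: settle the command that was sent with this id
      const entry = pending.get(msg.id);
      if (!entry) return;
      pending.delete(msg.id);
      if (msg.error) entry.reject(new Error(msg.error.message));
      else entry.resolve(msg.result);
      return;
    }
    // An event: it has a method and no id
    for (let i = waiters.length - 1; i >= 0; i--) {
      const w = waiters[i];
      const sameSession = w.sessionId === undefined || w.sessionId === msg.sessionId;
      if (w.method === msg.method && sameSession) {
        waiters.splice(i, 1);
        w.resolve(msg.params);
      }
    }
  });

  return {
    // Send a command; resolves with its result or rejects with the CDP error
    send(method, params = {}, sessionId) {
      const id = ++nextId;
      return new Promise((resolve, reject) => {
        pending.set(id, { resolve, reject });
        socket.send(JSON.stringify({ id, method, params, ...(sessionId && { sessionId }) }));
      });
    },
    // Resolve with the params of the next matching event
    once(method, sessionId) {
      return new Promise((resolve) => waiters.push({ method, sessionId, resolve }));
    },
  };
}
```

With this shape, waiting for a page load would mean calling once("Page.loadEventFired", sessionId) before sending Page.navigate and awaiting it afterwards.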
Interactive Actions
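Type into a search box, wait for the instant-search results, and extract them. The snippet reuses reader and chromium from the Playwright example above: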
const session = await reader.browser();
const browser = await chromium.connectOverCDP(session.wsEndpoint);
const page = await (await browser.newContext()).newPage();
// Navigate
await page.goto("https://hn.algolia.com/");
// Type a search query
await page.locator('input[type="search"]').pressSequentially("web scraping", { delay: 50 });
await page.waitForTimeout(3000); // Wait for instant search
// Extract results
const results = await page.evaluate(() =>
  Array.from(document.querySelectorAll(".Story")).slice(0, 5).map((el) => ({
    title: el.querySelector(".Story_title a")?.textContent?.trim(),
  }))
);
console.log(results);
// Cleanup
await browser.close();
await session.close();
CLI
# Create a session (standalone mode)
reader browser create --standalone --show-chrome
# With daemon running
reader browser create
reader browser list
reader browser stop <sessionId>
The create command prints a JSON object with sessionId and wsEndpoint, then blocks until Ctrl+C or timeout.
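Since create prints a JSON object with sessionId and wsEndpoint, another process can read that output and connect to the session. The sketch below shows one way to extract the two fields from captured output. It assumes the object is the only brace-delimited span in the text; the exact output layout is not specified here, and parseSessionInfo is a hypothetical helper name.

```javascript
// Hypothetical helper: pull { sessionId, wsEndpoint } out of captured CLI output.
// Assumes the JSON object spans from the first "{" to the last "}" in the text.
function parseSessionInfo(output) {
  const start = output.indexOf("{");
  const end = output.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("no JSON object found in output");
  }
  const info = JSON.parse(output.slice(start, end + 1));
  // Both fields are required to connect to and later stop the session
  for (const key of ["sessionId", "wsEndpoint"]) {
    if (typeof info[key] !== "string" || info[key] === "") {
      throw new Error(`missing field: ${key}`);
    }
  }
  return { sessionId: info.sessionId, wsEndpoint: info.wsEndpoint };
}
```

The returned wsEndpoint can then be passed to chromium.connectOverCDP() or Puppeteer's connect() as in the examples above, and the sessionId to reader browser stop.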
With Proxy
const session = await reader.browser({
  proxy: { url: "http://user:pass@proxy:8080" },
});
Or select a tier from a proxy pool configured on the client:
const reader = new ReaderClient({
  proxyPools: {
    datacenter: [{ url: "http://dc-proxy:8080" }],
  },
});
const session = await reader.browser({ proxyTier: "datacenter" });
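Credentials embedded in a proxy URL, as in the first example, need percent-encoding when they contain characters such as @ or :. The sketch below builds such a URL with the standard URL API. buildProxyUrl and its parameter names are hypothetical and not part of Reader.

```javascript
// Hypothetical helper: build a proxy URL with safely encoded credentials.
function buildProxyUrl({ host, port, username, password, protocol = "http" }) {
  const url = new URL(`${protocol}://${host}:${port}`);
  // The URL setters percent-encode reserved characters in credentials
  if (username) url.username = username;
  if (password) url.password = password;
  // Drop the trailing "/" that URL adds for an empty path
  return url.toString().replace(/\/$/, "");
}
```

The result can be used as the proxy url value, for example reader.browser({ proxy: { url: buildProxyUrl({ host: "proxy", port: 8080, username: "user", password: "p@ss" }) } }), where the @ in the password is sent as %40.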
Notes
- Each session uses ~300MB memory (separate Chrome process)
- Default timeout: 5 minutes (timeoutMs: 300000)
- Sessions are isolated from the scrape/crawl browser pool
- Always call session.close() when done to release resources
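Because each session holds its own Chrome process, it helps to make sure session.close() runs even when the automation code throws. A minimal sketch of that pattern with try/finally follows. withSession is a hypothetical wrapper, not a Reader API, and it relies only on reader.browser() and session.close() as used in this guide.

```javascript
// Hypothetical helper: run `work` with a session and always close it afterwards.
// `reader` is a ReaderClient; `options` is passed straight to reader.browser().
async function withSession(reader, options, work) {
  const session = await reader.browser(options);
  try {
    return await work(session);
  } finally {
    // Runs on success and on error, so the Chrome process is released either way
    await session.close();
  }
}
```

It could be called as withSession(reader, { proxyTier: "datacenter" }, async (session) => { ... }), with the Playwright or Puppeteer connection made inside the callback from session.wsEndpoint.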