The per-job stream
The `stream()` method wraps the raw SSE protocol as an async generator. Events you'll see:
- `progress`: periodic status updates with completed/total
- `page`: one per page as it finishes (full scrape result with metadata)
- `error`: one per failed page
- `done`: final event; the stream closes
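As a sketch, a consumer loop over these events might look like the following. The exact event shape (`type`, `completed`, `total`, `result`, `url` fields) and how you obtain the generator are assumptions for illustration, not the SDK's exact API:

```javascript
// Sketch: consume a per-job event stream (any async iterable of events).
// Event field names here are assumptions; check your SDK's actual types.
async function consume(stream) {
  const pages = [];
  for await (const event of stream) {
    switch (event.type) {
      case "progress":
        console.log(`progress: ${event.completed}/${event.total}`);
        break;
      case "page":
        pages.push(event.result); // full scrape result with metadata
        break;
      case "error":
        console.error(`page failed: ${event.url}`);
        break;
      case "done":
        return pages; // final event; the stream closes after this
    }
  }
  return pages;
}
```

In practice you'd pass the SDK's async generator (e.g. something like `client.stream(jobId)`) straight into `consume()`.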
Raw SSE from fetch
If you want to consume the stream without the SDK, fetch the stream endpoint directly and parse the `text/event-stream` response body yourself.

When SSE beats polling
- Real-time UI. You’re showing the user a live progress bar; SSE gives you immediate updates without polling overhead.
- One long job. For a 500-page crawl that takes 10 minutes, SSE is one connection instead of ~300 poll requests (one every two seconds).
- Per-page results as they finish. The `page` event delivers each result in real time, useful for streaming-style pipelines.
When SSE doesn’t fit
- Offline / batch processing. If your worker goes down mid-stream, you lose progress visibility. Use webhooks.
- Many jobs in parallel. Opening 50 concurrent SSE streams means holding 50 long-lived TCP connections. Use the workspace-wide stream or webhooks instead.
- Serverless functions with short timeouts. SSE wants a long-lived connection; serverless wants short invocations. Webhooks are the better fit there.
Keep-alives
Reader sends a comment line (`: ping\n\n`) every 30 seconds to keep intermediate proxies from dropping idle connections. Most SSE parsers ignore comments by default; if yours doesn't, filter them out.
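If you're parsing raw SSE yourself (the no-SDK fetch approach above), skipping these comments is a few lines. Here's a minimal sketch of a frame parser that handles `event:`/`data:` fields and drops comment lines; a real consumer would also need to buffer partial frames across network chunks:

```javascript
// Minimal SSE frame parser: splits a text/event-stream buffer into
// { event, data } records and skips comment lines like ": ping".
function parseSSE(buffer) {
  const events = [];
  for (const frame of buffer.split("\n\n")) {
    let event = "message"; // SSE default when no event: field is present
    const data = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith(":")) continue; // keep-alive comment, ignore
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) data.push(line.slice(5).trim());
    }
    if (data.length) events.push({ event, data: data.join("\n") });
  }
  return events;
}
```

Feed it text decoded from the fetch response body's reader; frames with only a comment (like the keep-alive ping) produce no event at all.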
Next
- Progress bars: SSE driving a UI
- Polling vs SSE vs webhooks

