puppetRouter is a centralized Puppeteer job-routing service. It manages a pool of headless and headful browser nodes — home machines, permanent droplets, and ephemeral workers — and routes each job to the least-loaded available node.
Built for long-running automotive inventory scrapers that can't afford downtime or IP bans.
Each job goes to the node with the most remaining capacity. Home machines are always preferred over DigitalOcean (DO) workers.
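A minimal sketch of that selection rule. The node shape, field names, and two-tier scheme here are assumptions for illustration, not the actual puppetRouter schema:

```typescript
interface NodeInfo {
  id: string;
  kind: "home" | "do"; // home machines vs DigitalOcean workers
  capacity: number;    // total session slots
  active: number;      // sessions currently running
}

function pickNode(nodes: NodeInfo[]): NodeInfo | undefined {
  const available = nodes.filter((n) => n.capacity - n.active > 0);
  // Home machines always beat DO workers; within a tier,
  // prefer the node with the most free slots.
  available.sort((a, b) => {
    if (a.kind !== b.kind) return a.kind === "home" ? -1 : 1;
    return (b.capacity - b.active) - (a.capacity - a.active);
  });
  return available[0]; // undefined when every node is saturated
}
```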
When the headless pool hits 85% saturation, ephemeral DigitalOcean droplets spin up automatically and destroy themselves when idle.
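The 85% trigger can be reduced to a pure saturation check. The threshold comes from the text; the function names are illustrative:

```typescript
const SCALE_UP_THRESHOLD = 0.85;

// Fraction of headless slots currently occupied across the pool.
function headlessSaturation(active: number, capacity: number): number {
  return capacity === 0 ? 1 : active / capacity;
}

// True when the autoscaler should provision an ephemeral droplet.
function shouldSpinUpDroplet(active: number, capacity: number): boolean {
  return headlessSaturation(active, capacity) >= SCALE_UP_THRESHOLD;
}
```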
Home Windows nodes always maintain at least one headful slot — scraping continues in the background even while the machine is in use.
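One way the reserved-slot guarantee could look in code. This is purely a sketch: the idea that capacity is reduced while a person is at the keyboard, and the halving rule, are both invented for illustration; only the "never below one slot" floor comes from the text:

```typescript
// Advertised headful capacity for a home Windows node. Even when the
// machine reports interactive use, at least one slot stays available
// so scraping can continue in the background.
function advertisedHeadfulSlots(configured: number, userActive: boolean): number {
  // Halving under interactive use is a made-up policy for this sketch.
  const reduced = userActive ? Math.floor(configured / 2) : configured;
  return Math.max(1, reduced);
}
```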
Each completed job's exit IP is compared against the expected Spectrum static IP. Any node scraping from a DO IP is flagged immediately.
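The integrity check itself is a straight comparison. The job shape and the flagging callback here are assumptions; only the compare-against-expected-IP behavior comes from the text:

```typescript
interface CompletedJob {
  nodeId: string;
  exitIp: string;
}

// Returns true when the job exited through the expected ISP address.
// Otherwise invokes `flag` so the offending node can be surfaced.
function checkExitIp(
  job: CompletedJob,
  expectedSpectrumIp: string,
  flag: (nodeId: string, offendingIp: string) => void,
): boolean {
  if (job.exitIp !== expectedSpectrumIp) {
    // Exit IP does not match the home ISP address: the node is
    // scraping from somewhere unexpected (e.g. a DO IP).
    flag(job.nodeId, job.exitIp);
    return false;
  }
  return true;
}
```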
Configurable primary, secondary, and tertiary MySQL servers. getPoolWithFallback() tries each in order until one connects.
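A sketch of that ordering logic. The real getPoolWithFallback() presumably wraps a MySQL driver; here the connector is injected so the try-each-in-order behavior stands on its own:

```typescript
interface DbTarget {
  host: string;
  port: number;
}

// Tries each target in order (primary, secondary, tertiary) and
// returns the first connection that succeeds. Throws only after
// every tier has failed, preserving the last error for diagnosis.
async function getPoolWithFallback<T>(
  targets: DbTarget[],
  connect: (t: DbTarget) => Promise<T>,
): Promise<T> {
  let lastError: unknown;
  for (const target of targets) {
    try {
      return await connect(target); // first server that answers wins
    } catch (err) {
      lastError = err; // remember why this tier failed, try the next
    }
  }
  throw new Error(`all MySQL targets failed: ${String(lastError)}`);
}
```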
Node cards, pool utilization, job throughput, autoscaler log, VPN integrity, and runtime config toggles — all in one view.
Four steps, fully automated.
A consumer sends POST /api/jobs/route with a mode (headless or headful) and an optional URL or domain.

The router queries alive nodes, filters by mode and remaining capacity, then picks the least-loaded home node first.
The consumer receives the node's IP and port and launches Puppeteer against that target. A job row is created in the database.
When done, the consumer calls POST /api/jobs/:id/complete. The session slot is freed and duration, exit IP, and any error are logged.
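The four steps above, sketched from the consumer's side. The endpoint paths come from the text; the response shape, the injected http helper, and the function names are assumptions:

```typescript
interface RouteResponse {
  jobId: string;
  ip: string;
  port: number;
}

type Http = (method: string, path: string, body?: unknown) => Promise<unknown>;

async function runRoutedJob(
  http: Http,
  mode: "headless" | "headful",
  work: (target: { ip: string; port: number }) => Promise<void>,
): Promise<void> {
  // Steps 1-2: request a node; the router picks the least-loaded match.
  const route = (await http("POST", "/api/jobs/route", { mode })) as RouteResponse;
  let error: string | undefined;
  try {
    // Step 3: launch Puppeteer (or any client) against the returned node.
    await work({ ip: route.ip, port: route.port });
  } catch (err) {
    error = String(err);
  }
  // Step 4: free the slot; duration, exit IP, and error are logged server-side.
  await http("POST", `/api/jobs/${route.jobId}/complete`, { error });
}
```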
puppetRouter is an internal infrastructure service. There is no public sign-up. API credentials are issued manually to authorized consumers — currently ssoScraper and select internal tooling.
If you're building a new scraper or data pipeline that needs routed Puppeteer sessions, reach out through the internal engineering channel to request access. Include your use case, expected job volume, and preferred mode (headless vs headful).